Kirill Tkalenko created IGNITE-16916:
----------------------------------------

             Summary: Make nodes more resilient in case of a job cancellation
                 Key: IGNITE-16916
                 URL: https://issues.apache.org/jira/browse/IGNITE-16916
             Project: Ignite
          Issue Type: Task
          Components: compute
            Reporter: Kirill Tkalenko
            Assignee: Kirill Tkalenko
             Fix For: 2.14


The way we currently handle the cancellation of a job is questionable.

We currently set the interruption flag before giving the user a chance to 
stop the job gracefully.

Proposal for the implementation:
* Add a distributed property in the metastore that sets the timeout after 
which a *GridJobWorker* is interrupted if it has not completed gracefully 
following a *GridJobWorker#cancel* call;
* On a *GridJobWorker#cancel* call, do not *Thread#interrupt* the thread 
immediately; instead, add a *GridTimeoutObject* that interrupts it once the 
timeout expires.
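
The proposal above could be sketched roughly as follows. This is a minimal, 
standalone illustration and not the Ignite implementation: JobWorkerSketch and 
CancellableJob are hypothetical names, a plain ScheduledExecutorService stands 
in for *GridTimeoutObject*, and the timeout is passed as a constructor 
argument instead of being read from a distributed metastore property.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a job worker; not Ignite's GridJobWorker. */
final class JobWorkerSketch implements Runnable {
    /** Hypothetical job contract: cancel() is only a cooperative request to stop. */
    interface CancellableJob {
        void execute() throws InterruptedException;
        void cancel();
    }

    private final CancellableJob job;
    private final ScheduledExecutorService timer; // assumption: replaces GridTimeoutObject
    private final long interruptTimeoutMs;        // assumption: would come from the metastore

    private Thread runner;                        // guarded by this
    private boolean finished;                     // guarded by this
    private ScheduledFuture<?> pendingInterrupt;  // guarded by this
    private volatile boolean interrupted;

    JobWorkerSketch(CancellableJob job, ScheduledExecutorService timer, long interruptTimeoutMs) {
        this.job = job;
        this.timer = timer;
        this.interruptTimeoutMs = interruptTimeoutMs;
    }

    @Override public void run() {
        synchronized (this) {
            runner = Thread.currentThread();
        }
        try {
            job.execute();
        }
        catch (InterruptedException e) {
            interrupted = true; // the job ignored cancel() and the timeout fired
        }
        finally {
            synchronized (this) {
                runner = null;
                finished = true;
                // The job is done, so the delayed interrupt is no longer needed.
                if (pendingInterrupt != null)
                    pendingInterrupt.cancel(false);
            }
        }
    }

    /** Asks the job to stop, then schedules an interrupt instead of interrupting right away. */
    public void cancel() {
        job.cancel();
        synchronized (this) {
            if (!finished && pendingInterrupt == null) {
                pendingInterrupt = timer.schedule(
                    this::interruptIfRunning, interruptTimeoutMs, TimeUnit.MILLISECONDS);
            }
        }
    }

    /** Runs on the timer thread; interrupts only while the job thread is still registered. */
    private synchronized void interruptIfRunning() {
        if (runner != null)
            runner.interrupt();
    }

    /** True if the job ended with an InterruptedException. */
    public boolean wasInterrupted() {
        return interrupted;
    }
}
```

In this sketch a job that reacts to cancel() within the timeout finishes 
without ever seeing an interrupt, while a job that ignores cancel() is 
interrupted once the timeout expires.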



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
