[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166656#comment-16166656
 ] 

Anand Mazumdar commented on MESOS-7975:
---------------------------------------

Hmm, let's start with why the task was being killed in the first place: the 
scheduler initiated the kill. As part of the kill policy associated with the 
task, the executor signals its intent to kill the task via the {{TASK_KILLING}} 
update. Once the task terminates gracefully, the terminal status update should 
not depend on the exit code the task exited with ({{TASK_FINISHED}} vs 
{{TASK_KILLED}}). It should always be {{TASK_KILLED}}.

The task could also have exited with a non-zero status code while handling the 
{{SIGTERM}} itself. So I don't see how a scheduler could use this extra 
information to be sure that the exit was due to the {{SIGKILL}} signal, as you 
alluded to.

The message we want to convey to the scheduler is that its task died because 
the scheduler initiated the kill operation, and the terminal status update 
should reflect that. What is specifically odd about the current behavior:

- A task exits with a zero status code after the scheduler initiated the kill. 
The scheduler receives a {{TASK_KILLING}} update followed by a 
{{TASK_FINISHED}} update. A {{TASK_FINISHED}} means that the task terminated 
successfully on its own without external interference. However, here the 
executor executed the {{KillPolicy}} associated with the task and explicitly 
killed it.

- A task exits with a non-zero status code. The scheduler receives a 
{{TASK_KILLING}} update followed by a {{TASK_KILLED}} status update. This is 
inconsistent with the case above.

The proposed fix is to consistently send {{TASK_KILLING}} followed by 
{{TASK_KILLED}}, i.e., the intent to kill the task is followed by an explicit 
terminal status update stating that the task has been killed.
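
As a rough illustration of the proposed ordering, here is a minimal sketch that 
checks the kill flag before looking at the exit status. The {{TaskState}} enum, 
the {{succeeded()}} helper and the {{terminalState()}} function are hypothetical 
stand-ins written for this example, not the actual executor code; 
{{succeeded()}} is only assumed to behave like the {{WSUCCEEDED}} check in the 
snippet quoted below.

```cpp
#include <sys/wait.h>

// Hypothetical stand-in for the Mesos TaskState enum; only the three
// terminal states discussed in this thread are modeled.
enum TaskState { TASK_FINISHED, TASK_KILLED, TASK_FAILED };

// Assumed equivalent of the WSUCCEEDED check in the quoted snippet:
// the process exited normally with a zero status code.
inline bool succeeded(int status)
{
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Sketch of the proposed ordering: a kill initiated via kill() or
// shutdown() takes precedence over the exit status, so a killed task
// never reports TASK_FINISHED.
inline TaskState terminalState(int status, bool killed)
{
  if (killed) {
    return TASK_KILLED;
  } else if (succeeded(status)) {
    return TASK_FINISHED;
  } else {
    return TASK_FAILED;
  }
}
```

With this ordering, a task that exits with a zero status code after a 
scheduler-initiated kill maps to {{TASK_KILLED}}, while the same status 
without a kill still maps to {{TASK_FINISHED}}.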

> The command/default executor can incorrectly send a TASK_FINISHED update even 
> when the task is killed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-7975
>                 URL: https://issues.apache.org/jira/browse/MESOS-7975
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Anand Mazumdar
>            Assignee: Qian Zhang
>            Priority: Critical
>              Labels: mesosphere
>
> Currently, when a task is killed, the default and the command executor 
> incorrectly send a {{TASK_FINISHED}} status update instead of 
> {{TASK_KILLED}}. This is due to an unfortunate missed conditional check when 
> the task exits with a zero status code.
> {code}
>       if (WSUCCEEDED(status)) {
>         taskState = TASK_FINISHED;
>       } else if (killed) {
>         // Send TASK_KILLED if the task was killed as a result of
>         // kill() or shutdown().
>         taskState = TASK_KILLED;
>       } else {
>         taskState = TASK_FAILED;
>       }
> {code}
> We should modify the code to correctly send {{TASK_KILLED}} status updates 
> when a task is killed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
