[
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166656#comment-16166656
]
Anand Mazumdar commented on MESOS-7975:
---------------------------------------
Let's start with why the task was being killed in the first place: the
scheduler initiated the kill. As part of the kill policy associated with the
task, the executor signals its intent to kill the task via the
{{TASK_KILLING}} update. Once the task terminates gracefully, the terminal
status update should not depend on the exit code the task exited with
({{TASK_FINISHED}} vs. {{TASK_KILLED}}). It should always be {{TASK_KILLED}}.
The task could also have exited with a non-zero status code while handling the
{{SIGTERM}} itself. So I don't see how a scheduler could use this extra
information to be sure that the exit was due to the {{SIGKILL}} signal, as you
alluded to.
The message we want to convey to the scheduler is that its task died because
the scheduler initiated the kill operation, and the terminal status update
should reflect that. What is specifically odd about the current behavior:
- A task exits with a zero status code after the scheduler initiated the kill.
The scheduler receives a {{TASK_KILLING}} update followed by a
{{TASK_FINISHED}} update. A {{TASK_FINISHED}} means that the task terminated
successfully on its own without external interference. However, here the
executor executed the {{KillPolicy}} associated with the task and explicitly
killed it.
- A task exits with a non-zero status code. The scheduler receives a
{{TASK_KILLING}} update followed by a {{TASK_KILLED}} status update. This is
inconsistent with the above.
The proposed fix is to always send {{TASK_KILLING}} followed by
{{TASK_KILLED}}, i.e., the intent to kill the task is followed by the explicit
terminal status update that the task has been killed.
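To make the proposal concrete, here is a minimal sketch of the reordered check,
not the actual patch. The {{TaskState}} enum and the {{terminalState}} helper
are hypothetical stand-ins for illustration, and the plain POSIX
{{WIFEXITED}}/{{WEXITSTATUS}} macros are used in place of {{WSUCCEEDED}} so the
snippet compiles on its own. The only point is that {{killed}} is tested before
the exit status.

```cpp
#include <sys/wait.h>

// Stand-in for the Mesos task states; only the three values relevant
// here are modeled (assumption, for illustration only).
enum TaskState { TASK_FINISHED, TASK_KILLED, TASK_FAILED };

// Hypothetical helper: picks the terminal state from the raw wait
// status and a flag saying whether kill() or shutdown() was invoked.
TaskState terminalState(int status, bool killed)
{
  // Check the kill flag first, so that a task that was asked to die
  // reports TASK_KILLED regardless of how it exited.
  if (killed) {
    return TASK_KILLED;
  }

  // Not killed: a clean exit with status code zero is a success.
  if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
    return TASK_FINISHED;
  }

  // Not killed and not a clean exit: the task failed on its own.
  return TASK_FAILED;
}
```

With this ordering, both cases listed above (zero and non-zero exit after a
scheduler-initiated kill) end in {{TASK_KILLED}}.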
> The command/default executor can incorrectly send a TASK_FINISHED update even
> when the task is killed
> -----------------------------------------------------------------------------------------------------
>
> Key: MESOS-7975
> URL: https://issues.apache.org/jira/browse/MESOS-7975
> Project: Mesos
> Issue Type: Bug
> Reporter: Anand Mazumdar
> Assignee: Qian Zhang
> Priority: Critical
> Labels: mesosphere
>
> Currently, when a task is killed, the default and the command executor
> incorrectly send a {{TASK_FINISHED}} status update instead of
> {{TASK_KILLED}}. This is due to an unfortunate missed conditional check when
> the task exits with a zero status code.
> {code}
> if (WSUCCEEDED(status)) {
>   taskState = TASK_FINISHED;
> } else if (killed) {
>   // Send TASK_KILLED if the task was killed as a result of
>   // kill() or shutdown().
>   taskState = TASK_KILLED;
> } else {
>   taskState = TASK_FAILED;
> }
> {code}
> We should modify the code to correctly send {{TASK_KILLED}} status updates
> when a task is killed.