[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-25 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179628#comment-16179628
 ] 

Vinod Kone commented on MESOS-7975:
---

cc [~bmahler]

> The command/default executor can incorrectly send a TASK_FINISHED update even 
> when the task is killed
> -
>
> Key: MESOS-7975
> URL: https://issues.apache.org/jira/browse/MESOS-7975
> Project: Mesos
> Issue Type: Bug
> Reporter: Anand Mazumdar
> Assignee: Qian Zhang
> Priority: Critical
> Labels: mesosphere
>
> Currently, when a task is killed, the default and the command executor 
> incorrectly send a {{TASK_FINISHED}} status update instead of 
> {{TASK_KILLED}}. This is due to an unfortunate missed conditional check when 
> the task exits with a zero status code.
> {code}
>   if (WSUCCEEDED(status)) {
>     taskState = TASK_FINISHED;
>   } else if (killed) {
>     // Send TASK_KILLED if the task was killed as a result of
>     // kill() or shutdown().
>     taskState = TASK_KILLED;
>   } else {
>     taskState = TASK_FAILED;
>   }
> {code}
> We should modify the code to correctly send {{TASK_KILLED}} status updates 
> when a task is killed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-21 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175836#comment-16175836
 ] 

Qian Zhang commented on MESOS-7975:
---

[~alexr] I have just sent a mail to the lists; let's wait for feedback from 
the community.



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-21 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175793#comment-16175793
 ] 

Qian Zhang commented on MESOS-7975:
---

[~jpe...@apache.org] When the scheduler sends a kill, does your executor send 
SIGTERM or SIGKILL to the task? If it is SIGTERM, and the task handles it 
gracefully and exits with 0, do you think it is reasonable for the executor to 
send a TASK_FINISHED in this case?



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-21 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175135#comment-16175135
 ] 

James Peach commented on MESOS-7975:


FWIW the rule we have in our executor is that if we terminated a task because 
the scheduler sent a kill, we always send a {{TASK_KILLED}} status. That is the 
only reason we send this status.



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-21 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174611#comment-16174611
 ] 

Alexander Rukletsov commented on MESOS-7975:


[~qianzhang] I think we should send an email to the lists. I understand that 
this might seem like a lot of work for "an easy fix", but it is an important 
change even though it requires only a small code change.



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-17 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169475#comment-16169475
 ] 

Qian Zhang commented on MESOS-7975:
---

[~alexr] Yeah, in mesos.proto, for {{TASK_FINISHED}}, I only see the comment 
{{// TERMINAL: The task finished successfully}}, which is not very clear; 
different people may understand it differently. We may need to send a mail to 
the dev & user lists to let everyone know our proposal, collect feedback, and 
eventually reach a consensus.



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-15 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167567#comment-16167567
 ] 

Alexander Rukletsov commented on MESOS-7975:


[~qianzhang]: exactly. I can't find any written reference stating that 
{{TASK_FINISHED}} means "finished on its own", i.e., not in response to a 
signal. Can we clarify the meaning of {{TASK_FINISHED}} first, maybe even 
document the contract?



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-14 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167234#comment-16167234
 ] 

Qian Zhang commented on MESOS-7975:
---

I think it depends on how we define the meaning of {{TASK_FINISHED}}. If it 
means the task terminated successfully *on its own without external 
interference* (as Anand said), then it does not make sense for the scheduler 
to receive a {{TASK_KILLING}} followed by a {{TASK_FINISHED}}, since there is 
indeed external interference (the kill is initiated by the scheduler). 
However, if {{TASK_FINISHED}} means the task terminated successfully for 
whatever reason, then I think it is OK to receive a {{TASK_KILLING}} followed 
by a {{TASK_FINISHED}}.
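
To make the two readings concrete, here is a minimal C++ sketch of how the terminal state would differ under each definition. All names ({{FinishedMeaning}}, {{stateAfterExit}}, the local {{TaskState}} enum) are hypothetical and only illustrate the argument; they are not taken from mesos.proto or the executor code.

```cpp
// Hypothetical stand-in for the three terminal states discussed here.
enum class TaskState { TASK_FINISHED, TASK_KILLED, TASK_FAILED };

// The two candidate definitions of TASK_FINISHED (assumed names).
enum class FinishedMeaning {
  ON_ITS_OWN,   // succeeded without external interference
  ANY_SUCCESS   // succeeded for whatever reason
};

// Picks the terminal state for a task that has exited.
// `killRequested` is true when the scheduler initiated a kill.
TaskState stateAfterExit(
    bool exitedWithZero,
    bool killRequested,
    FinishedMeaning meaning)
{
  if (killRequested) {
    // Only the "any success" reading lets a clean exit after a kill
    // request count as TASK_FINISHED.
    if (meaning == FinishedMeaning::ANY_SUCCESS && exitedWithZero) {
      return TaskState::TASK_FINISHED;
    }
    return TaskState::TASK_KILLED;
  }

  return exitedWithZero ? TaskState::TASK_FINISHED : TaskState::TASK_FAILED;
}
```

Under {{ON_ITS_OWN}}, a kill request always yields {{TASK_KILLED}}; under {{ANY_SUCCESS}}, the result after a kill request still depends on the exit code.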



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-14 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166656#comment-16166656
 ] 

Anand Mazumdar commented on MESOS-7975:
---

Hmm, let's start with why the task was being killed in the first place. It was 
due to the scheduler initiating the kill. As part of the underlying kill policy 
associated with the task, the executor signals its intent to kill the task via 
the {{TASK_KILLING}} update. Once the task terminates gracefully, the terminal 
status update should not depend on the exit code it exited with 
({{TASK_FINISHED}} vs {{TASK_KILLED}}). It should always be {{TASK_KILLED}}.

The task could have exited with a non-zero status code when handling the 
{{SIGTERM}} itself. So I don't see how a scheduler could use this extra 
information to be sure that the exit was due to the {{SIGKILL}} signal, as you 
alluded to.

The message we want to convey to the scheduler is that their task died due to 
them initiating the kill operation, and the terminal status update should 
reflect that. What is specifically odd about the current behaviour:

- A task exits with a zero status code after the scheduler initiated the kill. 
The scheduler receives a {{TASK_KILLING}} update followed by a 
{{TASK_FINISHED}} update. A {{TASK_FINISHED}} means that the task terminated 
successfully on its own without external interference. However, here the 
executor executed the {{KillPolicy}} associated with the task and explicitly 
killed it.

- A task exits with a non-zero status code. The scheduler receives a 
{{TASK_KILLING}} followed by a {{TASK_KILLED}} status update. This is 
inconsistent with the above.

The proposed fix is to correctly send {{TASK_KILLING}} followed by 
{{TASK_KILLED}} i.e., the intent to kill the task is followed by the explicit 
terminal status update that the task has been killed. 
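
A minimal C++ sketch of the reordered check, assuming the fix amounts to testing the kill flag before the exit status. The {{TaskState}} enum and {{terminalState}} function are hypothetical stand-ins, and the {{WIFEXITED}}/{{WEXITSTATUS}} test only approximates what {{WSUCCEEDED(status)}} is assumed to do; the actual patch may differ.

```cpp
#include <sys/wait.h>

// Hypothetical stand-in for the Mesos TaskState enum; only the three
// terminal states discussed in this issue are modelled.
enum class TaskState { TASK_FINISHED, TASK_KILLED, TASK_FAILED };

// `status` is a raw wait status as returned by waitpid(). `killed` is
// assumed to be set by the executor when it handles kill() or shutdown().
TaskState terminalState(int status, bool killed)
{
  // Check `killed` first so that a zero exit code cannot mask a kill
  // that the scheduler initiated.
  if (killed) {
    return TaskState::TASK_KILLED;
  }

  // Approximation of WSUCCEEDED(status): normal exit with code 0.
  if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
    return TaskState::TASK_FINISHED;
  }

  return TaskState::TASK_FAILED;
}
```

With this ordering, a killed task reports {{TASK_KILLED}} whether it exited with zero or non-zero status, which removes the inconsistency between the two cases listed above.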



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166547#comment-16166547
 ] 

Alexander Rukletsov commented on MESOS-7975:


This is not 100% a bug. It is a philosophical question whether a task that 
terminates cleanly with a zero exit code should be considered killed: we asked 
the task to terminate, but we did not SIGKILL it.

Moreover, we actually changed the behaviour of the docker executor to match the 
behaviour of the command executor. See MESOS-4279 and [this 
review|https://reviews.apache.org/r/48428/], especially, 
[this|https://issues.apache.org/jira/browse/MESOS-4279?focusedCommentId=15249489=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15249489],
 
[this|https://issues.apache.org/jira/browse/MESOS-4279?focusedCommentId=15096389=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15096389],
 and 
[this|https://issues.apache.org/jira/browse/MESOS-4279?focusedCommentId=15243232=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15243232]
 comments.



[jira] [Commented] (MESOS-7975) The command/default executor can incorrectly send a TASK_FINISHED update even when the task is killed

2017-09-14 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166060#comment-16166060
 ] 

Qian Zhang commented on MESOS-7975:
---

RR:
https://reviews.apache.org/r/62326/
https://reviews.apache.org/r/62327/

> The command/default executor can incorrectly send a TASK_FINISHED update even 
> when the task is killed
> -
>
> Key: MESOS-7975
> URL: https://issues.apache.org/jira/browse/MESOS-7975
> Project: Mesos
> Issue Type: Bug
> Reporter: Anand Mazumdar
> Assignee: Anand Mazumdar
> Priority: Critical
> Labels: mesosphere
>
> Currently, when a task is killed, the default and the command executor 
> incorrectly send a {{TASK_FINISHED}} status update instead of 
> {{TASK_KILLED}}. This is due to an unfortunate missed conditional check when 
> the task exits with a zero status code.
> {code}
>   if (WSUCCEEDED(status)) {
>     taskState = TASK_FINISHED;
>   } else if (killed) {
>     // Send TASK_KILLED if the task was killed as a result of
>     // kill() or shutdown().
>     taskState = TASK_KILLED;
>   } else {
>     taskState = TASK_FAILED;
>   }
> {code}
> We should modify the code to correctly send {{TASK_KILLED}} status updates 
> when a task is killed.


