[ 
https://issues.apache.org/jira/browse/MESOS-9962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944641#comment-16944641
 ] 

Benjamin Bannier commented on MESOS-9962:
-----------------------------------------

A related issue is the early exit taken when the framework is not 
connected: 
https://github.com/apache/mesos/blob/f1789b0fe5cad221b79a0bc2adfe2036cce6f33d/src/slave/slave.cpp#L5803-L5810.

> Mesos may report completed task as running in the state.
> --------------------------------------------------------
>
>                 Key: MESOS-9962
>                 URL: https://issues.apache.org/jira/browse/MESOS-9962
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>            Reporter: Meng Zhu
>            Assignee: Benjamin Bannier
>            Priority: Major
>              Labels: foundations
>
> When the following steps occur:
> 1) A graceful shutdown is initiated on the agent (i.e. SIGUSR1 or 
> /master/machine/down).
> 2) The executor is sent a kill, and the agent counts down on 
> executor_shutdown_grace_period.
> 3) The executor exits before all terminal status updates reach the agent. 
> This is more likely if executor_shutdown_grace_period passes.
> This results in a completed executor with non-terminal tasks (according to 
> status updates).
> This would produce a confusing report where completed tasks are still 
> shown as TASK_RUNNING.
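
The sequence in the quoted description can be illustrated with a toy model. This is only a sketch under stated assumptions: the names `Task`, `Executor`, `on_status_update`, `on_executor_exit`, and `stale_tasks` are hypothetical and are not taken from the Mesos agent code, and the set of terminal states is abbreviated. It shows how marking an executor completed without reconciling task states leaves a task reported as TASK_RUNNING.

```python
from dataclasses import dataclass, field

# Abbreviated set of terminal task states (assumption: not exhaustive).
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


@dataclass
class Task:
    task_id: str
    # Latest state the agent learned from status updates.
    state: str = "TASK_RUNNING"


@dataclass
class Executor:
    executor_id: str
    tasks: list = field(default_factory=list)
    completed: bool = False


def on_status_update(executor, task_id, state):
    """Record a status update that actually reached the agent."""
    for task in executor.tasks:
        if task.task_id == task_id:
            task.state = state


def on_executor_exit(executor):
    """Hypothetical handler: marks the executor completed but leaves
    task states untouched, which mirrors the reported symptom."""
    executor.completed = True


def stale_tasks(executor):
    """Return IDs of tasks still non-terminal under a completed executor."""
    if not executor.completed:
        return []
    return [t.task_id for t in executor.tasks if t.state not in TERMINAL_STATES]


# Two tasks are running; only one terminal update arrives before the exit.
executor = Executor("executor-1", [Task("task-a"), Task("task-b")])
on_status_update(executor, "task-a", "TASK_KILLED")
on_executor_exit(executor)
report = stale_tasks(executor)
print(report)
```

In this model, `task-b` never received a terminal update, so it stays in TASK_RUNNING even though its executor is completed.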



--
This message was sent by Atlassian Jira
(v8.3.4#803005)