[ 
https://issues.apache.org/jira/browse/MESOS-10018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-10018:
----------------------------------------

    Shepherd: Benno Evers
      Sprint: Foundations: RI-19 57
    Assignee: Benjamin Bannier

> Duplicate tasks if agent partitioned during maintenance down period
> -------------------------------------------------------------------
>
>                 Key: MESOS-10018
>                 URL: https://issues.apache.org/jira/browse/MESOS-10018
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Major
>
> When the master starts maintenance for a node, it
> (1) sends a {{ShutdownMessage}} to the agent, and
> (2) removes the agent, which transitions all of its tasks to {{TASK_LOST}} and
> moves them to the completed task set.
> If the {{ShutdownMessage}} isn't fully processed on the agent (e.g., the message
> is dropped between (1) and (2), or the agent process is killed before the
> executor has shut down), the agent could come back with the lost task still
> running. It would report the task on registration with the master, which would
> add it to the list of active tasks. As a result, the same task could be both
> completed and active.
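
The sequence described in the report can be illustrated with a toy model. This is only a sketch of the bookkeeping problem, not Mesos code: the class {{ToyMaster}}, its methods, and the task ID "task-1" are hypothetical names chosen for illustration, and the real master tracks considerably more state.

```python
# Toy model only: every name here is hypothetical, none of this is a Mesos API.
class ToyMaster:
    def __init__(self):
        self.active = {}      # task_id -> current state
        self.completed = {}   # task_id -> terminal state

    def launch(self, task_id):
        self.active[task_id] = "TASK_RUNNING"

    def start_maintenance(self, agent_tasks):
        # Step (2) of the report: the agent is removed, its tasks are
        # transitioned to TASK_LOST and moved to the completed set.
        for task_id in agent_tasks:
            self.active.pop(task_id, None)
            self.completed[task_id] = "TASK_LOST"

    def register_agent(self, reported_tasks):
        # The agent never finished processing the shutdown, so it reports
        # its tasks as running. This toy master accepts them without
        # consulting the completed set.
        for task_id in reported_tasks:
            self.active[task_id] = "TASK_RUNNING"

    def duplicates(self):
        # Tasks that are tracked as both active and completed.
        return sorted(set(self.active) & set(self.completed))


master = ToyMaster()
master.launch("task-1")
master.start_maintenance(["task-1"])  # shutdown never takes effect on the agent
master.register_agent(["task-1"])     # agent returns with the task still running
print(master.duplicates())            # ['task-1']
```

In this sketch the inconsistency comes from {{register_agent}} not checking the completed set. Whether the actual fix belongs in registration handling or elsewhere is not stated in the report.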



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
