[ https://issues.apache.org/jira/browse/MESOS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962161#comment-16962161 ]
Benjamin Bannier commented on MESOS-9940:
-----------------------------------------

Like Meng Zhu wrote, the issue is that the master currently transitions all framework tasks and executors to a terminal state immediately after sending out {{ShutdownFrameworkMessage}}s to the agents. The master does not wait for agent responses to confirm that the framework was indeed shut down on all agents.

Possible solutions need to introduce some feedback mechanism so the master can make sure the agents have carried out the framework removal:

1. {{Master::removeFramework}} instructs agents to shut down the framework and transitions all master-owned operations and task launches (these seem to be the ones pending authorization); introduce a new framework state like {{REMOVING}} and transition the framework to it (but keep the framework around). Whenever an agent reregisters with a framework in {{REMOVING}} state, the master would send it a {{ShutdownFrameworkMessage}} as well. If the master knows about unreachable agents with tasks from the framework it could

2. Either

2a. Introduce an ACK message for {{ShutdownFrameworkMessage}} and make the master wait for it before carrying out the final removal and the transitions of tasks, executors, and operations. The agent might send this message after it has successfully terminated tasks and executors.

2b. Have the master wait for terminal executor, task, and operation updates from the agent; if required, the master would acknowledge them. This requires modifications to the agent to make sure these updates are sent even though the framework is in {{TERMINATING}} state on the agent side (e.g., around its {{TaskStatusUpdateManager}}). Ideally this work would remove some knowledge about a framework's subscription status from the agent. This approach seems to introduce additional coupling between agent and master, as they need to share a common idea of what constitutes an active vs. terminated framework.
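
To make option 2a more concrete, here is a minimal, self-contained sketch of the bookkeeping it implies. It is not Mesos code: {{Master}}, {{Framework}}, {{FrameworkState}}, {{shutdownAcknowledged}}, and {{hasTasksOn}} are hypothetical names invented for illustration, message delivery is left out, and the only assumption is that each agent confirms the shutdown exactly once after it has terminated the framework's tasks and executors.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified model. None of these types are Mesos classes.
enum class FrameworkState { ACTIVE, REMOVING, REMOVED };

struct Framework {
  FrameworkState state = FrameworkState::ACTIVE;
  // Agents that still have to confirm the framework shutdown.
  std::set<std::string> pendingAgents;
  // Task ID -> agent ID, for tasks the master does not yet consider terminal.
  std::map<std::string, std::string> tasks;
};

class Master {
public:
  void addTask(const std::string& frameworkId,
               const std::string& taskId,
               const std::string& agentId) {
    Framework& framework = frameworks[frameworkId];
    // No new tasks once removal has started.
    if (framework.state != FrameworkState::ACTIVE) return;
    framework.tasks[taskId] = agentId;
  }

  // Starts removal. A real master would send the shutdown message to every
  // agent here; this sketch only records which agents must confirm it.
  void removeFramework(const std::string& frameworkId) {
    Framework& framework = frameworks.at(frameworkId);
    framework.state = FrameworkState::REMOVING;
    for (const auto& task : framework.tasks) {
      framework.pendingAgents.insert(task.second);
    }
    finishIfConfirmed(framework);
  }

  // Hypothetical handler for the ACK proposed in option 2a. Only now are the
  // tasks on that agent dropped, i.e. treated as terminal by the master.
  void shutdownAcknowledged(const std::string& frameworkId,
                            const std::string& agentId) {
    Framework& framework = frameworks.at(frameworkId);
    if (framework.state != FrameworkState::REMOVING) return;
    if (framework.pendingAgents.erase(agentId) == 0) return;  // Unexpected ACK.
    for (auto it = framework.tasks.begin(); it != framework.tasks.end();) {
      if (it->second == agentId) {
        it = framework.tasks.erase(it);
      } else {
        ++it;
      }
    }
    finishIfConfirmed(framework);
  }

  // True while the master must still account for the framework's resources
  // on the given agent.
  bool hasTasksOn(const std::string& frameworkId,
                  const std::string& agentId) const {
    for (const auto& task : frameworks.at(frameworkId).tasks) {
      if (task.second == agentId) return true;
    }
    return false;
  }

  FrameworkState state(const std::string& frameworkId) const {
    return frameworks.at(frameworkId).state;
  }

private:
  // The framework record is kept (marked REMOVED) rather than erased.
  static void finishIfConfirmed(Framework& framework) {
    if (framework.pendingAgents.empty()) {
      assert(framework.tasks.empty());
      framework.state = FrameworkState::REMOVED;
    }
  }

  std::map<std::string, Framework> frameworks;
};
```

With tasks on two agents, the framework stays in {{REMOVING}} after the first confirmation and only becomes {{REMOVED}} after the second. An agent whose executor shutdown hangs never confirms, so in this model the master keeps treating that agent's tasks as non-terminal instead of assuming the resources are free.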

> Framework removal may lead to inconsistent task states between master and agent.
> --------------------------------------------------------------------------------
>
>                 Key: MESOS-9940
>                 URL: https://issues.apache.org/jira/browse/MESOS-9940
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Meng Zhu
>            Assignee: Benjamin Bannier
>            Priority: Major
>              Labels: foundations
>
> When a framework is removed from the master (say due to disconnection), the master sends a `ShutdownFrameworkMessage` to the agent. At the same time, the master transitions the task status to e.g. KILLED.
> (https://github.com/apache/mesos/blob/master/src/master/master.cpp#L11247-L11291)
> When the agent receives the shutdown message, it tries to shut down all the executors and destroy all the containers. The tasks' status is updated after all of this is done.
> (https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L7914-L7922)
> However, if the executor shutdown gets stuck (e.g. due to a hanging docker daemon), the task status transition will never happen, and master and agent will have diverged views of these tasks.
> One consequence is that the master may try to schedule more workloads onto the problematic agent (because it thinks those task resources are freed up). Since we do not have an overcommit check on the agent, the agent will comply and launch those tasks. This will lead to over-allocation.
> One possible solution is to hold off on the master's status update until the agent is done with the framework shutdown.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)