----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50268/#review143063 -----------------------------------------------------------
Looks good to me. Some high level thoughts: 1) This works under assumption, that an agent shuts down the framework as long as there are no active executors or executors with pending tasks, i.e. there can be no situation when the framework is present on the agent but no executors are running. AFAIK, this is currently the case, but this may change in the future, for example, when we introduce pods. 2) I wonder, if there can be a case, when a framework's executor does exist on the agent, but the master does not know about it yet. When the agent reregisters, it send all its active executors to the master. However, it does not include executors with pending tasks, see MESOS-1800. It looks like this may lead to orphaned frameworks on agents. Could you pelase check these assumptions are correct and capture them in a comment so debugging and updating the code is easier in the future? src/master/master.cpp (line 6387) <https://reviews.apache.org/r/50268/#comment208767> Period at the end : ). src/master/master.cpp (lines 6391 - 6392) <https://reviews.apache.org/r/50268/#comment208768> Blank line src/slave/slave.cpp (line 2289) <https://reviews.apache.org/r/50268/#comment208769> With this patch, this should not happen very often. The only case coming to my mind is when a framework is already removed when the shutdown messages reaches the agent. Hence I would suggest we leave a warning for now. If it stays problematic, we can revisit. - Alexander Rukletsov On July 21, 2016, 8:39 a.m., Jacob Janco wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50268/ > ----------------------------------------------------------- > > (Updated July 21, 2016, 8:39 a.m.) > > > Review request for mesos, Alexander Rukletsov and Jiang Yan Xu. > > > Bugs: MESOS-5874 > https://issues.apache.org/jira/browse/MESOS-5874 > > > Repository: mesos > > > Description > ------- > > Target shutdownFramework to associated agents. > > > Diffs > ----- > > src/master/master.cpp 78a8889313179b4509a6657e6c61d9f6d3bb99c0 > src/slave/slave.cpp da643e6e50b2f313705d2f862c961291aa5d2f22 > > Diff: https://reviews.apache.org/r/50268/diff/ > > > Testing > ------- > > 5 frameworks on 1 agent with 2 agents in cluster > pre: ShutDownFrameworkMessage sent to both agents. > post: ShutdownFrameworkMessage sent to agent with executors associated with > framework. > > > Thanks, > > Jacob Janco > >
