----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61645/#review183678 -----------------------------------------------------------
Fix it, then Ship it! src/slave/slave.hpp Lines 882 (patched) <https://reviews.apache.org/r/61645/#comment259746> s/will be/will also be/ ? src/slave/slave.cpp Lines 1820 (patched) <https://reviews.apache.org/r/61645/#comment259747> Can you add a TODO here to track pending tasks on the executor struct instead of framework? We might even be able to leverage queued tasks. - Vinod Kone On Aug. 23, 2017, 9:29 p.m., Benjamin Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61645/ > ----------------------------------------------------------- > > (Updated Aug. 23, 2017, 9:29 p.m.) > > > Review request for mesos, Anand Mazumdar and Vinod Kone. > > > Bugs: MESOS-7783 and MESOS-7863 > https://issues.apache.org/jira/browse/MESOS-7783 > https://issues.apache.org/jira/browse/MESOS-7863 > > > Repository: mesos > > > Description > ------- > > Per the description of MESOS-7863, there is currently an assumption > that when a pending task is killed, the framework will be stored in > the agent when the launch proceeds for the killed task. When this > assumption does not hold, the TASK_KILLED update will be dropped > due to the frameowrk being unknown when the launch proceeds. This > assumption doesn't hold in two cases: > > (1) Another pending task was killed and we removed the framework > in 'Slave::run' thinking it was idle, because pending tasks > was empty (we remove from pending tasks when processing the > kill). (MESOS-7783 is an example instance of this). > > (2) The last executor terminated without tasks to send terminal > updates for, or the last terminated executor received its > last acknowledgement. At this point, we remove the framework > thinking there were no pending tasks if the task was killed > (removed from pending). > > The fix applied here is to send the status updates from the kill > path rather than the launch path, to be consistent with how we kill > tasks queued within the Executor struct. We ensure that the task > is removed synchronously within the kill path to prevent its launch. > > > Diffs > ----- > > src/slave/slave.hpp 0e07a1af733003bb897cbebb7c1f64437063353d > src/slave/slave.cpp 50d2a10cd68f6611efd4e691e5325e6e0c06f33a > src/tests/slave_tests.cpp a8b46a450c4f3986139390aeed5544e7f9091781 > > > Diff: https://reviews.apache.org/r/61645/diff/3/ > > > Testing > ------- > > Added a test in a subsequent patch. > > > Thanks, > > Benjamin Mahler > >
