-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61645/
-----------------------------------------------------------
Review request for mesos, Anand Mazumdar and Vinod Kone.
Bugs: MESOS-7783 and MESOS-7863
https://issues.apache.org/jira/browse/MESOS-7783
https://issues.apache.org/jira/browse/MESOS-7863
Repository: mesos
Description
-------
Per the description of MESOS-7863, the agent currently assumes that
when a pending task is killed, the framework will still be stored in
the agent when the launch proceeds for the killed task. When this
assumption does not hold, the TASK_KILLED update is dropped
because the framework is unknown when the launch proceeds. The
assumption fails in two cases:
(1) Another pending task was killed and we removed the framework
in 'Slave::run' thinking it was idle, because the pending tasks
were empty (we remove a task from pending tasks when processing
its kill). MESOS-7783 is an example of this.
(2) The last executor terminated without tasks to send terminal
updates for, or the last terminated executor received its
last acknowledgement. At this point, we remove the framework
thinking there are no pending tasks, because the killed task
was already removed from pending.
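To make the failure mode concrete, here is a minimal toy model of the
old flow. It is not the Mesos code: 'OldFlowAgent', its members, and
'removeIfIdle' are hypothetical names chosen for illustration, and the
sketch assumes a single-threaded sequence of calls.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of the pre-fix flow. Hypothetical names, not Mesos code.
struct OldFlowAgent {
  std::set<std::string> frameworks;                       // known frameworks
  std::map<std::string, std::set<std::string>> pending;   // framework -> pending tasks
  std::vector<std::string> updates;                       // updates actually sent
  int dropped = 0;                                        // updates lost

  void enqueue(const std::string& frameworkId, const std::string& taskId) {
    frameworks.insert(frameworkId);
    pending[frameworkId].insert(taskId);
  }

  // Old kill path: only removes the task from pending. The TASK_KILLED
  // update is deferred to the launch path.
  void kill(const std::string& frameworkId, const std::string& taskId) {
    // The kill arrives while the framework is still known.
    assert(frameworks.count(frameworkId) == 1);
    pending[frameworkId].erase(taskId);
  }

  // Stand-in for any path that removes a framework it believes is idle.
  void removeIfIdle(const std::string& frameworkId) {
    if (pending[frameworkId].empty()) {
      frameworks.erase(frameworkId);
      pending.erase(frameworkId);
    }
  }

  // Old launch path: responsible for sending TASK_KILLED for a task that
  // was killed while pending, but it cannot do so for an unknown framework.
  void run(const std::string& frameworkId, const std::string& taskId) {
    if (frameworks.count(frameworkId) == 0) {
      ++dropped;  // the update is silently lost
      return;
    }
    auto it = pending.find(frameworkId);
    if (it == pending.end() || it->second.count(taskId) == 0) {
      updates.push_back(taskId + ": TASK_KILLED");
      return;
    }
    it->second.erase(taskId);
    updates.push_back(taskId + ": TASK_RUNNING");
  }
};
```

Calling enqueue, kill, removeIfIdle and then run for the same task
leaves 'updates' empty and 'dropped' at 1 in this model, which mirrors
the dropped TASK_KILLED described above.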
The fix applied here is to send the status updates from the kill
path rather than the launch path, to be consistent with how we kill
tasks queued within the Executor struct. We ensure that the task
is removed synchronously within the kill path to prevent its launch.
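The shape of the fix can be sketched with a small toy model. This is
not the patch itself: 'ToyAgent', 'Update', and the method names are
hypothetical, and the sketch assumes task IDs are unique within a
framework and that calls are made sequentially.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of the post-fix flow. Hypothetical names, not Mesos code.
struct Update {
  std::string frameworkId;
  std::string taskId;
  std::string state;
};

struct ToyAgent {
  std::map<std::string, std::vector<std::string>> pending;  // framework -> tasks
  std::vector<Update> updates;                              // updates sent

  bool isPending(const std::string& frameworkId,
                 const std::string& taskId) const {
    auto it = pending.find(frameworkId);
    if (it == pending.end()) return false;
    for (const std::string& id : it->second) {
      if (id == taskId) return true;
    }
    return false;
  }

  void enqueue(const std::string& frameworkId, const std::string& taskId) {
    pending[frameworkId].push_back(taskId);
  }

  // Kill path: remove the task from pending synchronously and send
  // TASK_KILLED right here, while the framework is still known.
  void kill(const std::string& frameworkId, const std::string& taskId) {
    auto it = pending.find(frameworkId);
    if (it == pending.end()) return;
    std::vector<std::string>& tasks = it->second;
    for (auto task = tasks.begin(); task != tasks.end(); ++task) {
      if (*task == taskId) {
        tasks.erase(task);
        updates.push_back(Update{frameworkId, taskId, "TASK_KILLED"});
        break;
      }
    }
    if (tasks.empty()) pending.erase(it);  // removal can no longer lose an update
    assert(!isPending(frameworkId, taskId));
  }

  // Launch path: a task that is no longer pending is simply not launched.
  bool run(const std::string& frameworkId, const std::string& taskId) {
    if (!isPending(frameworkId, taskId)) return false;
    std::vector<std::string>& tasks = pending[frameworkId];
    for (auto task = tasks.begin(); task != tasks.end(); ++task) {
      if (*task == taskId) {
        tasks.erase(task);
        break;
      }
    }
    if (tasks.empty()) pending.erase(frameworkId);
    updates.push_back(Update{frameworkId, taskId, "TASK_RUNNING"});
    return true;
  }
};
```

In this model, the update is recorded before the framework entry can
disappear, so the launch path no longer needs the framework to report
the kill; it only has to notice that the task is not pending.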
Diffs
-----
src/slave/slave.hpp 1fe93dab1b2bef24721cc1bcffebe1b259e96d79
src/slave/slave.cpp 7381530515f86faf4c3e8f82bcd9483f6cf0498b
src/tests/slave_tests.cpp 1d9d142ed9e801b79535a2c28f5a94ffbf1bf160
Diff: https://reviews.apache.org/r/61645/diff/1/
Testing
-------
Added a test in a subsequent patch.
Thanks,
Benjamin Mahler