----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23912/#review49552 -----------------------------------------------------------
src/slave/slave.hpp <https://reviews.apache.org/r/23912/#comment86682> Can you add a comment here saying you made these virtual so that you can mock the Slave? src/slave/slave.cpp <https://reviews.apache.org/r/23912/#comment86683> thank you. src/slave/slave.cpp <https://reviews.apache.org/r/23912/#comment86882> I would recommend pulling this logic outside 'if (executor == NULL)' to #1326 to make it easy to reason about. also, task ids are globally unique. there shouldn't be multiple executors with the same task id. you could actually do a CHECK(executorIds.size() == 1) to confirm that. src/slave/slave.cpp <https://reviews.apache.org/r/23912/#comment86884> s/executor is running/it was launched/ src/slave/slave.cpp <https://reviews.apache.org/r/23912/#comment86883> s/Task killed early/Task killed before it was launched/ src/tests/mesos.hpp <https://reviews.apache.org/r/23912/#comment86873> src/tests/mesos.hpp <https://reviews.apache.org/r/23912/#comment86874> src/tests/mesos.hpp <https://reviews.apache.org/r/23912/#comment86875> src/tests/mesos.cpp <https://reviews.apache.org/r/23912/#comment86860> end with a period. src/tests/mesos.cpp <https://reviews.apache.org/r/23912/#comment86861> were you unable to directly call Slave::* methods here instead of having the redirection through unmocked_* methods? src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86863> unused? src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86864> To simplify this boiler plate you could use the LaunchTasks action. See TerminatingSlaveDoesNotReregister test in this file for an example. src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86871> Kill these expectations since an executor won't be launched in this test. src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86867> s/Slave::_runTask/'Slave::_runTask()'/ src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86868> s/Slave::_runTask/'Slave::_runTask()'/ src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86876> s/Slave::runTask/'Slave::runTask()'/ src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86877> s/Slave::killTask/'Slave::killTask()'/ src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86878> s/MockedSlave::unmocked__runTask/'MockedSlave::unmocked__runTask()'/ src/tests/slave_tests.cpp <https://reviews.apache.org/r/23912/#comment86872> kill this. - Vinod Kone On July 28, 2014, 8:24 p.m., Bernd Mathiske wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/23912/ > ----------------------------------------------------------- > > (Updated July 28, 2014, 8:24 p.m.) > > > Review request for mesos. > > > Bugs: MESOS-947 > https://issues.apache.org/jira/browse/MESOS-947 > > > Repository: mesos-git > > > Description > ------- > > Fixes MESOS-947 "Slave should properly handle a killTask() that arrives > between runTask() and _runTask()". > > Slave::killTask() did not check for task in question combination to be > "pending" (i.e. Slave::runTask had happened, but Slave::_runTask had not yet) > and then erroneously assumed that Slave::runTask() had not been executed. The > task was then marked "LOST" instead of "KILLED". But Slave::runTask had > already scheduled Slave::_runTask to follow. Now the entry for being > "pending" is removed, and the task is marked "KILLED", and _runTask gets > informed about this. It checks whether the task in question is currently > "pending" and if it is not, then it infers that the task has been killed and > does not erroneously try to complete launching it. > > > Diffs > ----- > > src/slave/slave.hpp a896bb66db5d8cd27ef02b6498c9db93cb0d525f > src/slave/slave.cpp 1d5691836822c8587e1aa8ed24860a8012c67a6e > src/tests/mesos.hpp 75c66fda2485afa0d4541e710780d90b3411839a > src/tests/mesos.cpp 35c94fa908ad728ea92a7d1bfcbe90d57b1b83d9 > src/tests/slave_tests.cpp e45255a6f699e51bf09397da95a5a11edbabe591 > > Diff: https://reviews.apache.org/r/23912/diff/ > > > Testing > ------- > > Wrote a unit test that reliably created the situation described in the > ticket. Observed that TASK_LOST and the listed log output occurred. This > pointed directly to the lines in killTask() where the problem is rooted. Ran > the test after fixing, it succeeded. Checked the log. It looks like a "clean > kill" now :-) > > > Thanks, > > Bernd Mathiske > >