-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51404/#review146745
-----------------------------------------------------------

Fix it, then Ship it!


src/slave/slave.cpp (line 3468)
<https://reviews.apache.org/r/51404/#comment213402>

    s/updateTaskState/updated/ like you did in recover?


src/slave/slave.cpp (line 6201)
<https://reviews.apache.org/r/51404/#comment213403>

    remove this CHECK because it is not intuitive that it follows from the comment.


- Vinod Kone


On Aug. 25, 2016, 1:29 a.m., Benjamin Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51404/
> -----------------------------------------------------------
>
> (Updated Aug. 25, 2016, 1:29 a.m.)
>
>
> Review request for mesos and Vinod Kone.
>
>
> Bugs: MESOS-6026
>     https://issues.apache.org/jira/browse/MESOS-6026
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Currently the agent will allow terminal status updates when updating
> the state of a task, and only the status update manager is responsible
> for dropping duplicate terminal updates.
>
> As a result of the race in MESOS-6026, the agent may send a duplicate
> terminal update out-of-order. Because this patch rejects duplicate
> terminal updates altogether, this case is no longer possible. With this
> patch, the race will only lead to an error being logged (although we
> should ideally be smart enough to not generate a TASK_FAILED when
> the TASK_FINISHED has already arrived for the executor's task).
>
> In the process of fixing this, I removed the need for the agent to
> call the separate `Executor::terminateTask()` function (which was
> error-prone), and I've handled errors for unexpected task state updates.
>
>
> Diffs
> -----
>
>   src/slave/slave.hpp 7ca9923f97d731715db0267703d32cffc5badf0b
>   src/slave/slave.cpp c686a97149d3a279bea3e532109ba2215947fc4c
>   src/tests/slave_tests.cpp dcf84545354dd2ae0ab5acad3b15eca0467b9982
>   src/tests/status_update_manager_tests.cpp 7b6fe314ac5bef24bd1d2734a7e7c35a54530c88
>
> Diff: https://reviews.apache.org/r/51404/diff/
>
>
> Testing
> -------
>
> make check
>
> I moved the status update manager test to become an agent test,
> since the agent is now the one responsible for dropping duplicate
> terminal updates.
>
> It would be good to also add a test that covers the case where
> the containerizer update is delayed and another terminal update
> arrives.
>
>
> Thanks,
>
> Benjamin Mahler
>
>
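
For readers without the diff at hand, the behaviour the quoted description aims for (the agent itself refusing a second terminal update and logging an error instead) might look roughly like the sketch below. This is not the code from the patch: `ExecutorSketch`, `launch`, `stateOf`, the reduced `TaskState` enum, and the `bool` return value are all hypothetical names chosen for illustration, and the sketch refuses every update after a terminal one, which is a simplifying assumption.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>

// Hypothetical, reduced set of task states, for illustration only.
enum class TaskState { STAGING, RUNNING, FINISHED, FAILED, KILLED };

// A state is terminal once the task can no longer make progress.
inline bool isTerminal(TaskState state) {
  return state == TaskState::FINISHED ||
         state == TaskState::FAILED ||
         state == TaskState::KILLED;
}

// Hypothetical stand-in for the agent's per-executor task bookkeeping.
class ExecutorSketch {
public:
  // Registers a task in its initial (non-terminal) state.
  void launch(const std::string& taskId) {
    tasks_[taskId] = TaskState::STAGING;
  }

  // Applies the update and returns true. If the task is unknown or has
  // already reached a terminal state, logs an error, leaves the stored
  // state untouched, and returns false.
  bool updateTaskState(const std::string& taskId, TaskState state) {
    auto it = tasks_.find(taskId);
    if (it == tasks_.end()) {
      std::cerr << "Ignoring update for unknown task " << taskId << std::endl;
      return false;
    }
    if (isTerminal(it->second)) {
      std::cerr << "Ignoring update for already terminated task "
                << taskId << std::endl;
      return false;
    }
    it->second = state;
    return true;
  }

  // Returns the last accepted state; throws std::out_of_range if unknown.
  TaskState stateOf(const std::string& taskId) const {
    return tasks_.at(taskId);
  }

private:
  std::map<std::string, TaskState> tasks_;
};
```

With this shape, a late TASK_FAILED arriving after TASK_FINISHED is refused at the point where the task state is updated, so nothing downstream has to de-duplicate it.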