-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51404/
-----------------------------------------------------------

Review request for mesos and Vinod Kone.


Bugs: MESOS-6026
    https://issues.apache.org/jira/browse/MESOS-6026


Repository: mesos


Description
-------

Currently the agent accepts duplicate terminal status updates when
updating the state of a task; only the status update manager is
responsible for dropping them.

As a result of the race in MESOS-6026, the agent may send a duplicate
terminal update out-of-order. Because this patch rejects duplicate
terminal updates altogether, this case is no longer possible. With this
patch, the race will only lead to an error being logged (although we
should ideally be smart enough to not generate a TASK_FAILED when
the TASK_FINISHED has already arrived for the executor's task).

In the process of fixing this, I removed the need for the agent to
call the separate `Executor::terminateTask()` function (which was
error-prone), and I've handled errors for unexpected task state updates.
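
To make the intended behavior concrete, here is a minimal, self-contained
sketch of rejecting a duplicate terminal update at the point where the task
state is updated. It is not the code in this patch: the `Executor` class,
`launch()`, `updateTaskState()`, `stateOf()`, `isTerminal()` and the
`TaskState` enum below are simplified, hypothetical stand-ins, and the real
agent types and signatures differ.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>

// Hypothetical, simplified task states (not the Mesos protobuf enum).
enum class TaskState { STAGING, RUNNING, FINISHED, FAILED, KILLED };

// A state is terminal once the task can make no further progress.
bool isTerminal(TaskState state) {
  return state == TaskState::FINISHED ||
         state == TaskState::FAILED ||
         state == TaskState::KILLED;
}

// Hypothetical stand-in for the agent's per-executor bookkeeping.
class Executor {
public:
  void launch(const std::string& taskId) {
    tasks[taskId] = TaskState::STAGING;
  }

  // Applies the update and returns true, or logs an error and returns
  // false if the task is unknown or has already reached a terminal state.
  // The terminal transition is handled here, so callers do not need a
  // separate "terminate" call.
  bool updateTaskState(const std::string& taskId, TaskState state) {
    auto it = tasks.find(taskId);
    if (it == tasks.end()) {
      std::cerr << "Ignoring update for unknown task " << taskId << std::endl;
      return false;
    }
    if (isTerminal(it->second)) {
      std::cerr << "Rejecting update for already terminated task "
                << taskId << std::endl;
      return false;
    }
    it->second = state;
    return true;
  }

  TaskState stateOf(const std::string& taskId) const {
    return tasks.at(taskId);
  }

private:
  std::map<std::string, TaskState> tasks;
};
```

With this shape, a late TASK_FAILED that arrives after TASK_FINISHED is
rejected and logged, and the recorded state stays at the first terminal
update.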


Diffs
-----

  src/slave/slave.hpp 7ca9923f97d731715db0267703d32cffc5badf0b 
  src/slave/slave.cpp c686a97149d3a279bea3e532109ba2215947fc4c 
  src/tests/slave_tests.cpp dcf84545354dd2ae0ab5acad3b15eca0467b9982 
  src/tests/status_update_manager_tests.cpp 7b6fe314ac5bef24bd1d2734a7e7c35a54530c88 

Diff: https://reviews.apache.org/r/51404/diff/


Testing
-------

make check

I moved the status update manager test to become an agent test,
since the agent is now the one responsible for dropping duplicate
terminal updates.

It would be good to also add a test that covers the case where
the containerizer update is delayed and another terminal update
arrives.


Thanks,

Benjamin Mahler
