----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26699/#review56604 -----------------------------------------------------------
Looking good. I'm glad to see some attention paid to the status update mechanism. We've run into a couple of issues in this area lately. src/slave/slave.cpp <https://reviews.apache.org/r/26699/#comment96990> Couldn't the Slave and the SUM get out of sync here? Right now, the SUM will flush its pending status updates as soon as a new master is detected. I'm imagining a scenario where the SUM is flushing status updates and the slave handles a status ACK interleaved with a slave re-registration delivering stale or out-of-sync task states. Wouldn't it just be better if the SUM didn't flush until after the slave has successfully re-registered? src/slave/slave.cpp <https://reviews.apache.org/r/26699/#comment96987> Please update the comment to reflect that a launched or terminated task will also contain the state from its oldest pending/unacknowledged status update. src/tests/slave_tests.cpp <https://reviews.apache.org/r/26699/#comment97039> Times(1) is implicit on a vanilla EXPECT_CALL src/tests/slave_tests.cpp <https://reviews.apache.org/r/26699/#comment97040> Verify that it's actually a TASK_RUNNING? - Adam B On Oct. 14, 2014, 11:03 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/26699/ > ----------------------------------------------------------- > > (Updated Oct. 14, 2014, 11:03 a.m.) > > > Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. > > > Bugs: MESOS-1799 and MESOS-1817 > https://issues.apache.org/jira/browse/MESOS-1799 > https://issues.apache.org/jira/browse/MESOS-1817 > > > Repository: mesos-git > > > Description > ------- > > Slave re-registration now sends both the latest state and unacknowledged > state to the master. > > > Diffs > ----- > > src/slave/slave.hpp 342b09fc084c20d98d096bb129830440179c092c > src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 > src/tests/fault_tolerance_tests.cpp > a75910d4f486230ba3f1d8927e5f1e5fda6e287b > src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f > > Diff: https://reviews.apache.org/r/26699/diff/ > > > Testing > ------- > > make check > > Ran new test 1000 times. > > > Thanks, > > Vinod Kone > >
