----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11229/#review21561 -----------------------------------------------------------
Ship it! IIUC we can simplify Master::reconcileTasks, and I'd love to see that done before this is committed. I think we could also improve the test by pausing the clock sooner, doing everything else the test is doing, and then advancing time and actually seeing the TASK_FINISHED eventually make it's way through. That would be awesome. src/master/master.cpp <https://reviews.apache.org/r/11229/#comment44606> s/Consolidate/Reconcile/ src/master/master.cpp <https://reviews.apache.org/r/11229/#comment44605> I'm not sure I understand fully why we need to filter 'tasks' into 'slaveTasks'. Can't we just do the 'foreachvalue (Task* task, utils::copy(slave->tasks))' below and have a nested for loop which looks in 'tasks' to see if the slave knows about it? src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44607> s/PendingUpdates/IncompleteTasks/? Just going off your naming above ... src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44608> Kill this line. src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44611> Ideally we just drop the one, and the test later shows the TASK_FINISHED makes its way through ... src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44609> s/statusUpdate =/_statusUpdate =/ src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44610> Do you want to move this up before you do driver.launchTasks? I understand that you want to imply that when the slave reregisters a TASK_LOST will not make it's way to the scheduler, but it seems like we also want to capture that driver.launchTasks does not in fact cause statusUpdate to get invoked on the scheduler either. src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/11229/#comment44612> To really be sure about this you'll want to pause before you post the NewMasterDetectedMessage. - Benjamin Hindman On June 6, 2013, 8:40 p.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/11229/ > ----------------------------------------------------------- > > (Updated June 6, 2013, 8:40 p.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > Reopened the discarded review (includes refactor of task reconciliation) to > fix task reconciliation. > > > This addresses bug MESOS-447. > https://issues.apache.org/jira/browse/MESOS-447 > > > Diffs > ----- > > src/master/master.hpp 8e7b74cccf46bea8dd5cb8543bf083aed64d370a > src/master/master.cpp a2e4b905f1ef5c00560917c133b27ac978988807 > src/slave/slave.hpp 26dc96e5f2fdc0711fc49a9ea80b7f037bf93c29 > src/slave/slave.cpp 8ce1646f2f804bc8dc20506d11078191f0274654 > src/tests/fault_tolerance_tests.cpp > ef570b7e4b61df5e452628a5ea7c75888a0797ec > > Diff: https://reviews.apache.org/r/11229/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Vinod Kone > >
