----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10724/#review19592 -----------------------------------------------------------
src/master/master.hpp <https://reviews.apache.org/r/10724/#comment40464> Because TaskIDs are not globally unique, right? src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40470> Is this sufficient? I think you need to key on the FrameworkID since TaskIDs are not globally unique. src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40473> Didn't you originally want a CHECK against this? I do like the warning, but we should definitely add a TODO to export a statistic for this! src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40471> s/task/unknown task/ src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40472> kill src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40475> Ditto (from below) on the phrasing: "Send TASK_LOST updates for tasks present in the master, but missing from the slave". src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40476> Can you add context to this message? "Sending TASK_LOST for ..." src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40477> Hmm.. maybe a little more context here as well: "Task was lost during slave re-registration"? src/master/master.cpp <https://reviews.apache.org/r/10724/#comment40465> Why the change here? This added comment doesn't appear to match the change here..? src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/10724/#comment40466> Strange sentence, how about: "for tasks in the master that are not in the re-registered slave" src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/10724/#comment40468> Any reason you're not using the cluster abstraction, seems that all new tests going forward should. I also introduced some FaultToleranceClusterTest tests in this file already. src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/10724/#comment40467> We have to stop adding these! ;) src/tests/fault_tolerance_tests.cpp <https://reviews.apache.org/r/10724/#comment40469> // We now launch a task and drop the corresponding RunTaskMessage on the slave, to ensure that only the master knows about this task. - Ben Mahler On April 23, 2013, 5:02 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/10724/ > ----------------------------------------------------------- > > (Updated April 23, 2013, 5:02 a.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > See summary. > > > Diffs > ----- > > src/master/master.hpp 9776a7cb8448e41e5d52288e3c637737cee15a08 > src/master/master.cpp c3b26b136a529eee34e9cdf9700176c232f6e436 > src/tests/fault_tolerance_tests.cpp > 0348f20a8f4333f7d2f3786c33e55713cbcbcbe0 > src/tests/slave_recovery_tests.cpp d0c72738ca6fcc0ccf7233efe0ae7ab243fa1f4b > > Diff: https://reviews.apache.org/r/10724/diff/ > > > Testing > ------- > > make check > > sudo GLOG_v=1 ./bin/mesos-tests.sh > --gtest_filter="*ConsolidateTasksOnSlaveReregistration*" --verbose > --gtest_repeat=1000 --gtest_break_on_failure > > > Thanks, > > Vinod Kone > >
