----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16290/#review30463 -----------------------------------------------------------
Partial review so far, mostly looked at the changes to the Master here. src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58285> Why this CHECK is restricted to only the failure case? And is it still applicable? Is there no race here between the detector and contender notifications? src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58281> This seems like an unexpected failure than an expected event so I would use LOG(FATAL) here. Is my understanding correct here? src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58286> Move to the top, as in done in lostCandidacy(). Can you make this more explicit, then there's no need for the << operator: CHECK(_!contended.isDiscarded()); The "MasterContender" part is a bit misleading since we're attributing the discard to the MasterContender without knowing who has actually discarded it. src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58289> Is there a reason we must detect after we've entered the contest? The result here is that there are now two cycles of detection going on at once, the detected() callback can initiate a call to detect() and the contended() callback and initiate a call to detect(). What about making the contend/detect processes completely independent in the Master, this should be the easiest to think about and would ensure we avoid things like redundant detect() calls. src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58287> Ditto on attributing the discard to MasterContender, and just killing the << operator. src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58284> Should this failure also be fatal? src/master/master.cpp <https://reviews.apache.org/r/16290/#comment58290> Looks like this should only CHECK against discarded, and the failure case should be handled as FATAL. src/zookeeper/group.hpp <https://reviews.apache.org/r/16290/#comment58280> Can you add a note about what aborting means here? Effectively the group enters an error state and all subsequent operations will fail, right? src/zookeeper/group.hpp <https://reviews.apache.org/r/16290/#comment58279> Option<Error> src/zookeeper/group.cpp <https://reviews.apache.org/r/16290/#comment58278> "to cleaning up"? - Ben Mahler On Dec. 16, 2013, 10:08 a.m., Jiang Yan Xu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/16290/ > ----------------------------------------------------------- > > (Updated Dec. 16, 2013, 10:08 a.m.) > > > Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. > > > Bugs: MESOS-883 > https://issues.apache.org/jira/browse/MESOS-883 > > > Repository: mesos-git > > > Description > ------- > > See summary. > > > Diffs > ----- > > src/master/master.cpp dd6111946289b82008a66e46df8ef5e538de70ea > src/sched/sched.cpp c46535643ce2ea456bd939571fe78358c1d3871b > src/slave/slave.cpp 6e6107e3551f29566ff233dad47dac8c29b4fab5 > src/tests/master_contender_detector_tests.cpp > 76464eab479461e6e3cb8b5afe85860e60428cf5 > src/tests/zookeeper_tests.cpp a0660cbfb1b7073654b05b82bffd69fbb04d0165 > src/zookeeper/contender.hpp d0b386eeec485f7c8e6b5e60936a84d2f0bba92d > src/zookeeper/contender.cpp bb7e25555b2635398e9d56a808993bcd0b0e47f6 > src/zookeeper/group.hpp 088363a80108472ffeb14121fcb4da02d6e9acf4 > src/zookeeper/group.cpp f3ba86fec3584a83fd55d6388656ebde5249824a > > Diff: https://reviews.apache.org/r/16290/diff/ > > > Testing > ------- > > make check. > > > Thanks, > > Jiang Yan Xu > >
