> On Jan. 24, 2013, 8:25 p.m., Benjamin Hindman wrote: > > src/slave/flags.hpp, line 111 > > <https://reviews.apache.org/r/8736/diff/2/?file=243208#file243208line111> > > > > I'd like to make sure we understand and clearly document the semantics > > of what it means to have the default '--recovery=reconnect' when the slave > > doesn't actually have any old slave state to reconnect to.
updated the comment. lemme if it looks ok. > On Jan. 24, 2013, 8:25 p.m., Benjamin Hindman wrote: > > src/slave/slave.cpp, line 1456 > > <https://reviews.apache.org/r/8736/diff/2/?file=243209#file243209line1456> > > > > I do not like the semantics of an error getting returned and us > > deciding that we've "successfully" recovered! Considering using a Result to > > distinguish _no_ slave state to recover from an actual error. This needs to > > get done everywhere that an error is swallowed (even though it's logged). we always exit when recovery fails now. > On Jan. 24, 2013, 8:25 p.m., Benjamin Hindman wrote: > > src/slave/slave.cpp, lines 1466-1468 > > <https://reviews.apache.org/r/8736/diff/2/?file=243209#file243209line1466> > > > > Invoking recover on the status update manager is asynchronous. In the > > event there was an error recovering, I don't think we should set 'recovered > > = true' so optimistically! Made 'recover' asynchronous. > On Jan. 24, 2013, 8:25 p.m., Benjamin Hindman wrote: > > src/slave/status_update_manager.cpp, line 175 > > <https://reviews.apache.org/r/8736/diff/2/?file=243211#file243211line175> > > > > I don't like that semantics of using the "checkpointing" flag to "hack" > > the replaying of checkpoints. It seems like replying a stream should be a > > fundamental primitive, and shouldn't require a breach of isolation. Let's > > really separate the concerns here such that the StatusUpdateManager doesn't > > need to concern itself with the internals of the StatusUpdateStream's > > implementation. agreed, refactored. > On Jan. 24, 2013, 8:25 p.m., Benjamin Hindman wrote: > > src/slave/status_update_manager.cpp, line 138 > > <https://reviews.apache.org/r/8736/diff/2/?file=243211#file243211line138> > > > > Is there a non-hacky way to do this? If so, can you leave a TODO that > > describes what you're thinking for posterity? “I never put off till tomorrow what I can possibly do - the day after.” -- O.W. fixed. - Vinod ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/8736/#review15590 ----------------------------------------------------------- On Dec. 25, 2012, 7:57 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/8736/ > ----------------------------------------------------------- > > (Updated Dec. 25, 2012, 7:57 a.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > Status update manager now recovers the update stream and re-sends any pending > updates. > > > Diffs > ----- > > src/master/master.cpp fbf7f0775392e3d21a6986203ed8f11729ffda8c > src/slave/flags.hpp 39e57f4104ee7a1538436ebbb9493581e28c99dd > src/slave/slave.cpp 9755b46f97173d6fcc9ab1fd63e0e4814b3bc018 > src/slave/status_update_manager.hpp PRE-CREATION > src/slave/status_update_manager.cpp PRE-CREATION > src/tests/slave_recovery_tests.cpp PRE-CREATION > > Diff: https://reviews.apache.org/r/8736/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Vinod Kone > >
