-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20221/#review40269
-----------------------------------------------------------

src/slave/slave.cpp
<https://reviews.apache.org/r/20221/#comment73234>

    I think what this is saying is: if we have a valid run (determined
    in the code above), then we're sure to have a checkpointed
    ExecutorInfo, because the ExecutorInfo is checkpointed before we
    checkpoint any information about a run. But is it possible that a
    run is valid but, for whatever reason, recovering the ExecutorInfo
    fails? For example, because the file got corrupted or was
    accidentally deleted?


- Benjamin Hindman


On April 10, 2014, 8:26 p.m., Niklas Nielsen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20221/
> -----------------------------------------------------------
>
> (Updated April 10, 2014, 8:26 p.m.)
>
>
> Review request for mesos, Ian Downes and Vinod Kone.
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> This patch lets executor recovery recover runs in the absence of
> executor info. This is needed because the new task-info patch will
> introduce an intermediate state where the executor info hasn't been
> checkpointed. In this interim, the slave may fail over and should be
> in a position to clean up orphan containers (as of now, the
> containerizer API doesn't provide a way to reconcile the executor
> info, so it is not possible to recover the containers in this case).
>
>
> Diffs
> -----
>
>   src/slave/slave.cpp cddb241
>   src/slave/state.cpp 21d1fb7
>
> Diff: https://reviews.apache.org/r/20221/diff/
>
>
> Testing
> -------
>
> make check; also tested with the task-info patch and the new launch test.
>
>
> Thanks,
>
> Niklas Nielsen
>
>
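
To make the question in the review comment concrete, here is a minimal sketch of the recovery decision being discussed. It is not the code from the patch: `RecoveredExecutor`, `RecoveryAction`, and `decide` are hypothetical names invented for illustration, and the sketch assumes that a valid run whose ExecutorInfo could not be recovered (never checkpointed, corrupted, or deleted) is treated as an orphan to clean up, which is one possible answer to the question rather than a confirmed behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for what recovery learned about one executor.
// These are NOT the real Mesos state types.
struct RecoveredExecutor {
  bool hasValidRun;         // the latest run's checkpoints were readable
  bool hasExecutorInfo;     // false if never checkpointed, corrupt, or deleted
  std::string containerId;  // container of the latest run, if any
};

enum class RecoveryAction {
  Skip,          // nothing to recover for this executor
  Reattach,      // normal recovery: run and ExecutorInfo are both present
  DestroyOrphan  // run exists but ExecutorInfo is unavailable
};

// Decide what to do with a recovered executor. Assumption: every cause of
// a missing ExecutorInfo is handled the same way, by cleaning up the
// container, since there is no info to reconcile it against.
RecoveryAction decide(const RecoveredExecutor& executor) {
  if (!executor.hasValidRun) {
    return RecoveryAction::Skip;
  }

  // In this sketch a valid run is expected to carry a container ID.
  assert(!executor.containerId.empty());

  if (!executor.hasExecutorInfo) {
    return RecoveryAction::DestroyOrphan;
  }

  return RecoveryAction::Reattach;
}
```

Under this assumption, a corrupted or deleted ExecutorInfo file ends up on the same path as the "not yet checkpointed" interim state from the description; whether those cases should instead be reported as recovery errors is exactly what the comment asks.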
