----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19795/#review40676 -----------------------------------------------------------
We need some tests! There are quite a few new continuations added but there are quite a few race conditions that are either not properly handled or commented. For places where I asked for comments, can you add tests that will verify the races are handled properly? src/slave/slave.hpp <https://reviews.apache.org/r/19795/#comment73738> s/with launched/when the 'launched'/ src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73740> Should we retry while we are waiting on the executor infos to be ready? How about doing the retry inside _doReliableRegistration() after we send the ReregisterSlaveMessage. if (info.id() == "") { // Send RegisterSlaveMessage. delay(....); } else { ... collect(...) } src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73745> CHECK(info.id() != ""); src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73746> Can you print the executor id to make it easy to debug? CHECK_SOME(executor->info) << "ExecutorInfo is not set for executor " << executor->id << " of framework " << framework->id; src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73748> do the delay() here. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73749> Can you add a comment here saying the framework couldn't have been removed because it has a pending task? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73750> What guarantees that the executor is not removed when we are here? Add a comment. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73751> How can the slave be in RECOVERING state here? Also, do we want to proceed if slave is in TERMINATING state? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73752> Do we want to proceed if framework is TERMINATING? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73753> What guarantees that the executor won't be removed? Add a comment. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73754> Pull this to the top. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73755> Again what's the guarantee? Comment please. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73761> kill new line. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73760> pull this up. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73763> What's the guarantee. Comment. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73767> should this be "!task.has_executor()" from the comment above? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73768> should this be "!task.has_executor()" from the comment above? anyway, why not always check if the id matches? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73765> do this after the log message. Also mention in the log message that you are killing the executor. src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73771> Kill the second sentence? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73772> Can you add a comment on when this is possible? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73774> << "Executor '" << executorId << "' of " << "framework " << frameworkId << " is already running"; src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73775> Remove this? What happens when containerizer returns a future? src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73776> s/checkpoint/commandExecutor/ src/slave/slave.cpp <https://reviews.apache.org/r/19795/#comment73778> why false? - Vinod Kone On April 17, 2014, 3:45 p.m., Niklas Nielsen wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19795/ > ----------------------------------------------------------- > > (Updated April 17, 2014, 3:45 p.m.) > > > Review request for mesos, Ian Downes and Vinod Kone. > > > Bugs: MESOS-922 > https://issues.apache.org/jira/browse/MESOS-922 > > > Repository: mesos-git > > > Description > ------- > > This is the 2nd part of the task-info patch split > (https://reviews.apache.org/r/18403/) and changes Executor::info to an > executor info future. > This is motivated by delegating executor info creation/choice to the > containerizer to address new container/executor scenarios > (https://issues.apache.org/jira/browse/MESOS-922). > > This patch use the new Executor::info and introduces new continuations to > deal with launching containers i.e. executor infos are to be determined. > > > Diffs > ----- > > src/slave/http.cpp 70e409a > src/slave/slave.hpp 1e98795 > src/slave/slave.cpp 19c5f0d > src/tests/containerizer.cpp bfb9341 > > Diff: https://reviews.apache.org/r/19795/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Niklas Nielsen > >
