----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18403/#review37090 -----------------------------------------------------------
Looking good. The issues I'm really concerned about are returning false in Slave::executorStarted and initializing Executor::resources. I leave the rest up to Ian & Co. src/slave/containerizer/mesos_containerizer.cpp <https://reviews.apache.org/r/18403/#comment68461> indentation src/slave/containerizer/mesos_containerizer.cpp <https://reviews.apache.org/r/18403/#comment68458> Why not name this executorInfo? Then you wouldn't have to modify the references below. src/slave/http.cpp <https://reviews.apache.org/r/18403/#comment68388> if (executor.info.isSome()) ? src/slave/slave.hpp <https://reviews.apache.org/r/18403/#comment68455> Why remove the const? It's still set in the initializer list in both constructors. src/slave/slave.hpp <https://reviews.apache.org/r/18403/#comment68387> Please comment on the purpose of this future. Maybe rename it 'launched'? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68401> Will executor->info always exist if !commandExecutor? If not, you should check info.isSome(). Can't this also be changed to a CopyFrom, since executorInfo will be empty from start? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68402> Can we kill this now? We're long past 0.15. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68408> FYI, you won't need a separate frameworkId parameter after MESOS-905 lands, since it will be included in frameworkInfo.id. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68416> Duplicated from _runTask. (Assuming we need to check this again in case the framework transitioned to TERMINATING while waiting on future.) Should we factor this out into a reusable function? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68418> This block is also largely duplicated from _runTask, but with different error strings. Factor it out? Same question for the rest of the duplicated code in your new lambdas. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68466> Put back the double-spacing. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68422> This new comment is more relevant to the (executor->state == LAUNCHING) block below. Move it. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68424> We already check getFramework in registerExecutor, so this failure would mean that the framework was removed/deactivated while waiting on the future? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68427> Here, we're not even trying to register yet, we just waited for the executor to launch, and that's what failed. Will the future.failure() message indicate that? If not, perhaps this error message should be changed to "Could not register executor due to failure while launching: " src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68428> Add LAUNCHING as an 'unexpected state'? Or can we just assume that it will never be in that state because we've successfully waited on executor->future? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68429> Remove extra blank line. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68431> Again, the failure was technically during launch, not (re)registration. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68432> Remove extra blank line. src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68433> return false? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68434> return false? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68439> This happens when Slave still thinks that the executor is launching, but the container/executor reports back its termination? So then we wait for the slave's idea of the executor to settle out of the launching state, then handle its termination? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68443> Why did you remove the CHECK(!executors.contains(executorId))? Is it now okay to launch an executor whose executorId already exists in Framework.executors? src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68456> What about initializing resources? Should this be copied from info during finalize()? The other constructor sets resources(_info.resources()). src/slave/slave.cpp <https://reviews.apache.org/r/18403/#comment68451> I would expect an isFoobar() method to return a bool. If you're returning a Try<Nothing>, perhaps checkValidity()/testValidity() or more specifically checkExecutorIDs(). Where is this even used? I can't find any references. - Adam B On Feb. 27, 2014, 10:07 a.m., Niklas Nielsen wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/18403/ > ----------------------------------------------------------- > > (Updated Feb. 27, 2014, 10:07 a.m.) > > > Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Ian Downes, > Till Toenshoff, and Vinod Kone. > > > Bugs: MESOS-922 > https://issues.apache.org/jira/browse/MESOS-922 > > > Repository: mesos-git > > > Description > ------- > > This patch delegates the choice of executor to the containerizer by removing > executorInfo dependencies up until Containerizer::launch(). > Containerizer::launch() now returns a future to the executor info that is > being run and the slave creates the corresponding executor structure when > launch completes. > This means message handling from the running executor to the slave in the > interim where the executor structure has not created, need to be enqueued > until executor is ready. So far, registerExecutor() and reregisterExecutor() > has been split into two continuations to deal with this issue. > > > Diffs > ----- > > src/slave/containerizer/containerizer.hpp d9ae326 > src/slave/containerizer/mesos_containerizer.hpp ee1fd30 > src/slave/containerizer/mesos_containerizer.cpp 6d990cb > src/slave/http.cpp 594032d > src/slave/slave.hpp 01b80df > src/slave/slave.cpp 4f5349b > src/tests/containerizer.hpp 5686398 > src/tests/containerizer.cpp bfb9341 > > Diff: https://reviews.apache.org/r/18403/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Niklas Nielsen > >
