----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/62212/#review185157 -----------------------------------------------------------
For the default executor, a cleaner approach would be to initialize the `containers` collection earlier, right before (or together with) sending `TASK_STARTING`. If we rely on the fact that `containers` include only launched containers, we can introduce `active` flag into the `Container` struct and set it where now an instance of the struct is created. (Speaking about flags, we should have a proper state machine with clear transitions per container instead of a bunch of flags). Moreover, from what I see in the code, if a launch nested container has failed, no task updates are sent by executor at all (which makes agent to create an implicit update). It seems reasonable to me to send an update from the executor at least for the task that has failed to launch. In the future we will likely allow some tasks in a group to fail without killing the complete group, hence keeping task info as early as possible seems reasonable to me. src/docker/executor.cpp Line 138 (original), 138-139 (patched) <https://reviews.apache.org/r/62212/#comment261444> Please add the todo here: ``` // TODO(alexr): Use `protobuf::createTaskStatus()` // instead of manually setting fields. ``` src/launcher/default_executor.cpp Lines 1256 (patched) <https://reviews.apache.org/r/62212/#comment261446> Instead of short-circuiting here, I suggest to use `protobuf::createTaskStatus` directly, explaining in the comment at the call site why we do so. There is another alternative, more about it in the review header section. src/launcher/default_executor.cpp Lines 1317-1319 (original), 1333-1337 (patched) <https://reviews.apache.org/r/62212/#comment261447> Instead of checking a flag, why not replacing `CHECK` with a condition? - Alexander Rukletsov On Sept. 11, 2017, 9:16 a.m., Benno Evers wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/62212/ > ----------------------------------------------------------- > > (Updated Sept. 11, 2017, 9:16 a.m.) > > > Review request for mesos, Andrei Budnik and Alexander Rukletsov. > > > Bugs: MESOS-7941 > https://issues.apache.org/jira/browse/MESOS-7941 > > > Repository: mesos > > > Description > ------- > > This gives schedulers more information about a tasks status, > in particular it gives a better estimate of a tasks start time > and helps differentiating between tasks stuck in TASK_STAGING > and tasks stuck in TASK_STARTING. > > > Diffs > ----- > > docs/high-availability-framework-guide.md > 73743aba31f9d0ca827d318e2ecb4752a91b1be0 > src/docker/executor.cpp e9949f652cd8527991ebfdfbf14e68b4c958fe79 > src/launcher/default_executor.cpp 106b7f2e0244d211c66b237b5d1c51f43fc6e529 > src/launcher/executor.cpp 951597b576b4912541dd87d52dcb981393e58082 > > > Diff: https://reviews.apache.org/r/62212/diff/1/ > > > Testing > ------- > > > Thanks, > > Benno Evers > >
