-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65550/#review197223
-----------------------------------------------------------

src/launcher/default_executor.cpp
Lines 539-546 (original), 537 (patched)
<https://reviews.apache.org/r/65550/#comment277363>

Good catch, I'll remove the CHECK.


src/launcher/default_executor.cpp
Line 558 (original), 549 (patched)
<https://reviews.apache.org/r/65550/#comment277362>

The checker process treats connection errors as transient failures and reschedules the check:
https://github.com/apache/mesos/blob/a86ff8c36532f97b6eb6b44c6f871de24afbcc4d/src/checks/checker_process.cpp#L531-L538

Transient failures are logged, but are not counted as health check failures:
https://github.com/apache/mesos/blob/a86ff8c36532f97b6eb6b44c6f871de24afbcc4d/src/checks/checker_process.cpp#L353-L356


src/launcher/default_executor.cpp
Lines 626-631 (original), 617-622 (patched)
<https://reviews.apache.org/r/65550/#comment277361>

If the executor isn't subscribed, the status updates are added to the `unacknowledgedUpdates` map and sent by `doReliableRegistration()` in the next `SUBSCRIBE` call:
https://github.com/apache/mesos/blob/a86ff8c36532f97b6eb6b44c6f871de24afbcc4d/src/launcher/default_executor.cpp#L309-L343

The executor doesn't wait for the updates to be ack'd before shutting down (https://github.com/apache/mesos/blob/a86ff8c36532f97b6eb6b44c6f871de24afbcc4d/src/launcher/default_executor.cpp#L1020-L1024), so these updates can be dropped if the executor is not connected to the agent when it shuts down. This is tracked in https://issues.apache.org/jira/browse/MESOS-8537.


- Gaston Kleiman


On Feb. 7, 2018, 11 a.m., Gaston Kleiman wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65550/
> -----------------------------------------------------------
>
> (Updated Feb. 7, 2018, 11 a.m.)
>
>
> Review request for mesos, Anand Mazumdar, Qian Zhang, and Vinod Kone.
>
>
> Bugs: MESOS-8468
>     https://issues.apache.org/jira/browse/MESOS-8468
>
>
> Repository: mesos
>
>
> Description
> -------
>
> The default executor would unnecessarily shut down if, while launching a
> task group, it gets unsubscribed after having successfully launched the
> task group's containers.
>
>
> Diffs
> -----
>
>   src/launcher/default_executor.cpp 4a619859095cc2d30f4806813f64a2e48c83b3ea
>
>
> Diff: https://reviews.apache.org/r/65550/diff/1/
>
>
> Testing
> -------
>
> `make check` on GNU/Linux
>
>
> Thanks,
>
> Gaston Kleiman
>
>
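
For reference, the buffering behaviour described in the comment on lines 626-631 could be sketched roughly as follows. This is a toy model under stated assumptions, not the Mesos implementation: `StatusUpdate`, `ToyExecutor`, and all of their members other than the name `unacknowledgedUpdates` are hypothetical, and the `sent` vector merely stands in for whatever the agent would receive.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a status update; not the Mesos protobuf type.
struct StatusUpdate {
  std::string uuid;
  std::string taskId;
  std::string state;
};

// Toy model of an executor that buffers updates while unsubscribed.
struct ToyExecutor {
  bool subscribed = false;

  // Updates that have not been acknowledged yet, keyed by UUID.
  // Simplification: std::map iterates in key order, not insertion order.
  std::map<std::string, StatusUpdate> unacknowledgedUpdates;

  // Everything the (imaginary) agent has received so far.
  std::vector<StatusUpdate> sent;

  // Always remember the update; only deliver it when subscribed.
  void forward(const StatusUpdate& update) {
    unacknowledgedUpdates[update.uuid] = update;
    if (subscribed) {
      sent.push_back(update);
    }
  }

  // On (re-)subscription, every unacknowledged update is delivered again.
  void subscribe() {
    subscribed = true;
    for (const auto& entry : unacknowledgedUpdates) {
      sent.push_back(entry.second);
    }
  }

  // An acknowledgement removes the update from the buffer.
  void acknowledge(const std::string& uuid) {
    unacknowledgedUpdates.erase(uuid);
  }
};
```

In this model, an update forwarded while `subscribed` is false reaches `sent` only after a later `subscribe()` call. If the toy executor were destroyed before that call, the update would never be delivered, which is the window that MESOS-8537 tracks for the real executor.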
