Repository: mesos
Updated Branches:
  refs/heads/master 58bfa80c6 -> 8b4d83aec

Made default executor handle shutdown events while disconnected.

Previously, the default executor used to crash with a failed
assertion when the executor library injected a shutdown event
when it noticed a disconnection with the agent for non-checkpointed
frameworks and upon recovery timeout for checkpointed frameworks.
This change modifies it to commit suicide minus the failed assertion.

Review: https://reviews.apache.org/r/52755/

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/a1dc1d33
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/a1dc1d33
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/a1dc1d33

Branch: refs/heads/master
Commit: a1dc1d33adce98007a7a88174048f46e4afe1782
Parents: 58bfa80
Author: Anand Mazumdar <an...@apache.org>
Authored: Wed Oct 12 17:34:36 2016 -0700
Committer: Vinod Kone <vinodk...@gmail.com>
Committed: Wed Oct 12 17:37:00 2016 -0700

----------------------------------------------------------------------
 src/launcher/default_executor.cpp | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/a1dc1d33/src/launcher/default_executor.cpp
----------------------------------------------------------------------
diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp
index 2454bd7..af4a97f 100644
--- a/src/launcher/default_executor.cpp
+++ b/src/launcher/default_executor.cpp
@@ -683,8 +683,6 @@ protected:
       return;
     }
 
-    CHECK_EQ(SUBSCRIBED, state);
-
     LOG(INFO) << "Shutting down";
 
     shuttingDown = true;
@@ -694,6 +692,18 @@ protected:
       return;
     }
 
+    // It is possible that the executor library injected the shutdown event
+    // upon a disconnection with the agent for non-checkpointed
+    // frameworks or after recovery timeout for checkpointed frameworks.
+    // This could also happen when the executor is connected but the agent
+    // asked it to shutdown because it didn't subscribe in time.
+    if (state == CONNECTED || state == DISCONNECTED) {
+      __shutdown();
+      return;
+    }
+
+    CHECK_EQ(SUBSCRIBED, state);
+
     process::http::connect(agent)
       .onAny(defer(self(), &Self::_shutdown, lambda::_1));
   }
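
The reordering in the diff above can be illustrated with a small standalone sketch. It is not the Mesos code: `ToyExecutor`, its `State` enum, and the returned labels are hypothetical names made up for illustration, and the real executor performs asynchronous work where this sketch only returns a string. The point is only that the subscribed-state check now runs after the early exit for the `CONNECTED` and `DISCONNECTED` states, so a shutdown event injected while not subscribed no longer trips the assertion.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the executor's connection state.
enum class State { DISCONNECTED, CONNECTED, SUBSCRIBED };

// Hypothetical, heavily simplified executor (not the Mesos class).
class ToyExecutor {
public:
  explicit ToyExecutor(State s) : state(s) {}

  // Returns a label naming the path taken, so the control flow is
  // observable without any real agent connection.
  std::string shutdown() {
    if (shuttingDown) {
      return "already shutting down";
    }

    shuttingDown = true;

    // A shutdown event may arrive while the executor is not subscribed,
    // e.g. when it was injected after a disconnection. Terminate right
    // away instead of asserting on the state.
    if (state == State::CONNECTED || state == State::DISCONNECTED) {
      return "terminate immediately";
    }

    // Only now is it safe to insist on the subscribed state; this plays
    // the role of the relocated CHECK_EQ in the diff.
    if (state != State::SUBSCRIBED) {
      throw std::logic_error("unexpected state");
    }

    return "connect to agent, then shut down";
  }

private:
  State state;
  bool shuttingDown = false;
};
```

Under these assumptions, calling `shutdown()` on a `ToyExecutor` constructed with `State::DISCONNECTED` yields the immediate-termination path, while one constructed with `State::SUBSCRIBED` takes the path that first contacts the agent. A second call on the same object reports that shutdown is already in progress.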