Repository: mesos
Updated Branches:
  refs/heads/1.2.x a7fb1fbce -> fdadfa416
Ignored the tasks already being killed when killing the task group.

When the scheduler tries to kill multiple tasks in the task group
simultaneously, the default executor will kill the tasks one by one.
When the first task is killed, the default executor will kill all
the other tasks in the task group, however, we need to ignore the
tasks which are already being killed, otherwise, the check
`CHECK(!container->killing);` in `DefaultExecutor::kill()` will fail.

Review: https://reviews.apache.org/r/62836

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/d381f730
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/d381f730
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/d381f730

Branch: refs/heads/1.2.x
Commit: d381f730aaed386c3013d234b6591d0ae62dcf10
Parents: a7fb1fb
Author: Qian Zhang <zhq527...@gmail.com>
Authored: Mon Oct 9 09:01:15 2017 +0800
Committer: Qian Zhang <zhq527...@gmail.com>
Committed: Sat Nov 4 10:58:14 2017 +0800

----------------------------------------------------------------------
 src/launcher/default_executor.cpp | 8 ++++++++
 1 file changed, 8 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/d381f730/src/launcher/default_executor.cpp
----------------------------------------------------------------------
diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp
index f30947e..c03df7c 100644
--- a/src/launcher/default_executor.cpp
+++ b/src/launcher/default_executor.cpp
@@ -724,6 +724,14 @@ protected:
           Owned<Container> container_ = containers.at(taskId);
           container_->killingTaskGroup = true;
 
+          // Ignore if the task is already being killed. This can happen
+          // when the scheduler tries to kill multiple tasks in the task
+          // group simultaneously and then one of the tasks is killed
+          // while the other tasks are still being killed, see MESOS-8051.
+          if (container_->killing) {
+            continue;
+          }
+
           kill(container_);
         }
       }
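
For readers without the Mesos tree at hand, the guard added above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyExecutor`, its `killTaskGroup(bool)` method, the `skipAlreadyKilling` flag, and the `killsIssued` counter are hypothetical names invented for illustration, `std::shared_ptr` stands in for `Owned`, and a thrown `std::logic_error` stands in for the failing `CHECK(!container->killing);`. It is not the actual `DefaultExecutor` code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the executor's per-task container state.
struct Container {
  bool killing = false;
  bool killingTaskGroup = false;
};

// Hypothetical toy model; it only mimics the control flow of the patch.
class ToyExecutor {
public:
  explicit ToyExecutor(const std::vector<std::string>& taskIds) {
    for (const std::string& id : taskIds) {
      containers[id] = std::make_shared<Container>();
    }
  }

  // Models the precondition `CHECK(!container->killing);`.
  // Assumption: we throw instead of aborting so the failure is observable.
  void kill(const std::shared_ptr<Container>& container) {
    if (container->killing) {
      throw std::logic_error("container is already being killed");
    }
    container->killing = true;
    ++killsIssued;
  }

  // Models killing every task in the group. With `skipAlreadyKilling`
  // set, containers that already have a kill in flight are ignored,
  // which is the behavior the patch introduces.
  void killTaskGroup(bool skipAlreadyKilling) {
    for (auto& entry : containers) {
      const std::shared_ptr<Container>& container_ = entry.second;
      container_->killingTaskGroup = true;

      if (skipAlreadyKilling && container_->killing) {
        continue;
      }

      kill(container_);
    }
  }

  std::map<std::string, std::shared_ptr<Container>> containers;
  int killsIssued = 0;
};
```

In this model, calling `kill()` on one container and then `killTaskGroup(true)` marks the remaining containers as being killed without touching the first one again, while `killTaskGroup(false)` throws on the container whose kill is already in flight, which corresponds to the `CHECK` failure described in the commit message.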