Repository: mesos
Updated Branches:
  refs/heads/master e5bc824fb -> 06195ea0d
Fixed a potential race condition in the test infrastructure.

A race condition made the
`LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags` test flaky.
This test launches multiple agents in succession while reusing the
same variable. When the variable is reassigned, the previous agent's
destructor is called. If agent recovery has not yet completed, an
orphaned container might appear transiently in the agent's destructor,
as described in the code comment.

Review: https://reviews.apache.org/r/64770/

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/06195ea0
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/06195ea0
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/06195ea0

Branch: refs/heads/master
Commit: 06195ea0d5a24b493b9fb2c76a4986478ea479c5
Parents: e5bc824
Author: Andrei Budnik <abud...@mesosphere.com>
Authored: Thu Jan 11 22:21:41 2018 +0100
Committer: Alexander Rukletsov <al...@apache.org>
Committed: Thu Jan 11 22:21:41 2018 +0100

----------------------------------------------------------------------
 src/tests/cluster.cpp | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/06195ea0/src/tests/cluster.cpp
----------------------------------------------------------------------
diff --git a/src/tests/cluster.cpp b/src/tests/cluster.cpp
index 05bde60..066dd31 100644
--- a/src/tests/cluster.cpp
+++ b/src/tests/cluster.cpp
@@ -613,6 +613,20 @@ Slave::~Slave()
     return;
   }
 
+  // We should wait until agent recovery completes to prevent a potential race
+  // between containerizer recovery process and the following code that invokes
+  // methods of the containerizer, e.g. a test can start an agent that in turn
+  // triggers containerizer recovery of orphaned containers, then immediately
+  // destroys the agent. Thus, the containerizer might return a different set of
+  // containers, depending on whether containerizer recovery has been finished.
+  //
+  // NOTE: This wait is omitted if a pointer to a containerizer object was
+  // passed to the slave's constructor, as it might be a mock containerizer,
+  // thereby agent recovery will never be finished.
+  if (ownedContainerizer.get() != nullptr) {
+    slave->recoveryInfo.recovered.future().await();
+  }
+
   terminate();
 
   // This extra closure is necessary in order to use `AWAIT` and `ASSERT_*`,
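
The race described in the commit message can be illustrated outside of Mesos with a minimal sketch. It is not the Mesos implementation: `FakeContainerizer` and `containersAtTeardown` are hypothetical names, and `std::promise` / `std::shared_future` stand in for the libprocess future that the patch awaits. The sketch assumes that recovery briefly exposes an orphaned container before destroying it.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <set>
#include <string>
#include <thread>

// Hypothetical stand-in for a containerizer whose recovery runs
// asynchronously. During recovery an orphaned container is visible
// for a short time and is then destroyed.
class FakeContainerizer {
public:
  FakeContainerizer() : recovered_(done_.get_future().share()) {
    worker_ = std::thread([this] {
      {
        std::lock_guard<std::mutex> lock(mutex_);
        containers_.insert("orphan");  // Orphan discovered by recovery.
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
      {
        std::lock_guard<std::mutex> lock(mutex_);
        containers_.erase("orphan");   // Orphan destroyed by recovery.
      }
      done_.set_value();               // Recovery has finished.
    });
  }

  ~FakeContainerizer() { worker_.join(); }

  // Becomes ready once recovery has finished.
  std::shared_future<void> recovered() const { return recovered_; }

  // Snapshot of the containers that are currently known.
  std::set<std::string> containers() {
    std::lock_guard<std::mutex> lock(mutex_);
    return containers_;
  }

private:
  std::mutex mutex_;
  std::set<std::string> containers_;
  std::promise<void> done_;
  std::shared_future<void> recovered_;
  std::thread worker_;
};

// Models the teardown path: optionally wait for recovery before asking
// the containerizer which containers exist.
std::set<std::string> containersAtTeardown(
    FakeContainerizer& containerizer, bool waitForRecovery) {
  if (waitForRecovery) {
    containerizer.recovered().wait();
  }
  return containerizer.containers();
}
```

Calling `containersAtTeardown(c, true)` on a freshly constructed `FakeContainerizer c` always yields an empty set, because the query runs only after recovery has finished. With `false`, the result depends on thread timing, which is the kind of flakiness the patch is meant to remove.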