Repository: mesos
Updated Branches:
  refs/heads/master e5bc824fb -> 06195ea0d


Fixed a potential race condition in the test infrastructure.

A race condition made the
`LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags` test flaky.
The test launches multiple agents in succession, reusing the same
variable. Reassigning the variable invokes the previous agent's
destructor. If agent recovery has not yet completed at that point, an
orphaned container might transiently show up in the agent's
destructor, as described in the code comment.
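
The race can be illustrated outside of Mesos with a minimal, self-contained
sketch. Everything in it is hypothetical: `FakeContainerizer`, `destroyAgent`,
the container name and the 50 ms delay are stand-ins invented for illustration
and are not part of the Mesos code base. The sketch only shows why waiting on
a "recovered" future before querying the containerizer makes the result
deterministic.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <set>
#include <string>
#include <thread>

// Hypothetical stand-in for a containerizer whose recovery runs
// asynchronously and destroys orphaned containers when it finishes.
class FakeContainerizer {
public:
  ~FakeContainerizer() {
    if (worker.joinable()) {
      worker.join();
    }
  }

  // Starts the simulated recovery in a background thread.
  void recover() {
    assert(!worker.joinable());  // This sketch supports a single recovery.
    worker = std::thread([this]() {
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
      {
        std::lock_guard<std::mutex> lock(mutex);
        containers.clear();  // Orphaned containers are cleaned up.
      }
      recovered.set_value();  // Signal that recovery has completed.
    });
  }

  // Returns a snapshot of the currently known containers.
  std::set<std::string> list() {
    std::lock_guard<std::mutex> lock(mutex);
    return containers;
  }

  std::shared_future<void> recoveredFuture() const { return future; }

private:
  std::mutex mutex;
  std::set<std::string> containers{"orphan-1"};
  std::promise<void> recovered;
  std::shared_future<void> future{recovered.get_future().share()};
  std::thread worker;
};

// Hypothetical analogue of the agent's destructor logic: without the wait,
// the returned set depends on how far recovery has progressed.
std::set<std::string> destroyAgent(
    FakeContainerizer& containerizer, bool waitForRecovery) {
  if (waitForRecovery) {
    containerizer.recoveredFuture().wait();
  }
  return containerizer.list();
}
```

Calling `destroyAgent(containerizer, true)` right after
`containerizer.recover()` always yields an empty set in this sketch, whereas
passing `false` may or may not still report the orphaned container depending
on timing.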

Review: https://reviews.apache.org/r/64770/


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/06195ea0
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/06195ea0
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/06195ea0

Branch: refs/heads/master
Commit: 06195ea0d5a24b493b9fb2c76a4986478ea479c5
Parents: e5bc824
Author: Andrei Budnik <abud...@mesosphere.com>
Authored: Thu Jan 11 22:21:41 2018 +0100
Committer: Alexander Rukletsov <al...@apache.org>
Committed: Thu Jan 11 22:21:41 2018 +0100

----------------------------------------------------------------------
 src/tests/cluster.cpp | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/06195ea0/src/tests/cluster.cpp
----------------------------------------------------------------------
diff --git a/src/tests/cluster.cpp b/src/tests/cluster.cpp
index 05bde60..066dd31 100644
--- a/src/tests/cluster.cpp
+++ b/src/tests/cluster.cpp
@@ -613,6 +613,20 @@ Slave::~Slave()
     return;
   }
 
+  // We should wait until agent recovery completes to prevent a potential race
+  // between containerizer recovery process and the following code that invokes
+  // methods of the containerizer, e.g. a test can start an agent that in turn
+  // triggers containerizer recovery of orphaned containers, then immediately
+  // destroys the agent. Thus, the containerizer might return a different set of
+  // containers, depending on whether containerizer recovery has been finished.
+  //
+  // NOTE: This wait is omitted if a pointer to a containerizer object was
+  // passed to the slave's constructor, as it might be a mock containerizer,
+  // thereby agent recovery will never be finished.
+  if (ownedContainerizer.get() != nullptr) {
+    slave->recoveryInfo.recovered.future().await();
+  }
+
   terminate();
 
   // This extra closure is necessary in order to use `AWAIT` and `ASSERT_*`,
