Neil Conway created MESOS-6231:

             Summary: Scheduler driver metrics can hang Metrics() in tests
                 Key: MESOS-6231
             Project: Mesos
          Issue Type: Bug
          Components: tests
            Reporter: Neil Conway

* {{SchedulerProcess}} has a field, {{metrics}}, whose constructor registers 
two metrics, {{event_queue_messages}} and {{event_queue_dispatches}}.
* These metrics are implemented by {{defer}}'ing a message to 
* If {{MesosSchedulerDriver}} is started and then stopped (but not destructed), 
{{SchedulerProcess}} is terminated but not destroyed.

Hence, if a scheduler driver is started and then stopped, fetching these 
metrics will hang, because a terminated process never handles the deferred 
message. This means a test case that calls {{Metrics()}} after stopping a 
scheduler driver will hang.
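
The failure mode can be illustrated with a toy model. This is only a sketch and not libprocess code: {{ToyProcess}}, {{Pending}}, and {{fetchCompletes}} are hypothetical names, and the model assumes that a message queued on a terminated (but still allocated) process is never run, so its result never becomes ready.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

// Toy stand-in for a pending result; NOT libprocess's Future.
struct Pending {
  bool ready = false;
  double value = 0.0;
};

// Toy stand-in for an actor with a message queue; NOT libprocess's Process.
class ToyProcess {
public:
  // Queue work for this process and return a handle to its eventual result.
  std::shared_ptr<Pending> dispatch(std::function<double()> f) {
    auto result = std::make_shared<Pending>();
    queue.push_back([f, result] {
      result->value = f();
      result->ready = true;
    });
    return result;
  }

  // Drain the queue, as a live process's event loop would.
  // A terminated process never runs its queued messages.
  void pump() {
    if (terminated) return;
    while (!queue.empty()) {
      std::function<void()> message = queue.front();
      queue.pop_front();
      message();
    }
  }

  // Stop processing messages; the object itself stays alive.
  void terminate() { terminated = true; }

private:
  bool terminated = false;
  std::deque<std::function<void()>> queue;
};

// Returns true if a metric value deferred to the process becomes ready.
bool fetchCompletes(bool stopFirst) {
  ToyProcess scheduler;
  if (stopFirst) scheduler.terminate();
  std::shared_ptr<Pending> metric = scheduler.dispatch([] { return 0.0; });
  scheduler.pump();
  return metric->ready;
}

// Usage: the live case completes, the stopped case stays pending.
void checkToyModel() {
  assert(fetchCompletes(false));
  assert(!fetchCompletes(true));
}
```

In this model a caller that blocks until {{metric->ready}} is true would wait forever in the stopped case, which mirrors the hang described above.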

For example, the following patch causes {{SlaveTest.MetricsSlaveLaunchErrors}} to hang:

diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index 3471314..f323bb9 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -1408,12 +1408,12 @@ TEST_F(SlaveTest, MetricsSlaveLaunchErrors)
   ASSERT_EQ(TASK_FAILED, failureUpdate.get().state());

+  driver.stop();
+  driver.join();
   // After failure injection, metrics should report a single failure.
   snapshot = Metrics();
   EXPECT_EQ(1, snapshot.values["slave/container_launch_errors"]);
-  driver.stop();
-  driver.join();

