-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69267/#review210363
-----------------------------------------------------------

PASS: Mesos patch 69267 was successfully built and tested.

Reviews applied: `['69267']`

All the build artifacts are available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2574/mesos-review-69267

- Mesos Reviewbot Windows


On Nov. 7, 2018, 1:26 a.m., Joseph Wu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69267/
> -----------------------------------------------------------
>
> (Updated Nov. 7, 2018, 1:26 a.m.)
>
>
> Review request for mesos, Alexander Rukletsov and Greg Mann.
>
>
> Bugs: MESOS-6949
>     https://issues.apache.org/jira/browse/MESOS-6949
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This test was flaky because there is a double-master-detection race
> after the master fails over. This test uses the Standalone master
> detector, which keeps a single Master PID in memory and always returns
> that one PID as the leader. This means there is almost no delay
> between failing over the master and detecting a new leader.
>
> The scheduler in this test tries to send a SUBSCRIBE call to the master
> as soon as the master is detected. Normally, there will only be two
> total SUBSCRIBE calls during the test, before and after the master
> failover. However, the test also manually appoints the leader after
> failing over the master. This step races against the scheduler's own
> retry logic, and can potentially cause a third SUBSCRIBE if the second
> SUBSCRIBE has already started.
>
> Because the scheduler in this test does not enable checkpointing, the
> third SUBSCRIBE will actively disconnect the framework, causing the
> master to remove the framework. This removal also prevents the
> framework from ever registering again, and thereby times out the test.
>
> This fixes the test to prevent excess master detection events.
>
> We could also change the HTTP scheduler driver to ignore these extra
> master detection events when the master in question has not changed.
>
>
> Diffs
> -----
>
>   src/tests/scheduler_tests.cpp 0ee5b77e5a667e37ac13553e15f634b2cb19ea65
>
>
> Diff: https://reviews.apache.org/r/69267/diff/1/
>
>
> Testing
> -------
>
> make check
>
> GLOG_v=1 src/mesos-tests --gtest_filter="*SchedulerTest.MasterFailover*"
> --gtest_repeat=-1 --gtest_break_on_failure --verbose
>
>
> Thanks,
>
> Joseph Wu
>
>
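
For illustration only, the alternative mentioned at the end of the description (ignoring detection events when the master has not changed) could look roughly like the sketch below. This is not the Mesos HTTP scheduler driver: `SketchScheduler`, `MasterPid`, `countSubscribes`, `checkSketch`, and the PID strings are all hypothetical, an empty string stands in for "no leader detected", and the event sequence is an assumed simplification of the race described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a master PID; an empty string means "no leader".
using MasterPid = std::string;

// Hypothetical, heavily simplified scheduler that only counts the SUBSCRIBE
// calls it would send in response to master detection events.
class SketchScheduler {
public:
  explicit SketchScheduler(bool ignoreUnchanged)
    : ignoreUnchanged_(ignoreUnchanged), subscribes_(0) {}

  // Invoked once per master detection event.
  void detected(const MasterPid& leader) {
    if (ignoreUnchanged_ && leader == current_) {
      return;  // Same master as before: treat the event as a no-op.
    }
    current_ = leader;
    if (!current_.empty()) {
      ++subscribes_;  // A real driver would reconnect and SUBSCRIBE here.
    }
  }

  int subscribes() const { return subscribes_; }

private:
  bool ignoreUnchanged_;
  MasterPid current_;
  int subscribes_;
};

// Replays a sequence of detection events and returns the SUBSCRIBE count.
int countSubscribes(bool ignoreUnchanged,
                    const std::vector<MasterPid>& events) {
  SketchScheduler scheduler(ignoreUnchanged);
  for (const MasterPid& event : events) {
    scheduler.detected(event);
  }
  return scheduler.subscribes();
}

// Assumed event order: initial detection, leader lost on failover,
// re-detection, then the extra manual appointment of the same master.
void checkSketch() {
  const std::vector<MasterPid> events = {
    "master@10.0.0.1:5050", "", "master@10.0.0.1:5050", "master@10.0.0.1:5050"};
  assert(countSubscribes(false, events) == 3);  // Extra SUBSCRIBE is sent.
  assert(countSubscribes(true, events) == 2);   // Duplicate event is ignored.
}
```

In this model the duplicate appointment produces a third SUBSCRIBE only when unchanged-master events are not filtered. Whether such filtering is safe in the real driver (for example, when a master restarts with the same PID) is not something this sketch addresses.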
