-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
-----------------------------------------------------------

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6803
    https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
-------

Currently, when a new master is detected and no credential is provided,
the agent attempts to (re-)register after a random initial `delay`, to
avoid thundering herds. Spurious double-registrations are therefore
possible: a new master can be detected after we schedule the `delay`'d
registration but before it executes.

To resolve this problem, we add a member, `agentRegistrationTimer`, to
the agent and call `Clock::cancel` on it when we successfully register
with the master.
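
The race and the fix can be illustrated with a small standalone sketch.
This is not the patch itself and does not use the libprocess API:
`ToyClock`, `simulate`, `detected`, and the integer time steps are all
hypothetical stand-ins, assumed here only to show why cancelling the
stored timer on a successful registration suppresses the second attempt.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Toy stand-in for a timer queue. Hypothetical; NOT the libprocess Clock.
class ToyClock {
public:
  using TimerId = int;

  // Schedule `callback` to run `delaySteps` from now; returns a handle.
  TimerId delay(int delaySteps, std::function<void()> callback) {
    assert(delaySteps >= 0);
    TimerId id = nextId_++;
    timers_.emplace(id, Timer{now_ + delaySteps, std::move(callback)});
    return id;
  }

  // Cancel a pending timer. Returns true if it was still pending.
  bool cancel(TimerId id) { return timers_.erase(id) > 0; }

  // Advance time, then run every timer that has become due.
  void advance(int steps) {
    now_ += steps;
    std::vector<std::function<void()>> due;
    for (auto it = timers_.begin(); it != timers_.end();) {
      if (it->second.due <= now_) {
        due.push_back(std::move(it->second.callback));
        it = timers_.erase(it);
      } else {
        ++it;
      }
    }
    for (auto& callback : due) {
      callback();
    }
  }

private:
  struct Timer {
    int due;
    std::function<void()> callback;
  };

  int now_ = 0;
  TimerId nextId_ = 0;
  std::map<TimerId, Timer> timers_;
};

// Returns how many registrations happen when a second master is detected
// while the first delayed registration is still pending.
int simulate(bool cancelOnRegister) {
  ToyClock clock;
  int registrations = 0;
  ToyClock::TimerId registrationTimer = -1;  // Models `agentRegistrationTimer`.

  // Assumed to always succeed, for simplicity.
  auto doRegister = [&]() {
    ++registrations;
    if (cancelOnRegister) {
      clock.cancel(registrationTimer);  // Drop any still-pending attempt.
    }
  };

  // Each detection schedules a delayed registration and remembers its timer.
  auto detected = [&](int delaySteps) {
    registrationTimer = clock.delay(delaySteps, doRegister);
  };

  detected(3);       // First master detected at t=0; registration due at t=3.
  clock.advance(1);
  detected(5);       // New master detected at t=1; registration due at t=6.
  clock.advance(2);  // t=3: the first registration runs.
  clock.advance(3);  // t=6: the second runs unless it was cancelled.
  return registrations;
}
```

In this sketch `simulate(false)` counts two registrations and
`simulate(true)` counts one, since the second timer is cancelled as soon
as the first registration succeeds.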


Diffs
-----

  src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79 
  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54909/diff/


Testing
-------

Ran `make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure`
to catch intermittent failures, which is how we caught the failing test in
`reservation_tests.cpp`. Note that this bug was discovered when we added a
`delay` to the call to `authenticate` in `slave::detected` (in order to get it
to match the behavior of the non-authenticated call to `doReliableRegistration`).


Thanks,

Alex Clemmer
