-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
-----------------------------------------------------------
Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich,
and Joseph Wu.
Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803
Repository: mesos
Description
-------
Currently, when a new master is detected and no credential is provided,
the agent attempts to (re-)register after a random initial `delay`, to
avoid thundering herds. Spurious double-registrations are therefore
possible, since a new master could be detected after we schedule the
`delay`'d registration but before we execute it.
To resolve this problem, we add a member, `agentRegistrationTimer`, to
the agent, and call `Clock::cancel` on it when we successfully register
with the master.
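As a rough illustration of the race and the fix, here is a minimal,
standalone sketch. It does not use libprocess: `FakeClock`, `FakeAgent`,
and `simulate` are hypothetical names invented for this example, and the
simulated master acknowledges a registration immediately, which is a
simplification. The only idea taken from the patch is keeping the pending
timer in a member and cancelling it once registration succeeds.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical simulated timer queue. This is NOT the libprocess Clock API.
class FakeClock {
public:
  typedef std::uint64_t TimerId;

  // Schedule `fn` to run once simulated time reaches now + delayMs.
  TimerId schedule(std::uint64_t delayMs, std::function<void()> fn) {
    TimerId id = nextId_++;
    pending_.emplace(id, Entry{now_ + delayMs, std::move(fn)});
    return id;
  }

  // Cancel a timer; returns true only if it was still pending.
  bool cancel(TimerId id) { return pending_.erase(id) > 0; }

  // Advance simulated time and fire every timer that is now due.
  // The scan restarts after each callback, because a callback may
  // cancel or schedule other timers.
  void advance(std::uint64_t ms) {
    now_ += ms;
    for (;;) {
      auto it = pending_.begin();
      while (it != pending_.end() && it->second.due > now_) ++it;
      if (it == pending_.end()) return;
      std::function<void()> fn = std::move(it->second.fn);
      pending_.erase(it);
      fn();
    }
  }

private:
  struct Entry { std::uint64_t due; std::function<void()> fn; };
  std::map<TimerId, Entry> pending_;
  std::uint64_t now_ = 0;
  TimerId nextId_ = 0;
};

// Hypothetical agent. `cancelOnRegistered` toggles the fix so that both
// behaviors can be compared.
class FakeAgent {
public:
  FakeAgent(FakeClock& clock, bool cancelOnRegistered)
    : clock_(clock), cancelOnRegistered_(cancelOnRegistered) {}

  // Called whenever a (new) master is detected: schedule a delayed
  // registration and remember its timer in a member.
  void detected(std::uint64_t delayMs) {
    registrationTimer_ =
      clock_.schedule(delayMs, [this]() { doRegistration(); });
    hasTimer_ = true;
  }

  int attempts() const { return attempts_; }

private:
  void doRegistration() {
    ++attempts_;
    registered();  // Simplification: the master acks immediately.
  }

  // On successful registration, cancel the remembered timer (if enabled).
  void registered() {
    if (cancelOnRegistered_ && hasTimer_) {
      clock_.cancel(registrationTimer_);
      hasTimer_ = false;
    }
  }

  FakeClock& clock_;
  bool cancelOnRegistered_;
  bool hasTimer_ = false;
  FakeClock::TimerId registrationTimer_ = 0;
  int attempts_ = 0;
};

// A second master is detected at t=50, before the first delayed
// registration (due at t=100) has run. Returns the number of attempts.
int simulate(bool cancelOnRegistered) {
  FakeClock clock;
  FakeAgent agent(clock, cancelOnRegistered);
  agent.detected(100);  // t=0: timer A, due at t=100.
  clock.advance(50);
  agent.detected(100);  // t=50: timer B, due at t=150.
  clock.advance(50);    // t=100: A fires and the agent registers.
  clock.advance(50);    // t=150: B fires unless it was cancelled.
  return agent.attempts();
}
```

In this sketch, `simulate(false)` ends with two registration attempts (the
spurious double-registration), while `simulate(true)` ends with one,
because the second timer is cancelled as soon as the first registration
succeeds.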
Diffs
-----
src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79
src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18
src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71
Diff: https://reviews.apache.org/r/54909/diff/
Testing
-------
Ran `make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure`
to catch intermittent failures, which is how we caught the failing test in
`reservation_tests.cpp`. Note that this bug was discovered when we added a
`delay` to the call to `authenticate` in `slave::detected` (in order to get it
to match the behavior of the non-authenticated call to `doReliableRegistration`).
Thanks,
Alex Clemmer