-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53097/
-----------------------------------------------------------
Review request for mesos and Vinod Kone.
Bugs: MESOS-6445
https://issues.apache.org/jira/browse/MESOS-6445
Repository: mesos
Description
-------
If the master fails over and an agent does not re-register within the
`agent_reregister_timeout`, the master marks the agent as unreachable in
the registry and sends `slaveLost` for it. However, we neglected to
update the master's in-memory state for the newly unreachable agent,
so task reconciliation would return incorrect results until the next
master failover.
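
To illustrate the mismatch, here is a minimal, self-contained sketch. It is
a toy model and not the actual Mesos code: `ToyMaster`, `TaskStatus`,
`onReregisterTimeout`, `reconcile`, and the `updateMemory` flag are all
hypothetical names invented for this example, and the statuses Mesos really
sends during reconciliation are not modeled. It only shows how an answer
derived from in-memory state goes wrong when the registry alone is updated.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified model. None of these names come from the
// Mesos source; they exist only to illustrate the bug described above.
enum class TaskStatus { UNREACHABLE, UNKNOWN };

struct ToyMaster {
  // Agents recovered from the registry after failover that have not
  // re-registered yet.
  std::set<std::string> recovered;

  // Persisted view: agents marked unreachable in the registry.
  std::set<std::string> registryUnreachable;

  // In-memory view: the only state that reconcile() consults.
  std::set<std::string> memoryUnreachable;

  // Invoked when a recovered agent misses the re-register timeout.
  // `updateMemory == false` models the behavior before this patch;
  // `updateMemory == true` models the behavior after it.
  void onReregisterTimeout(const std::string& agent, bool updateMemory) {
    // Only an agent that is still awaiting re-registration can time out.
    assert(recovered.count(agent) == 1);

    recovered.erase(agent);
    registryUnreachable.insert(agent);

    if (updateMemory) {
      memoryUnreachable.insert(agent);
    }
  }

  // Answers a reconciliation request from in-memory state only.
  TaskStatus reconcile(const std::string& agent) const {
    return memoryUnreachable.count(agent) == 1
      ? TaskStatus::UNREACHABLE
      : TaskStatus::UNKNOWN;
  }
};
```

In this model, calling `onReregisterTimeout("agent-1", false)` leaves the
agent in `registryUnreachable` while `reconcile("agent-1")` still reports
`TaskStatus::UNKNOWN`. Passing `true` keeps both views in step.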
Diffs
-----
src/master/master.hpp 6d2db9de52d35f3288c618d05138413ce709818b
src/master/master.cpp 3f3ce93155069dd32731783ac4877ba6ee2519c0
src/tests/master_tests.cpp 033fae336d107f16f7764b94117a9396df6cd80e
Diff: https://reviews.apache.org/r/53097/diff/
Testing
-------
`make check`
`./src/mesos-tests --gtest_filter="MasterTest.UnreachableTaskAfterFailover" --gtest_repeat=1000 --gtest_break_on_failure`
Thanks,
Neil Conway