-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70147/
-----------------------------------------------------------
(Updated March 7, 2019, 12:26 a.m.)


Review request for mesos, Benjamin Mahler, Gastón Kleiman, Joseph Wu, and Meng Zhu.


Bugs: MESOS-9460
    https://issues.apache.org/jira/browse/MESOS-9460


Repository: mesos


Description (updated)
-------

This patch adds a new `Sequence` data member to the master
which is used to prevent interleavings of master/allocator
state updates which could lead to inconsistent state in the
master and allocator actors.

For example, the following interleaving of events would
previously lead to inconsistent state between the master
and allocator:

1) Master receives a RESERVE operation for agent A via the
   operator API. This invokes `Master::apply()`, which calls
   `allocator->updateAvailable()` for agent A.

2) Master receives an `UpdateSlaveMessage` containing
   oversubscribed resources from agent A. The
   `Master::updateSlave()` handler invokes
   `allocator->updateSlave()` which uses _stale_ resources
   from the `Slave` struct to update the allocator's view
   of agent A's resources. Once that event is processed by
   the allocator, the allocator will not include the
   reserved resources in agent A's total.

3) After the `allocator->updateAvailable()` call from #1
   returns, `Master::_apply()` is invoked, which updates the
   `Slave` struct for agent A to include the reserved
   resources. The master and allocator's views of agent A's
   total resources are now inconsistent.


Diffs
-----

  src/master/master.hpp 90e08149ece595147ca4a93da215385917a0f372
  src/master/master.cpp b9db4ffd4ee8ea4a8e44a35d1afb6c1b8e03d74d


Diff: https://reviews.apache.org/r/70147/diff/1/


Testing
-------

`bin/mesos-tests.sh --gtest_filter="*SpeculativeOperationRacesWithUpdateSlaveMessage*" --gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann
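

Illustrative sketch (not part of the patch)
-------

The interleaving in the description can be reproduced in a small, single-threaded toy model. This is only a sketch under stated assumptions: `ToySequence`, `Views`, and `simulate()` are hypothetical names invented for illustration, resources are reduced to a single integer, and a plain `std::deque` stands in for the actors' event queues. It does not use the libprocess `Sequence` API and is not how the patch is implemented; it only shows why holding later updates until the earlier operation has fully finished keeps the two views consistent.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a sequence (NOT the libprocess API): runs queued
// operations one at a time. Each operation receives a `done` callback, and
// the next operation starts only after `done` has been invoked.
class ToySequence {
public:
  using Done = std::function<void()>;
  using Op = std::function<void(Done)>;

  void add(Op op) {
    pending.push_back(std::move(op));
    if (!running) next();
  }

private:
  void next() {
    if (pending.empty()) { running = false; return; }
    running = true;
    Op op = std::move(pending.front());
    pending.pop_front();
    op([this]() { next(); });
  }

  std::deque<Op> pending;
  bool running = false;
};

// Reserved resources for agent A as seen by each side (toy: a single int).
struct Views { int master; int allocator; };

Views simulate(bool sequenced) {
  int masterReserved = 0;     // stands in for the master's `Slave` struct
  int allocatorReserved = 0;  // the allocator's view of agent A
  std::deque<std::function<void()>> events;  // toy event loop
  ToySequence sequence;

  // Steps 1 and 3: update the allocator first, then update the master's
  // struct in a continuation that runs as a later event.
  auto apply = [&](ToySequence::Done done) {
    allocatorReserved = 2;                 // models allocator->updateAvailable()
    events.push_back([&masterReserved, done]() {
      masterReserved = 2;                  // models Master::_apply()
      done();
    });
  };

  // Step 2: push the master's current (possibly stale) view to the allocator.
  auto updateSlave = [&](ToySequence::Done done) {
    allocatorReserved = masterReserved;    // models allocator->updateSlave()
    done();
  };

  if (sequenced) {
    events.push_back([&]() { sequence.add(apply); });
    events.push_back([&]() { sequence.add(updateSlave); });
  } else {
    events.push_back([&]() { apply([]() {}); });
    events.push_back([&]() { updateSlave([]() {}); });
  }

  // Drain the event loop in arrival order.
  while (!events.empty()) {
    auto event = std::move(events.front());
    events.pop_front();
    event();
  }
  return {masterReserved, allocatorReserved};
}
```

In this model, `simulate(false)` follows the problematic ordering from steps 1 to 3 and ends with the master and allocator disagreeing, while `simulate(true)` defers the `updateSlave` step until the `_apply` continuation has run, so both views agree.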