Repository: mesos
Updated Branches:
  refs/heads/master 351829123 -> 52660fe6a

Handled race condition when removing maintenance windows.

When executing the `Master::inverseOffer()` callback, it could happen
that the maintenance window the inverse offer referred to was already
removed by a concurrent call to the maintenance endpoint of Mesos.

In this case, we must not send out an inverse offer, because having
outstanding inverse offers for an agent without any scheduled
maintenance window will lead to a crash in the allocator when
attempting to remove this offer.

Review: https://reviews.apache.org/r/67403/

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/52660fe6
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/52660fe6
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/52660fe6

Branch: refs/heads/master
Commit: 52660fe6ac7a205e168f3e03cddbff6e7c0de813
Parents: 3518291
Author: Benno Evers <[email protected]>
Authored: Mon Jun 4 11:29:49 2018 -0700
Committer: Joseph Wu <[email protected]>
Committed: Mon Jun 4 11:29:49 2018 -0700

----------------------------------------------------------------------
 src/master/master.cpp | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/52660fe6/src/master/master.cpp
----------------------------------------------------------------------
diff --git a/src/master/master.cpp b/src/master/master.cpp
index f778e48..5db5a8d 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -9456,8 +9456,20 @@ void Master::inverseOffer(
     // before the slave was deactivated in the allocator.
     if (!slave->active) {
       LOG(INFO)
-        << "Master ignoring inverse offers because agent " << *slave
-        << " is " << (slave->connected ? "deactivated" : "disconnected");
+        << "Master ignoring inverse offers to framework " << *framework
+        << " because agent " << *slave << " is "
+        << (slave->connected ? "deactivated" : "disconnected");
+
+      continue;
+    }
+
+    // This could happen if the allocator dispatched `Master::inverseOffer`
+    // before the unavailability was removed in the master.
+    if (!machines.contains(slave->machineId) ||
+        !machines.at(slave->machineId).info.has_unavailability()) {
+      LOG(INFO)
+        << "Master dropping inverse offers to framework " << *framework
+        << " because agent " << *slave << " had its unavailability revoked.";
 
       continue;
     }

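The guard added by this commit can be illustrated in isolation. The following is only a minimal sketch, not Mesos code: `Unavailability`, `Machine`, `Machines`, and `shouldSendInverseOffer` are hypothetical stand-ins for the master's bookkeeping. It assumes the same condition as the diff: an inverse offer is only sent if the machine is still known and still has an unavailability scheduled at the time the callback runs.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the master's bookkeeping; not Mesos types.
struct Unavailability
{
  long startNanos;
};

struct Machine
{
  // Empty once the maintenance window has been removed.
  std::optional<Unavailability> unavailability;
};

using Machines = std::unordered_map<std::string, Machine>;

// Returns true only if the machine is still known and still has a
// maintenance window scheduled when the callback is executed.
bool shouldSendInverseOffer(
    const Machines& machines,
    const std::string& machineId)
{
  auto it = machines.find(machineId);
  if (it == machines.end()) {
    // The machine entry was removed entirely.
    return false;
  }

  // The machine exists, but its window may have been revoked.
  return it->second.unavailability.has_value();
}
```

A caller would evaluate this check per agent immediately before building the inverse offer and skip the agent (the `continue` in the diff) when it returns false.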