Repository: mesos Updated Branches: refs/heads/master 53480e907 -> b4372acb1
Fixed handling of terminal operations in `updateSlave` handler. An offer operation can be become terminal between any previously received non-terminal offer operation status update and receiving an `UpdateSlaveMessage` (e.g., if the agent failed over, or when the agent was partitioned from the master). The master will in its offer operations status handler attempt to apply operations which became terminal since the last update. At the same time, the total resources in an `UpdateSlaveMessage` would already contain the result of applying the operation, and we need to prevent the master from attempting to apply the same operation twice. This patch updates the master handler for `UpdateSlaveMessage` to transition pending operations which are reported as terminal without also updating the resources on the agent as any update would already be reflected in the new total from the `UpdateSlaveMessage. Review: https://reviews.apache.org/r/65072/ Project: http://git-wip-us.apache.org/repos/asf/mesos/repo Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b4372acb Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b4372acb Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b4372acb Branch: refs/heads/master Commit: b4372acb1133973e605c9c618e144ba3f4f74c16 Parents: 24c7104 Author: Benjamin Bannier <benjamin.bann...@mesosphere.io> Authored: Fri Jan 12 15:40:46 2018 -0800 Committer: Greg Mann <gregorywm...@gmail.com> Committed: Fri Jan 12 16:53:50 2018 -0800 ---------------------------------------------------------------------- src/master/master.cpp | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/mesos/blob/b4372acb/src/master/master.cpp ---------------------------------------------------------------------- diff --git a/src/master/master.cpp b/src/master/master.cpp index a207819..b0ec565 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -7583,6 +7583,39 @@ void Master::updateSlave(UpdateSlaveMessage&& message) ->CopyFrom(providerId.get()); } } + + // If a known operation became terminal between any previous offer + // operation status update and this `UpdateSlaveMessage`, the total + // resources we were sent already had the operation applied. We need + // to update the state of the operation to terminal here so that any + // update sent by the agent later does not cause us to apply the + // operation again. + if (provider.newOperations.isSome() && + provider.newOperations->contains(uuid)) { + Option<Operation> oldOperation = provider.oldOperations->get(uuid); + Option<Operation> newOperation = provider.newOperations->get(uuid); + + CHECK_SOME(oldOperation); + CHECK_SOME(newOperation); + + if (!protobuf::isTerminalState( + oldOperation->latest_status().state()) && + protobuf::isTerminalState( + newOperation->latest_status().state())) { + Operation* operation = CHECK_NOTNULL(slave->getOperation(uuid)); + + UpdateOperationStatusMessage update = + protobuf::createUpdateOperationStatusMessage( + uuid, + newOperation->latest_status(), + newOperation->latest_status(), + operation->framework_id(), + operation->slave_id()); + + updateOperation( + operation, update, false); // Do not update resources. + } + } } }