[ https://issues.apache.org/jira/browse/MESOS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196870#comment-16196870 ]
Benjamin Bannier edited comment on MESOS-8058 at 10/10/17 9:15 PM:
-------------------------------------------------------------------
Review: https://reviews.apache.org/r/62868/
was (Author: bbannier):
Reviews:
https://reviews.apache.org/r/62843/
https://reviews.apache.org/r/62834/
https://reviews.apache.org/r/62847/
> Agent and master can race when updating agent state
> ---------------------------------------------------
>
> Key: MESOS-8058
> URL: https://issues.apache.org/jira/browse/MESOS-8058
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Affects Versions: 1.5.0
> Reporter: Benjamin Bannier
> Assignee: Benjamin Bannier
> Priority: Critical
> Labels: mesosphere
>
> In {{2af9a5b07dc80151154264e974d03f56a1c25838}} we introduced the use of
> {{UpdateSlaveMessage}} for the agent to inform the master about its current
> total resources. Currently we trigger this message only on agent registration
> and reregistration.
> This can race with operations applied in the master and communicated via
> {{CheckpointResourcesMessage}}.
> Example:
> 1. Agent ({{cpus:4(\*)}}) registers.
> 2. Master is triggered to apply an operation to the agent's resources, e.g.,
> a reservation: {{cpus:4(\*) -> cpus:4(A)}}. The master applies the operation
> to its current view of the agent's resources and sends the agent a
> {{CheckpointResourcesMessage}} so the agent can persist the result.
> 3. The agent sends the master an {{UpdateSlaveMessage}}, e.g., {{cpus:4(\*)}}
> since it hasn't received the {{CheckpointResourcesMessage}} yet.
> 4. The master processes the {{UpdateSlaveMessage}} and updates its view of
> the agent's resources to be {{cpus:4(\*)}}.
> 5. The agent processes the {{CheckpointResourcesMessage}} and updates its
> view of its resources to be {{cpus:4(A)}}.
> 6. The agent and the master now have inconsistent views of the agent's
> resources: the master sees {{cpus:4(\*)}}, the agent sees {{cpus:4(A)}}.
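
The six steps above can be replayed with a small simulation. This is only a sketch under stated assumptions: the `Master` and `Agent` classes, their methods, and the in-memory inboxes are hypothetical stand-ins and do not mirror the actual Mesos implementation. Only the message names and the resource strings are taken from the description.

```python
from collections import deque


class Master:
    """Hypothetical stand-in for the master; not the real Mesos class."""

    def __init__(self):
        self.agent_resources = None  # master's view of the agent's resources
        self.inbox = deque()         # messages in flight to the master

    def handle_register(self, resources):
        self.agent_resources = resources

    def apply_operation(self, agent, new_resources):
        # Apply the operation to the master's current view, then ask the
        # agent to persist the result.
        self.agent_resources = new_resources
        agent.inbox.append(("CheckpointResourcesMessage", new_resources))

    def process_one(self):
        kind, resources = self.inbox.popleft()
        if kind == "UpdateSlaveMessage":
            # The master overwrites its view with the agent's reported total.
            self.agent_resources = resources


class Agent:
    """Hypothetical stand-in for the agent; not the real Mesos class."""

    def __init__(self, resources):
        self.resources = resources
        self.inbox = deque()         # messages in flight to the agent

    def register(self, master):
        master.handle_register(self.resources)

    def send_update(self, master):
        # Sent from the agent's current view, which may be stale.
        master.inbox.append(("UpdateSlaveMessage", self.resources))

    def process_one(self):
        kind, resources = self.inbox.popleft()
        if kind == "CheckpointResourcesMessage":
            self.resources = resources


def simulate_race():
    master = Master()
    agent = Agent("cpus:4(*)")
    agent.register(master)                      # step 1
    master.apply_operation(agent, "cpus:4(A)")  # step 2
    agent.send_update(master)                   # step 3: checkpoint not yet seen
    master.process_one()                        # step 4
    agent.process_one()                         # step 5
    return master.agent_resources, agent.resources  # step 6


master_view, agent_view = simulate_race()
print("master:", master_view, "agent:", agent_view)
```

In this sketch the two views diverge because each side processes a message that was computed from state the other side has since changed. Swapping steps 3 and 5, so that the agent processes the checkpoint before sending its update, leaves both views at `cpus:4(A)`.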