[
https://issues.apache.org/jira/browse/MESOS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707362#comment-16707362
]
Benjamin Bannier commented on MESOS-9318:
-----------------------------------------
The flow for a possible fix could be:
* master sees a reconciliation request for an operation on some resource
provider on a registered agent
* master forwards the reconciliation request to the agent
* agent forwards it to its resource provider manager
* resource provider manager either sends a {{ReconcileOperations}} event to the
registered resource provider, or responds with an {{OPERATION_UNREACHABLE}} for
a resource provider which is not subscribed. It could also respond with some
status for resource providers marked gone, see MESOS-8403.
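
The last step of the flow above, the decision made by the resource provider manager, could be sketched roughly as follows. This is only an illustration in Python, not Mesos code: {{ProviderState}}, {{Reply}}, {{handle_reconciliation}} and the placeholder status for gone providers are hypothetical names, and treating an unknown resource provider like an unsubscribed one is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ProviderState(Enum):
    """Hypothetical view the manager keeps of each resource provider."""
    SUBSCRIBED = "subscribed"
    DISCONNECTED = "disconnected"
    GONE = "gone"


@dataclass
class Reply:
    """Either an event forwarded to the provider or a status sent back."""
    forward_event: Optional[str] = None
    operation_status: Optional[str] = None


# Placeholder only: which status to report for gone providers is left
# open in the comment above (see MESOS-8403).
GONE_STATUS_PLACEHOLDER = "STATUS_FOR_GONE_PROVIDER_TBD"


def handle_reconciliation(
    providers: Dict[str, ProviderState], provider_id: str
) -> Reply:
    """Decide how the manager answers a forwarded reconciliation request."""
    # Assumption: a provider the manager has never seen is treated the
    # same way as one that is known but not currently subscribed.
    state = providers.get(provider_id, ProviderState.DISCONNECTED)

    if state is ProviderState.SUBSCRIBED:
        # The provider is reachable, so let it answer for its operations.
        return Reply(forward_event="ReconcileOperations")

    if state is ProviderState.GONE:
        return Reply(operation_status=GONE_STATUS_PLACEHOLDER)

    # Not subscribed: the manager answers on the provider's behalf.
    return Reply(operation_status="OPERATION_UNREACHABLE")


if __name__ == "__main__":
    known = {
        "rp-1": ProviderState.SUBSCRIBED,
        "rp-2": ProviderState.DISCONNECTED,
    }
    print(handle_reconciliation(known, "rp-1"))
    print(handle_reconciliation(known, "rp-2"))
```

Answering in the manager for unsubscribed providers means the framework still gets an explicit status while the provider is recovering, rather than a generic unknown.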
> Consider providing better operation status updates while an RP is recovering
> ----------------------------------------------------------------------------
>
> Key: MESOS-9318
> URL: https://issues.apache.org/jira/browse/MESOS-9318
> Project: Mesos
> Issue Type: Task
> Affects Versions: 1.6.0, 1.7.0
> Reporter: Gastón Kleiman
> Priority: Major
> Labels: mesosphere, operation-feedback
>
> Consider the following scenario:
> 1. A framework accepts an offer with an operation affecting SLRP resources.
> 2. The master forwards it to the corresponding agent.
> 3. The agent forwards it to the corresponding RP.
> 4. The agent and the master fail over.
> 5. The master recovers.
> 6. The agent recovers while the RP is still recovering, so it doesn't include
> the pending operation in the {{RegisterMessage}}.
> 7. A framework performs an explicit operation status reconciliation.
> In this case the master will currently respond with {{OPERATION_UNKNOWN}},
> but it should be possible to respond with a more fine-grained and useful
> state, such as {{OPERATION_RECOVERING}}.