[ 
https://issues.apache.org/jira/browse/MESOS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707362#comment-16707362
 ] 

Benjamin Bannier commented on MESOS-9318:
-----------------------------------------

The flow for a possible fix could be:
* master sees a reconciliation request for an operation on some resource 
provider on a registered agent
* master forwards the reconciliation request to the agent
* agent forwards it to its resource provider manager
* resource provider manager either sends a {{ReconcileOperations}} event to the 
registered resource provider, or responds with an {{OPERATION_UNREACHABLE}} for 
a resource provider which is not subscribed. It could also respond with some 
status for resource providers marked gone, see MESOS-8403.
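
The flow above could be sketched roughly as follows. This is an illustrative Python model only, not Mesos code: the class names, method names, and {{ProviderState}} values are hypothetical, and {{GONE_PROVIDER_STATUS}} is a placeholder because the status for gone resource providers is still undecided. The sketch also assumes that an unknown resource provider is treated like one that is not subscribed, and it does not model agents that are not registered.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple

# Placeholder only: the status for gone providers is left open (see MESOS-8403).
GONE_PROVIDER_STATUS = "<status for gone provider, undecided>"


class ProviderState(Enum):
    """Hypothetical states a resource provider can be in, as seen by the manager."""
    SUBSCRIBED = "subscribed"
    NOT_SUBSCRIBED = "not subscribed"
    GONE = "gone"


@dataclass
class ResourceProviderManager:
    providers: Dict[str, ProviderState] = field(default_factory=dict)
    # Events sent to subscribed providers: (event name, provider id, operation id).
    sent: List[Tuple[str, str, str]] = field(default_factory=list)

    def reconcile(self, provider_id: str, operation_id: str) -> Optional[str]:
        # Assumption: an unknown provider is handled like an unsubscribed one.
        state = self.providers.get(provider_id, ProviderState.NOT_SUBSCRIBED)
        if state is ProviderState.SUBSCRIBED:
            # The provider answers the reconciliation itself; nothing to return here.
            self.sent.append(("ReconcileOperations", provider_id, operation_id))
            return None
        if state is ProviderState.GONE:
            return GONE_PROVIDER_STATUS
        return "OPERATION_UNREACHABLE"


@dataclass
class Agent:
    manager: ResourceProviderManager

    def reconcile(self, provider_id: str, operation_id: str) -> Optional[str]:
        # The agent only forwards the request to its resource provider manager.
        return self.manager.reconcile(provider_id, operation_id)


@dataclass
class Master:
    # Registered agents only; unregistered agents are outside this sketch.
    agents: Dict[str, Agent] = field(default_factory=dict)

    def reconcile(
        self, agent_id: str, provider_id: str, operation_id: str
    ) -> Optional[str]:
        # Raises KeyError for agents that are not registered (not modeled).
        return self.agents[agent_id].reconcile(provider_id, operation_id)
```

In this model, a request for a subscribed provider ends up as a recorded {{ReconcileOperations}} event, while a request for a provider that is not subscribed gets an immediate {{OPERATION_UNREACHABLE}} answer from the manager.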

> Consider providing better operation status updates while an RP is recovering
> ----------------------------------------------------------------------------
>
>                 Key: MESOS-9318
>                 URL: https://issues.apache.org/jira/browse/MESOS-9318
>             Project: Mesos
>          Issue Type: Task
>    Affects Versions: 1.6.0, 1.7.0
>            Reporter: Gastón Kleiman
>            Priority: Major
>              Labels: mesosphere, operation-feedback
>
> Consider the following scenario:
> 1. A framework accepts an offer with an operation affecting SLRP resources.
> 2. The master forwards it to the corresponding agent.
> 3. The agent forwards it to the corresponding RP.
> 4. The agent and the master fail over.
> 5. The master recovers.
> 6. The agent recovers while the RP is still recovering, so it doesn't include 
> the pending operation on the {{RegisterMessage}}.
> 7. A framework performs an explicit operation status reconciliation.
> In this case the master will currently respond with {{OPERATION_UNKNOWN}}, 
> but it should be possible to respond with a more fine-grained and useful 
> state, such as {{OPERATION_RECOVERING}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
