Neil Conway created MESOS-6206:
----------------------------------

             Summary: Change reconciliation to return results for in-progress 
removals and reregistrations
                 Key: MESOS-6206
                 URL: https://issues.apache.org/jira/browse/MESOS-6206
             Project: Mesos
          Issue Type: Bug
          Components: master
            Reporter: Neil Conway
            Assignee: Neil Conway


The master does not return any reconciliation results for agents it views as 
"transitioning". An agent is defined as transitioning if any of the following 
is true:

1. The master recovered from the registry after failover but the agent has not 
yet reregistered
2. The master is in the process of removing an admitted agent from the registry
3. The master is in the process of reregistering an agent (i.e., re-adding it 
to the list of admitted agents)
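
The current behavior described above can be sketched roughly as follows. This 
is only an illustrative model, not the master's actual code: the class 
MasterView, its three sets, and the function reconcile are hypothetical names 
invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class MasterView:
    """Hypothetical, simplified view of the master's agent bookkeeping."""
    recovered: Set[str] = field(default_factory=set)      # case 1
    removing: Set[str] = field(default_factory=set)       # case 2
    reregistering: Set[str] = field(default_factory=set)  # case 3

    def is_transitioning(self, agent_id: str) -> bool:
        # An agent is transitioning if any one of the three cases applies.
        return (agent_id in self.recovered
                or agent_id in self.removing
                or agent_id in self.reregistering)


def reconcile(view: MasterView, agent_id: str, known_state: str) -> Optional[str]:
    """Return the state to report, or None when the master stays silent."""
    if view.is_transitioning(agent_id):
        return None  # current behavior: no result for transitioning agents
    return known_state
```

In this model, a framework that reconciles against an agent in any of the 
three sets gets no answer and has to ask again later.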

I think case #1 makes sense but cases #2 and #3 do not. Before the registry 
operation completes, we should instead view the agent as still being in its 
previous state ("admitted" for case #2 and not-admitted/unreachable/etc. for 
case #3).
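
A minimal sketch of the proposed rule, assuming a simplified model in which 
each agent has a previous state and at most one pending registry operation. 
The enum Pending and the function reported_state are hypothetical names used 
only for illustration; they are not part of Mesos.

```python
from enum import Enum
from typing import Optional


class Pending(Enum):
    """Hypothetical marker for why an agent counts as transitioning."""
    NONE = "none"
    RECOVERED = "recovered"          # case 1: not yet reregistered after failover
    REMOVING = "removing"            # case 2: registry removal in progress
    REREGISTERING = "reregistering"  # case 3: registry re-admission in progress


def reported_state(previous_state: str, pending: Pending) -> Optional[str]:
    """Return the agent state to use for reconciliation, or None for no result."""
    if pending is Pending.RECOVERED:
        # Case 1 is unchanged: the master still returns nothing.
        return None
    # Cases 2 and 3: until the registry operation completes, report the
    # agent's previous state, even though it may soon be stale.
    return previous_state
```

For example, reported_state("admitted", Pending.REMOVING) yields "admitted", 
and reported_state("unreachable", Pending.REREGISTERING) yields "unreachable".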

Reasons to make this change:
1. Improve consistency with the output of the HTTP endpoints: until the 
registry operation to remove/re-admit an agent finishes, we show the previous 
state of the agent in the HTTP endpoints. Returning reconciliation results that 
are consistent with the HTTP endpoint values is sensible.
2. It is simpler. Rather than not sending anything to frameworks and requiring 
that they ask us again later, it is simpler to just send the current state of 
the agent. If that state changes (whether due to the registry operation 
succeeding or a subsequent state change), then the reconciliation results might 
be stale -- so be it. Such stale information fundamentally cannot be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
