Neil Conway created MESOS-6206:
----------------------------------

             Summary: Change reconciliation to return results for in-progress removals and reregistrations
                 Key: MESOS-6206
                 URL: https://issues.apache.org/jira/browse/MESOS-6206
             Project: Mesos
          Issue Type: Bug
          Components: master
            Reporter: Neil Conway
            Assignee: Neil Conway

The master does not return any reconciliation results for agents it views as "transitioning". An agent is defined as transitioning if any of the following is true:

1. The master recovered from the registry after failover, but the agent has not yet reregistered.
2. The master is in the process of removing an admitted agent from the registry.
3. The master is in the process of reregistering an agent (i.e., re-adding it to the list of admitted agents).

I think case #1 makes sense but cases #2 and #3 do not. Before the registry operation completes, we should instead view the agent as still being in its previous state ("admitted" for case #2 and not-admitted/unreachable/etc. for case #3).

Reasons to make this change:

1. Improve consistency with the output of endpoints, etc.: until the registry operation to remove/re-admit an agent finishes, we show the previous state of the agent in the HTTP endpoints. Returning reconciliation results that are consistent with the HTTP endpoint values is sensible.
2. It is simpler. Rather than not sending anything to frameworks and requiring that they ask us again later, it is simpler to just send the current state of the agent. If that state changes (whether due to the registry operation succeeding or a subsequent state change), then the reconciliation results might be stale; so be it. Such stale information fundamentally cannot be avoided.
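
To make the proposed rule concrete, here is a minimal sketch in Python. It is not the Mesos master code (which is C++), and every name in it (`AgentState`, `PendingOp`, `state_for_reconciliation`) is hypothetical and invented for illustration. It assumes the master tracks, per agent, the last state committed to the registry plus at most one in-flight registry operation. The function returns `None` when no reconciliation result should be sent (case #1), and otherwise returns the agent's previous, still-committed state (cases #2 and #3).

```python
from enum import Enum
from typing import Optional


class AgentState(Enum):
    # Hypothetical labels for the states named in this ticket.
    ADMITTED = "admitted"
    UNREACHABLE = "unreachable"
    NOT_ADMITTED = "not-admitted"


class PendingOp(Enum):
    # Hypothetical labels for what the master is currently doing with the agent.
    NONE = "none"
    RECOVERED_NOT_REREGISTERED = "recovered"  # case #1
    REMOVING = "removing"                     # case #2
    REREGISTERING = "reregistering"           # case #3


def state_for_reconciliation(
    committed: AgentState, pending: PendingOp
) -> Optional[AgentState]:
    """Return the state to report to a framework, or None to send nothing.

    Sketch of the proposed behavior only: an in-flight removal or
    reregistration does not suppress the result; the last committed
    state is reported until the registry operation completes.
    """
    if pending is PendingOp.RECOVERED_NOT_REREGISTERED:
        # Case #1: after failover the master cannot yet tell what state
        # the agent is in, so it stays silent.
        return None
    # Cases #2 and #3 (and the steady state): report the previous state.
    return committed


if __name__ == "__main__":
    for committed, pending in [
        (AgentState.ADMITTED, PendingOp.RECOVERED_NOT_REREGISTERED),
        (AgentState.ADMITTED, PendingOp.REMOVING),
        (AgentState.UNREACHABLE, PendingOp.REREGISTERING),
    ]:
        print(committed.value, pending.value,
              state_for_reconciliation(committed, pending))
```

Under this model the result may be stale by the time the framework acts on it, for example if the removal commits right after the reply is sent. That matches the trade-off accepted in reason 2.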