[ 
https://issues.apache.org/jira/browse/MESOS-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-6090:
-------------------------------
    Description: 
When a new slave attempts to register, the registry is updated first, then the 
master's in-memory state is updated if the registry operation is applied 
successfully. However, when a slave is removed or marked unreachable, the 
master first updates its in-memory state, then updates the registry. This has 
two problems:

1. It makes it harder to reason about the correctness of concurrent operations 
that read in-memory state and update the registry.
2. It can leak incorrect information via the HTTP endpoints. That is, if we 
update the master's in-memory state when removing a slave or marking it 
unreachable, that change is immediately observable via the HTTP endpoints. If 
the master then fails over before the registry operation succeeds, the 
information already returned via the endpoints was incorrect. The master has 
special code to avoid this inaccuracy for reconciliation (see 
{{Master::transitioning()}}), but not for the endpoints.

I think it is simpler to just always update the registry first.
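
A minimal sketch of the proposed ordering, assuming a synchronous registry 
for brevity (the real registrar is asynchronous). The {{Registry}} and 
{{Master}} types, their members, and the {{failWrites}} flag are hypothetical 
stand-ins for illustration, not the actual Mesos classes:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the replicated registry; not the Mesos API.
struct Registry {
  std::set<std::string> admitted;
  std::set<std::string> unreachable;
  bool failWrites = false;  // Simulates a registry operation that fails.

  // Returns true only if the operation was applied.
  bool markUnreachable(const std::string& id) {
    if (failWrites || admitted.count(id) == 0) {
      return false;
    }
    admitted.erase(id);
    unreachable.insert(id);
    return true;
  }
};

// Hypothetical master; the two sets model the in-memory state that the
// HTTP endpoints would expose.
struct Master {
  Registry& registry;
  std::set<std::string> registered;
  std::set<std::string> unreachable;

  explicit Master(Registry& r) : registry(r) {}

  bool markUnreachable(const std::string& id) {
    // Step 1: update the registry first.
    if (!registry.markUnreachable(id)) {
      // The in-memory state is untouched, so the endpoints never expose
      // a transition that the registry did not record.
      return false;
    }

    // Step 2: only now update the in-memory state.
    assert(registry.unreachable.count(id) == 1);
    registered.erase(id);
    unreachable.insert(id);
    return true;
  }
};
```

With this ordering, a failed registry write leaves the slave in 
{{registered}}, so a reader of the in-memory state sees the same thing a 
newly elected master would recover from the registry.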

  was:
When a new slave attempts to register, the registry is updated first, then the 
master's in-memory state is updated if the registry operation is applied 
successfully. However, when a slave is removed or reregisters, the master first 
updates its in-memory state, then updates the registry. This has two problems:

1. It makes it harder to reason about the correctness of concurrent operations 
that read in-memory state and update the registry.
2. It can leak incorrect information via the HTTP endpoints. That is, if we 
update the master's in-memory state on removal or reregistration, that change 
will be observable via the HTTP endpoints. If the master then fails over (and 
the registry operation fails), the information returned via the endpoint will 
be incorrect. The master has special code to avoid this inaccuracy for 
reconciliation (see {{Master::transitioning()}}), but not for the endpoints.

I think it is simpler to just always update the registry first.


> Change master to always update registry before in-memory state
> --------------------------------------------------------------
>
>                 Key: MESOS-6090
>                 URL: https://issues.apache.org/jira/browse/MESOS-6090
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Neil Conway
>            Assignee: Neil Conway
>              Labels: mesosphere
>
> When a new slave attempts to register, the registry is updated first, then 
> the master's in-memory state is updated if the registry operation is applied 
> successfully. However, when a slave is removed or marked unreachable, the 
> master first updates its in-memory state, then updates the registry. This has 
> two problems:
>
> 1. It makes it harder to reason about the correctness of concurrent 
> operations that read in-memory state and update the registry.
> 2. It can leak incorrect information via the HTTP endpoints. That is, if we 
> update the master's in-memory state when removing a slave or marking it 
> unreachable, that change is immediately observable via the HTTP endpoints. 
> If the master then fails over before the registry operation succeeds, the 
> information already returned via the endpoints was incorrect. The master 
> has special code to avoid this inaccuracy for reconciliation (see 
> {{Master::transitioning()}}), but not for the endpoints.
>
> I think it is simpler to just always update the registry first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
