[
https://issues.apache.org/jira/browse/MESOS-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod Kone updated MESOS-4712:
------------------------------
Comment: was deleted
(was: test)
> Remove 'force' field from the Subscribe Call in v1 Scheduler API
> ----------------------------------------------------------------
>
> Key: MESOS-4712
> URL: https://issues.apache.org/jira/browse/MESOS-4712
> Project: Mesos
> Issue Type: Task
> Reporter: Vinod Kone
> Assignee: Vinod Kone
> Fix For: 0.28.0
>
>
> We/I introduced the `force` field in the SUBSCRIBE call to deal with scheduler
> partition cases. Having thought about it a bit more and discussed it with a few
> other folks ([~anandmazumdar], [~greggomann]), I think we can get away with not
> having that field in the v1 API. The obvious advantage of removing the field
> is that framework devs don't have to think about how/when to set the field
> (the current semantics are a bit confusing).
> The new workflow when a master receives a SUBSCRIBE call is that the master
> always accepts the call and closes any existing connection (after sending an
> ERROR event on it) from the same scheduler (identified by framework id).
> The expectation from schedulers is that they must close the old subscribe
> connection before sending a new SUBSCRIBE call.
> Let's look at some tricky scenarios and see how this works and why it is safe.
> 1) Connection disconnection @ the scheduler but not @ the master
>
> Scheduler sees the disconnection and sends a new SUBSCRIBE call. Master sends
> ERROR on the old connection (won't be received by the scheduler because the
> connection is already closed) and closes it.
> 2) Connection disconnection @ master but not @ scheduler
> Scheduler realizes this from lack of HEARTBEAT events. It then closes its
> existing connection and sends a new SUBSCRIBE call. Master accepts the new
> SUBSCRIBE call. There is no old connection to close on the master as it is
> already closed.
> 3) Scheduler failover but no disconnection @ master
> Newly elected scheduler sends a SUBSCRIBE call. Master sends an ERROR event on
> the old connection (won't be received because the old scheduler failed over)
> and closes it.
> 4) Scheduler A got partitioned (but is alive and connected with master)
> and Scheduler B got elected as the new leader
> When Scheduler B sends SUBSCRIBE, master sends ERROR and closes the
> connection from Scheduler A. Master accepts Scheduler B's connection.
> Typically Scheduler A aborts after receiving ERROR and gets restarted. After
> restart it won't become the leader because Scheduler B is already elected.
> 5) Scheduler sends SUBSCRIBE, times out, closes the SUBSCRIBE connection (A)
> and sends a new SUBSCRIBE (B). Master receives SUBSCRIBE (B) and then
> receives SUBSCRIBE (A) but doesn't see A's disconnection yet.
> Master first accepts SUBSCRIBE (B). After it receives SUBSCRIBE (A), it sends
> ERROR to SUBSCRIBE (B) and closes that connection. When it accepts SUBSCRIBE
> (A) and tries to send the SUBSCRIBED event, the connection closure is detected.
> Scheduler retries the SUBSCRIBE connection after a backoff. I think this is a
> rare enough race that it is unlikely to happen continuously in a loop.
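
The master-side workflow and scenarios 4) and 5) above can be sketched as a small toy model. This is only an illustration under stated assumptions, not Mesos code: `Connection`, `Master`, and `subscribe` are hypothetical names, and a failed `send` stands in for the master detecting a closure it had not yet noticed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Connection:
    """Hypothetical stand-in for one persistent SUBSCRIBE connection."""
    name: str
    open: bool = True
    events: List[str] = field(default_factory=list)

    def send(self, event: str) -> bool:
        # Writing to a closed connection fails; this models the master
        # detecting a closure only when it tries to send an event.
        if not self.open:
            return False
        self.events.append(event)
        return True

    def close(self) -> None:
        self.open = False


class Master:
    """Toy model of the proposed SUBSCRIBE handling (no `force` field)."""

    def __init__(self) -> None:
        # framework id -> currently subscribed connection
        self.subscribed: Dict[str, Connection] = {}

    def subscribe(self, framework_id: str, conn: Connection) -> bool:
        # Always accept the new call: drop any existing connection for
        # the same framework id, sending ERROR on it first.
        old = self.subscribed.pop(framework_id, None)
        if old is not None and old is not conn:
            old.send("ERROR")  # lost if the peer already went away
            old.close()
        if not conn.send("SUBSCRIBED"):
            return False  # closure detected; the scheduler must retry
        self.subscribed[framework_id] = conn
        return True


# Scenario 4: Scheduler A is still connected when Scheduler B subscribes.
master = Master()
a, b = Connection("A"), Connection("B")
master.subscribe("fw", a)
master.subscribe("fw", b)  # A receives ERROR and is closed; B is accepted

# Scenario 5: SUBSCRIBE (A) arrives after SUBSCRIBE (B), and the
# scheduler has already closed connection A on its side.
master = Master()
conn_a, conn_b = Connection("A"), Connection("B")
conn_a.close()
master.subscribe("fw", conn_b)       # B accepted first
ok = master.subscribe("fw", conn_a)  # B gets ERROR; sending on A fails
retry = Connection("retry")
master.subscribe("fw", retry)        # scheduler retries after a backoff
```

In this model the scheduler in scenario 5 ends up with no subscription until its retry goes through, which matches the recovery described above.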
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)