Benjamin Mahler created MESOS-1799:
--------------------------------------

             Summary: Reconciliation can send out-of-order updates.
                 Key: MESOS-1799
                 URL: https://issues.apache.org/jira/browse/MESOS-1799
             Project: Mesos
          Issue Type: Bug
          Components: master, slave
            Reporter: Benjamin Mahler


When a slave re-registers with the master, it currently sends the latest task 
state for all tasks that are not both terminal and acknowledged.

However, reconciliation assumes that the master always holds the latest 
unacknowledged state of each task.

As a result, out-of-order updates are possible, e.g.

(1) Slave has task T in TASK_FINISHED, with unacknowledged updates: 
[TASK_RUNNING, TASK_FINISHED].
(2) Master fails over.
(3) New master re-registers the slave with T in TASK_FINISHED.
(4) Reconciliation request arrives, master sends TASK_FINISHED to the framework.
(5) Slave sends TASK_RUNNING to master, master forwards TASK_RUNNING to the 
framework, after it has already seen TASK_FINISHED.
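
To make the ordering concrete, here is a toy model of steps (1)-(5). It is 
only a sketch: the variable names are made up for illustration and none of 
this is actual Mesos code. It assumes the slave retries its oldest 
unacknowledged update first.

```python
from collections import deque

# Toy model of steps (1)-(5); all names are hypothetical, not Mesos APIs.

# (1) Slave has task T in TASK_FINISHED with two unacknowledged updates.
pending = deque(["TASK_RUNNING", "TASK_FINISHED"])
latest_state = pending[-1]

framework_sees = []

# (2)-(3) Master fails over; the slave re-registers with the latest state.
master_state = latest_state

# (4) Reconciliation request: the master answers from its own view of T.
framework_sees.append(master_state)

# (5) The slave retries its oldest unacknowledged update; master forwards it.
framework_sees.append(pending[0])

print(framework_sees)  # ['TASK_FINISHED', 'TASK_RUNNING']
```

The framework observes a terminal state followed by a non-terminal one, 
which is the out-of-order sequence described above.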

I think the fix here is to preserve the task state invariants in the master, 
namely, that the master has the latest unacknowledged state of the task. This 
means when the slave re-registers, it should instead send the latest 
unacknowledged state of each task.
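
A rough sketch of the proposed change, under one loudly stated assumption: 
"latest unacknowledged state" is read here as the state carried by the 
update the slave is currently waiting to have acknowledged, i.e. the head of 
its pending queue. The helper names are hypothetical and this is not the 
actual Mesos implementation.

```python
from collections import deque

def state_for_reregistration(latest_state, pending):
    """Pick the state a slave reports for one task (hypothetical helper)."""
    # Assumption: the update awaiting acknowledgement is the head of the
    # queue; with nothing pending, fall back to the latest task state.
    return pending[0] if pending else latest_state

def framework_view(reported_state, pending):
    # The reconciliation answer comes first, then the retried updates in order.
    return [reported_state] + list(pending)

pending = deque(["TASK_RUNNING", "TASK_FINISHED"])

current = framework_view("TASK_FINISHED", pending)
proposed = framework_view(
    state_for_reregistration("TASK_FINISHED", pending), pending)

print(current)   # ['TASK_FINISHED', 'TASK_RUNNING', 'TASK_FINISHED']
print(proposed)  # ['TASK_RUNNING', 'TASK_RUNNING', 'TASK_FINISHED']
```

Under this reading the framework may see a duplicate TASK_RUNNING, but it 
never sees TASK_RUNNING after TASK_FINISHED.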



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
