[
https://issues.apache.org/jira/browse/MESOS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165464#comment-14165464
]
Vinod Kone commented on MESOS-1799:
-----------------------------------
Here is my proposal on how to do this (h/t [~bmahler] for participating in
discussions).
Problem:
-------------
The "state of the task" could be different according to different components of
Mesos (framework, master, slave).
Framework: It is the latest unacknowledged state sent by the slave's status
update manager (SUM), via the master.
Master:
1) In steady state, it is the latest unacked state sent by the SUM
2) On slave re-registration it is the latest state known to the slave (not
necessarily the latest unacked state)
Slave: It is the latest state reported by the executor (not necessarily the
latest unacked state).
Due to these discrepancies we have issues like MESOS-1799 (this ticket) and
MESOS-1817 (linked ticket).
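
To make the discrepancy concrete, here is a toy model of the sequence from the
ticket description. It is only a sketch: the function and variable names are
made up for illustration and none of this is Mesos code. It assumes the SUM
retries the oldest unacknowledged update first, as in step (5) of the ticket.

```python
# Toy model of the out-of-order scenario in MESOS-1799.
# All names are hypothetical; this is not Mesos code.
from collections import deque


def simulate_current_behavior():
    """Return the updates the framework sees, in the order it sees them."""
    # Slave side: the SUM holds the unacknowledged updates for task T, in order.
    unacked = deque(["TASK_RUNNING", "TASK_FINISHED"])
    latest_on_slave = unacked[-1]  # latest state reported by the executor

    seen_by_framework = []

    # Master fails over. On re-registration the slave reports only the
    # latest state, so that is all the new master knows about T.
    master_state = latest_on_slave

    # A reconciliation request arrives; the master answers with what it has.
    seen_by_framework.append(master_state)

    # The SUM then retries its oldest unacknowledged update via the master.
    seen_by_framework.append(unacked[0])

    return seen_by_framework


print(simulate_current_behavior())
```

The framework sees TASK_FINISHED before TASK_RUNNING, which is the
out-of-order delivery this ticket is about.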
Solution
-----------
The high-level idea is that we want the master (and its UI) and slave (and its
UI) to reflect the latest state of the task, *irrespective* of the
unacknowledged state, i.e., what the framework knows. As far as the framework
is concerned, it will always know the latest unacked state, because we want to
maintain in-order delivery of updates for a task.
Details
----------
--> Master keeps two pieces of information about a Task: the latest state and
the latest unacked state.
--> SUM relays this information (latest and unacked state) when sending an
update. It already knows this from its queues.
--> When a slave re-registers, the slave first asks the SUM about the latest
unacked update for every task that has pending updates. We will add a new API
method to SUM. It then includes both the latest state (already known to the
slave) and the unacked state (got via SUM) in its Reregister message.
--> This guarantees that during steady state or master failover or slave
re-registration, master always correctly knows the latest state and unacked
state of the task.
--> Whenever a reconciliation request comes to the master, master informs the
framework with the latest unacked state *irrespective* of the latest state.
This guarantees there are no out of order updates.
--> Master and slave will take actions based on the latest state of the task,
e.g., free up resources when the latest state of the task is terminal.
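
The bookkeeping described in the list above could look roughly like the
following. This is a minimal sketch under stated assumptions, not the actual
change: `TaskInfo`, `Master`, and every method name are hypothetical, the set
of terminal states is assumed, and a task with no pending updates is assumed
to report the same value for both states.

```python
# Hypothetical sketch of the proposal; names are illustrative, not Mesos APIs.
from dataclasses import dataclass

# Assumed set of terminal task states, for illustration only.
TERMINAL = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


@dataclass
class TaskInfo:
    latest_state: str   # drives master/slave actions and the UI
    unacked_state: str  # what the framework is told


class Master:
    def __init__(self):
        self.tasks = {}  # task id -> TaskInfo

    def reregister_slave(self, reports):
        # reports maps task id -> (latest state, unacked state). The slave
        # knows the former itself and gets the latter from the SUM.
        for task_id, (latest, unacked) in reports.items():
            self.tasks[task_id] = TaskInfo(latest, unacked)

    def status_update(self, task_id, latest, unacked):
        # The SUM relays both states with every update it sends.
        self.tasks[task_id] = TaskInfo(latest, unacked)

    def reconcile(self, task_id):
        # Always answer with the unacked state, irrespective of the latest
        # state, so the framework never sees updates out of order.
        return self.tasks[task_id].unacked_state

    def resources_can_be_freed(self, task_id):
        # Actions such as freeing resources depend on the latest state.
        return self.tasks[task_id].latest_state in TERMINAL


master = Master()
master.reregister_slave({"T": ("TASK_FINISHED", "TASK_RUNNING")})
print(master.reconcile("T"))
print(master.resources_can_be_freed("T"))
```

With the task from the ticket, reconciliation answers TASK_RUNNING (the update
the framework has not yet acked), while the master can already treat the task
as terminal for resource accounting.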
I believe this should solve both the issues raised in this ticket and the
linked ticket.
Thoughts?
> Reconciliation can send out-of-order updates.
> ---------------------------------------------
>
> Key: MESOS-1799
> URL: https://issues.apache.org/jira/browse/MESOS-1799
> Project: Mesos
> Issue Type: Bug
> Components: master, slave
> Reporter: Benjamin Mahler
> Assignee: Vinod Kone
>
> When a slave re-registers with the master, it currently sends the latest task
> state for all tasks that are not both terminal and acknowledged.
> However, reconciliation assumes that we always have the latest unacknowledged
> state of the task represented in the master.
> As a result, out-of-order updates are possible, e.g.
> (1) Slave has task T in TASK_FINISHED, with unacknowledged updates:
> [TASK_RUNNING, TASK_FINISHED].
> (2) Master fails over.
> (3) New master re-registers the slave with T in TASK_FINISHED.
> (4) Reconciliation request arrives, master sends TASK_FINISHED.
> (5) Slave sends TASK_RUNNING to master, master sends TASK_RUNNING.
> I think the fix here is to preserve the task state invariants in the master,
> namely, that the master has the latest unacknowledged state of the task. This
> means when the slave re-registers, it should instead send the latest
> acknowledged state of each task.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)