We want reconciliation to be a process that eventually terminates.

In <= 0.19.0, the following two cases are indistinguishable to the
scheduler, because in both of them the master sends no update:
  (1) No state difference.
  (2) Master temporarily cannot reply / dropped message.

As a result, a scheduler cannot determine when it is finished reconciling
(is my state correct? or was my message not processed?).


The way we want to steer frameworks to use reconciliation is as follows.

(1) You should only need to reconcile with the master after a
re-registration occurs (either master failed over, or framework failed
over). Some frameworks may want to be more defensive against both
themselves and against Mesos, and may reconcile on a periodic basis (e.g.
hourly or daily).

(2) Reconciliation is a process which terminates when an update has been
received for each task. Here is some pseudo-code that demonstrates how a
scheduler would implement reliable reconciliation:


    # Reconciles state against the master.
    # TODO: If you call this twice, it will start two concurrent
    # reconciliation cycles, instead of replacing the first with a new one.
    def reconcile():
      start_time = now()
      remaining_tasks = [all non-terminal tasks]
      # Implicit reconciliation, lets you discover unknown tasks.
      driver.reconcileTasks()
      delay(Seconds(30), _reconcile, start_time, remaining_tasks)

    def _reconcile(start_time, remaining_tasks):
      # Keep only the tasks with no update since reconciliation started.
      remaining_tasks = [t for t in remaining_tasks
                         if t.latest_update_time() < start_time]
      if remaining_tasks:
        driver.reconcileTasks(remaining_tasks)
        # TODO: Use backoff instead of a fixed delay.
        delay(Seconds(30), _reconcile, start_time, remaining_tasks)


The idea is that you keep reconciling a set of tasks until you have
received an update for each one, so the set converges to empty. You would
call reconcile() when a (re-)registration occurs.
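
If it helps, here is a rough, self-contained Python sketch of the same loop
that also addresses the two TODOs above (backoff, and a second reconcile()
call superseding the first). It is only an illustration under assumptions:
FakeTimer, FlakyDriver, Reconciler and simulate are hypothetical names, the
timer runs in simulated time, and the fake driver takes plain task IDs,
whereas the real driver takes TaskStatus objects.

```python
import heapq
import itertools


class FakeTimer:
    """Hypothetical stand-in for delay(): runs callbacks in simulated time."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._ids = itertools.count()  # tie-breaker so callbacks are never compared

    def delay(self, seconds, fn, *args):
        heapq.heappush(self._queue, (self.now + seconds, next(self._ids), fn, args))

    def run(self):
        while self._queue:
            self.now, _, fn, args = heapq.heappop(self._queue)
            fn(*args)


class Reconciler:
    """Retries explicit reconciliation until every task has a fresh update."""

    def __init__(self, driver, timer, initial_delay=30.0, max_delay=480.0):
        self.driver = driver
        self.timer = timer
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self.latest_update = {}  # task id -> time of the latest status update
        self.cycle = 0           # bumped by reconcile() to cancel older cycles
        self.finished = False

    def status_update(self, task_id):
        # Call this from the scheduler's status update callback.
        self.latest_update[task_id] = self.timer.now

    def reconcile(self, task_ids):
        self.cycle += 1
        self.finished = False
        start_time = self.timer.now
        self.driver.reconcileTasks([])  # implicit reconciliation
        self.timer.delay(self.initial_delay, self._retry, self.cycle,
                         start_time, list(task_ids), self.initial_delay)

    def _retry(self, cycle, start_time, remaining, delay):
        if cycle != self.cycle:
            return  # a newer reconcile() call superseded this cycle
        # Keep only the tasks with no update since this cycle started.
        remaining = [t for t in remaining
                     if self.latest_update.get(t, float("-inf")) < start_time]
        if not remaining:
            self.finished = True
            return
        self.driver.reconcileTasks(remaining)  # explicit reconciliation
        next_delay = min(delay * 2, self.max_delay)  # exponential backoff
        self.timer.delay(next_delay, self._retry, cycle, start_time,
                         remaining, next_delay)


class FlakyDriver:
    """Hypothetical driver: drops the first `drops` explicit requests."""

    def __init__(self, drops):
        self.drops = drops
        self.calls = []
        self.reconciler = None

    def reconcileTasks(self, task_ids):
        self.calls.append(list(task_ids))
        if not task_ids:
            return  # implicit request: nothing to answer in this sketch
        if self.drops > 0:
            self.drops -= 1  # simulate a dropped message
            return
        for task_id in task_ids:
            self.reconciler.status_update(task_id)


def simulate(drops, reconcile_calls=1):
    timer = FakeTimer()
    driver = FlakyDriver(drops)
    reconciler = Reconciler(driver, timer)
    driver.reconciler = reconciler
    for _ in range(reconcile_calls):
        reconciler.reconcile(["task-a", "task-b"])
    timer.run()
    return driver.calls, reconciler.finished, timer.now
```

In a real scheduler you would call status_update() from your statusUpdate
callback and reconcile() from registered/reregistered. The cycle counter is
just one simple way to make a second reconcile() call replace the first.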

This is the model you should use in 0.20.0. There are some edge cases that
we'll fix for 0.21.0, but you likely will not notice them:
https://issues.apache.org/jira/browse/MESOS-1407

Appreciate you starting this thread. Let me know if anything is not clear.

Ben

On Sun, Sep 28, 2014 at 11:58 AM, Whitney Sorenson <[email protected]>
wrote:

> I'm trying to understand the changes in
> https://issues.apache.org/jira/browse/MESOS-1453 and the SchedulerDriver
> JavaDoc.
>
> In the 0.19 behavior, it made sense to me that a framework would hold onto
> a copy of all the latest task statuses it knew about, and could poll
> reconcileTasks with these statuses in order to request delivery of any lost
> messages (covering the case of both the framework being absent for a while
> or just a general loss of messages.)
>
> Is the idea behind the changes in 0.20 that a framework now need only call
> reconcileTasks once after registering with a master? In that case, what is
> the use case for having the API still take a list of taskStatus objects -
> so frameworks can decide that they don't want to know about unknown tasks
> [1379]? If frameworks should still routinely ask for missing messages -
> then why bother sending all updates and causing the framework to have to
> handle the work of routinely ignoring duplicate status updates?
>
> Thanks,
>
> -Whitney
>
>
