Good to see you were playing around with reconciliation; we should have
made the current semantics clearer, especially since reconciliation is not
fully implemented until one uses a strict registrar (likely in 0.20.0).

Think of reconciliation as the fallback mechanism for ensuring that state
is consistent; it's not designed to inform you of things you were already
told (in this case, that the tasks were running). Although we could
consider sending updates even when task state remains the same.


For the purpose of this conversation, let's say we're in the 0.20.0 world,
operating with the registrar. And let's assume your goal is to build a
highly available framework (I will be documenting how to do this for
0.20.0):

(1) *When you receive a status update, you must persist this information
before returning from the statusUpdate() callback*. Once you return from
the callback, the driver will acknowledge the slave directly. Slaves will
retry status update delivery *until* the acknowledgement is received from
the scheduler driver in order to ensure that the framework processed the
update.
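To make (1) concrete, here is a minimal sketch in Python (the Python
scheduler bindings use the same statusUpdate() callback name). The
TaskStore class and the shape of the update are hypothetical stand-ins,
not Mesos API; the point is simply that persistence happens before the
callback returns, because returning is what triggers the acknowledgement:

```python
# Sketch of rule (1): persist each status update before statusUpdate()
# returns, so that the driver's acknowledgement implies the update is
# durable. TaskStore and the update dict are illustrative stand-ins.

class TaskStore:
    """Toy durable store; a real framework would write to disk or a DB."""
    def __init__(self):
        self.states = {}

    def persist(self, task_id, state):
        # Pretend this is a synchronous, durable (fsync'd) write.
        self.states[task_id] = state

class MyScheduler:
    def __init__(self, store):
        self.store = store

    def statusUpdate(self, driver, update):
        # Persist FIRST. Only after we return does the driver ack the
        # slave, and slaves retry delivery until that ack arrives.
        self.store.persist(update["task_id"], update["state"])
        # Returning from the callback == implicit acknowledgement.

store = TaskStore()
sched = MyScheduler(store)
sched.statusUpdate(None, {"task_id": "task-1", "state": "TASK_RUNNING"})
print(store.states["task-1"])  # TASK_RUNNING
```

If persisting fails, you should not return normally from the callback
(e.g. abort instead), since returning would acknowledge an update you
have not durably recorded.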

(2) *When you receive a "slave lost" signal, it means that your tasks that
were running on that slave are in state TASK_LOST*, and any reconciliation
you perform for these tasks will result in a reply of TASK_LOST. Most of
the time we'll deliver these TASK_LOST updates automatically, but with a
confluence of Master *and* Slave failovers, we are unaware of which tasks
were running on the slave, as we do not persist this information in the
Master.
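A sketch of how a framework might act on (2): when slaveLost() fires, mark
every non-terminal task believed to be on that slave as TASK_LOST in local
bookkeeping. The in-memory task map here is a hypothetical stand-in for
whatever persistent state your framework keeps:

```python
# Sketch of rule (2): on "slave lost", treat every active task on that
# slave as TASK_LOST locally, so you can react (e.g. reschedule) quickly
# instead of waiting for periodic reconciliation.

TERMINAL = ("TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST")

class MyScheduler:
    def __init__(self):
        # task_id -> (slave_id, state); a real framework persists this.
        self.tasks = {}

    def slaveLost(self, driver, slave_id):
        for task_id, (sid, state) in self.tasks.items():
            if sid == slave_id and state not in TERMINAL:
                self.tasks[task_id] = (sid, "TASK_LOST")
                # This is where you'd reschedule the task elsewhere.

sched = MyScheduler()
sched.tasks = {"t1": ("slave-A", "TASK_RUNNING"),
               "t2": ("slave-B", "TASK_RUNNING")}
sched.slaveLost(None, "slave-A")
print(sched.tasks["t1"][1])  # TASK_LOST
print(sched.tasks["t2"][1])  # TASK_RUNNING
```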

(3) To guarantee that you have a consistent view of task states, *you must
also periodically reconcile task state against the Master*. This is
necessary only because the delivery of the "slave lost" signal in (2) is
not reliable (the Master could failover after removing a slave but before
telling frameworks that the slave was lost).
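One reconciliation pass for (3) might look like the following sketch. The
real call is driver.reconcileTasks(statuses); FakeDriver and the dict-based
statuses are stand-ins for illustration, and a real framework would run
this on a timer (the interval is up to you):

```python
# Sketch of rule (3): one periodic pass that asks the master to reconcile
# every task the framework believes is still active. FakeDriver records
# the request in place of the real scheduler driver.

ACTIVE = ("TASK_STAGING", "TASK_STARTING", "TASK_RUNNING")

class FakeDriver:
    def __init__(self):
        self.reconcile_requests = []

    def reconcileTasks(self, statuses):
        self.reconcile_requests.append(statuses)

def reconcile_pass(driver, believed_states):
    """Reconcile all tasks we believe are active; terminal tasks need no
    reconciliation. Schedule this periodically, since "slave lost" can be
    dropped across a master failover."""
    statuses = [{"task_id": tid, "state": state}
                for tid, state in believed_states.items()
                if state in ACTIVE]
    driver.reconcileTasks(statuses)

driver = FakeDriver()
reconcile_pass(driver, {"t1": "TASK_RUNNING", "t2": "TASK_FINISHED"})
print(driver.reconcile_requests[0])  # only t1 is reconciled
```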

You'll notice that this model forces one to serially persist all status
update changes. We are planning to expose mechanisms to allow "batch"
acknowledgement of status updates in the lower-level API that benh has
given talks about. With a lower-level API, it is possible to build more
powerful libraries that hide many of these details!

You may also notice that only (1) and (3) are strictly required for
consistency; (2) is still highly recommended, as the vast majority of the
time the "slave lost" signal will be delivered and you can take action
quickly, without having to rely on periodic reconciliation.

Please let me know if anything here was not clear!


On Thu, Apr 17, 2014 at 1:47 PM, Sharma Podila <[email protected]> wrote:

> Should've looked at the code before sending the previous email...
>  master/main.cpp confirmed what I needed to know. It doesn't look like I
> will be able to use reconcileTasks the way I thought I could. Effectively,
> a lack of callback could either mean that the master agrees with the
> requested reconcile task state, or that the task and/or slave is currently
> unknown. Which makes it an unreliable source of data. I understand this is
> expected to improve later by leveraging the registrar, but, I suspect
> there's more to it.
>
> I take it then that individual frameworks need to have their own
> mechanisms to ascertain the state of their tasks.
>
>
> On Thu, Apr 17, 2014 at 12:53 PM, Sharma Podila <[email protected]> wrote:
>
>> Hello,
>>
>> I don't seem to have reconcileTasks() working for me and was wondering if
>> I am either using it incorrectly or hitting a problem. Here's what's
>> happening:
>>
>> 1. There's one Mesos (0.18) master, one slave, one framework, all running
>> on Ubuntu 12.04
>> 2. Mesos master and slave come up fine (using Zookeeper, but that isn't
>> relevant here, I'd think)
>> 3. My framework registers and gets offers
>> 4. Two tasks are launched, both start running fine on the single
>> available slave
>> 5. I restart my framework. During restart my framework knows that it had
>> previously launched two tasks that were last known to be in running state.
>> Therefore, upon getting the registered() callback, it calls
>> driver.reconcileTasks() for the two tasks. In actuality, the tasks are
>> still running fine. I see this in mesos master logs:
>>
>>     I0417 12:26:27.207361 27301 master.cpp:2154] Performing task state
>> reconciliation for framework MyFramework
>>
>> But, no other logs about reconciliation.
>>
>> 6. My framework gets no callback about status of tasks that it requested
>> reconciliation on.
>>
>> At this point, I am not sure if the lack of a callback for status update
>> is due to
>>   a) the fact that my framework asked for reconciliation on running
>> state, which Mesos also knows to be true, therefore, no status update
>>   b) Or, if the reconcile is not working. (hopefully this; reason (a)
>> would be problematic)
>>
>> So, I then proceed to another test:
>>
>> 7. kill my framework and mesos master
>> 8. Then, kill the slave (as an aside, this seems to have killed the tasks
>> as well)
>> 9. Restart mesos master
>> 10. Restart my framework. Now, again the reconciliation is requested.
>> 11. Still no callback.
>>
>> At this time, mesos master doesn't know about the slave because it hasn't
>> returned since master restarted.
>> What is the expected behavior for reconciliation under these
>> circumstances?
>>
>> 12. Restarted slave
>> 13. Killed and restarted my framework.
>> 14. Still no callback for reconciliation.
>>
>> Given these results, I can't see how reconciliation is working at all. I
>> did try this with Mesos 0.16 first and then upgraded to 0.18 to see if it
>> makes a difference.
>>
>> Thank you for any ideas on getting this resolved.
>>
>> Sharma
>>
>>
>
