Thanks for writing this up, Ben! I have a couple of suggestions about additional 
details that could be helpful to explain.

First, could you go into a little more depth about how this process works for 
terminated tasks? For example, how does reconciliation behave for tasks running 
on a slave that has become disconnected from the master? An overview of the 
various timeouts involved would also be really awesome.

Second, what happens when a framework attempts to reconcile a task that is 
completely unknown to Mesos? An example scenario could be that a task died, the 
terminal status update was ACKed, but the scheduler failed over before this 
information could be persisted. What task status (if any) does Mesos respond 
with?
--
Connor Doyle
http://mesosphere.io


On Oct 15, 2014, at 14:05, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:

> Hi all,
> 
> I've sent a review out for a document describing reconciliation, you can see 
> the draft here:
> https://gist.github.com/bmahler/18409fc4f052df43f403
> 
> Would love to gather high level feedback on it from framework developers. 
> Feel free to reply here, or on the review:
> https://reviews.apache.org/r/26669/
> 
> Thanks!
> Ben
