> On Nov. 4, 2015, 2:43 a.m., Ben Mahler wrote:
> > src/master/master.cpp, lines 6004-6013
> > <https://reviews.apache.org/r/39873/diff/2/?file=1114216#file1114216line6004>
> >
> >     Just looking at the patch that introduced this bug, why are we removing 
> > the out-of-order update prevention? Don't see any mention of this in the 
> > stuff around https://issues.apache.org/jira/browse/MESOS-2864.

There are couple reasons for this

1) it's hard to figure out if an update is an out of order update. for example, 
if TASK_STARTING, TASK_RUNNING and TASK_FINISHED are all enqued on the slave, 
the master might receive non-terminal (TASK_RUNNING) and terminal 
(TASK_FINISHED) after it transitioned the task to terminal state (TASK_STARTING 
update w/ latest state as TASK_FINISHED).

2) when updateTask() is called, typically an update is being forwarded to the 
scheduler. so while it might be ok to not transition the (latest) state of the 
task, it is important to set the task.status_update_state and 
task.status_update_uuid correctly. so we shouldn't short-circuit the function 
without setting these.

makes sense?


> On Nov. 4, 2015, 2:43 a.m., Ben Mahler wrote:
> > src/master/master.cpp, line 6039
> > <https://reviews.apache.org/r/39873/diff/2/?file=1114216#file1114216line6039>
> >
> >     Why not have the same logic here to be defensive? Or do you intend to 
> > guard against it? Just seems weird to allow it in one of the cases but not 
> > the other.

i didn't do it in the master's case because it would be a bug in the code if 
the master is trying to call updateTask() for an already terminated task! we 
have to do it for updates from slave because we don't control the executor's 
updates. i could make it a CHECK for the master case though.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39873/#review105027
-----------------------------------------------------------


On Nov. 2, 2015, 10:34 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39873/
> -----------------------------------------------------------
> 
> (Updated Nov. 2, 2015, 10:34 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-2864
>     https://issues.apache.org/jira/browse/MESOS-2864
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The master now doesn't change the latest state of a task if it has already 
> terminated. But it still updates the status update state and uuid.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 2bc5a97a5b50c8a8a9902c47b2e9e3b5216d97ea 
> 
> Diff: https://reviews.apache.org/r/39873/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> Ran the "RecoverCompletedExecutor" test 100 times as it was failing most of 
> the time without this fix.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

Reply via email to