> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > Can we add some tests around the new refactor (e.g., race conditions 
> > mentioned in the review)? That would make us confident that we are not 
> > regressing here, because this is a significant change.

Hey Vinod - thanks a lot for looking at this!

But I think we just missed each other; I uploaded a new patch before you got to 
publish your review.
I will post a new patch based on your recent comments and ping you :)


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.hpp, line 207
> > <https://reviews.apache.org/r/18403/diff/5/?file=522998#file522998line207>
> >
> >     Why the change in return type?

Changed in recent review.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.hpp, line 438
> > <https://reviews.apache.org/r/18403/diff/5/?file=522998#file522998line438>
> >
> >     s/finalize/launched/ ?
> >     
> >     Also add a comment on what this method does.

Changed to launchCompleted() in new patch. Will add a comment, thanks! :)


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.hpp, line 447
> > <https://reviews.apache.org/r/18403/diff/5/?file=522998#file522998line447>
> >
> >     There are several places where executor state is checked and an 
> > appropriate action is taken. You should update them all with this state. I 
> > am not convinced that adding a new state is useful (or more trouble than 
> > its worth) here considering we have a future that tells whether it is 
> > launching or not.

I am not opposed to strip that out again. I introduced it based on previous 
comments and on the fact, the executor need to be in a pre-registering state 
when the executor info is unknown. Otherwise, the launching state is kinda 
implicit (based on executor->info state).

I'll go ahead and take a stab at removing LAUNCHING.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.hpp, line 460
> > <https://reviews.apache.org/r/18403/diff/5/?file=522998#file522998line460>
> >
> >     +1 to Ian's comment.

Changed in recent patch.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 2159
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line2159>
> >
> >     Maybe we should call this executorLaunched?

Good point - will change in upcoming patch.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 871
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line871>
> >
> >     Sending 'true' here seems a bit hacky.

Changed in recent review.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 916
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line916>
> >
> >     Lets just pass executorId to this method instead of using this.

Great point - thanks!


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 3154
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line3154>
> >
> >     Why not just
> >     
> >     executor->future = ... ?

Changed in recent patch.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 2342-2346
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line2342>
> >
> >     Why are these CHECKs?

Paranoia I suppose :)


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 3576-3588
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line3576>
> >
> >     So the checkpointing of executor info is now being done after it is 
> > launched. So if a slave restarts before finalize() gets called there is no 
> > way to recover this info and inform the master. This is probably ok if the 
> > master stays up because the state will be reconciled when the slave 
> > re-registers. If the master also fails over then all bets are off and no 
> > one knows about the lost task/executor. This is unfortunate but I guess no 
> > different than if the slave restarted when the task was launched. Lets add 
> > a test for this.

You bet - sounds like a great idea. I'll work on it.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 2206
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line2206>
> >
> >     Is this only called here? If yes, lets not pull this out.

Sure can do.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1546-1547
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line1546>
> >
> >     ditto. Why is this a check?

Paranoia :)


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.hpp, line 440
> > <https://reviews.apache.org/r/18403/diff/5/?file=522998#file522998line440>
> >
> >     Why do we need this?
> >     
> >     Why not just set Executor::id appropriately in the constructor?
> >

Good point!


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, line 785
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line785>
> >
> >     This should be moved to __runTask() to ensure framework doesn't get 
> > removed before __runTask() gets called.

Ah - good catch. Thanks!


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 860-871
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line860>
> >
> >     Why not just always do onAny(_runTask) irrespective of the state?
> >     
> >     More importantly the current way it is done could lead to running tasks 
> > out of order. For example, if the first task is defered due to a launch, 
> > and the future is set while a second task is in the master's queue then the 
> > second task will be launched before the first.

Good catch. Will change in upcoming patch.


> On March 19, 2014, 12:45 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 1055-1065
> > <https://reviews.apache.org/r/18403/diff/5/?file=522999#file522999line1055>
> >
> >     ditto. always enqueue on onAny.


- Niklas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18403/#review37753
-----------------------------------------------------------


On March 19, 2014, 12:46 p.m., Niklas Nielsen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18403/
> -----------------------------------------------------------
> 
> (Updated March 19, 2014, 12:46 p.m.)
> 
> 
> Review request for mesos, Ian Downes and Vinod Kone.
> 
> 
> Bugs: MESOS-922
>     https://issues.apache.org/jira/browse/MESOS-922
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This patch delegates the choice of executor to the containerizer by removing 
> executorInfo dependencies up until Containerizer::launch().
> Containerizer::launch() now returns a future to the executor info that is 
> being run and the slave creates the corresponding executor structure when 
> launch completes.
> This means message handling from the running executor to the slave in the 
> interim where the executor structure has not created, need to be enqueued 
> until executor is ready. So far, registerExecutor() and reregisterExecutor() 
> has been split into two continuations to deal with this issue.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/containerizer.hpp d9ae326 
>   src/slave/containerizer/mesos_containerizer.hpp ee1fd30 
>   src/slave/containerizer/mesos_containerizer.cpp c819c97 
>   src/slave/http.cpp 594032d 
>   src/slave/slave.hpp 01b80df 
>   src/slave/slave.cpp d8d3e0f 
>   src/tests/containerizer.hpp 5686398 
>   src/tests/containerizer.cpp bfb9341 
> 
> Diff: https://reviews.apache.org/r/18403/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>

Reply via email to