> On Jan. 2, 2015, 10:23 p.m., Ben Mahler wrote:
> > What happened previously if a launch takes forever?
> > 
> > What happens now if a launch takes forever?
> 
> Timothy Chen wrote:
>     Ben that's a good point, previously the registration timeout gates the 
> launch not to take forever, however it is too coarse of a timeout as it 
> included the time to launch and also the time it take for the executor to 
> register after launch.
>     
>     Now it will only time out on the latter case, I think we do need a 
> seperate timeout that checks for the launch itself. Ben what do you think?
> 
> Nishant Suneja wrote:
>     I concur with Timothy. I think having a separate timeout to track the 
> container launch itself makes sense.
> 
> Timothy Chen wrote:
>     Hi Nishant, can you also add another timer for gating the launch? Once we 
> have both PRs then I think we are in a better state to merge both.
> 
> Tom Arnfeld wrote:
>     Just caught wind of this change and it's good to see! Going to be very 
> useful over here. Can I request that there's also a command line flag for the 
> slave to change the launch timeout? :-)
> 
> Nishant Suneja wrote:
>     @Timothy: For launch timeout, if the launch doesnt happen within the 
> specified timeout duration, what action would we want to take ? We could 
> possibly use the same callback "registerExecutorTimeout", which effectively 
> moves the executor to TERMINATING state (for gc later on) and calls the 
> containerizer destroy for that container.
>     
>     Also, what default value would we want for this timeout ?
>     
>     
>     @Tom: yeah..I think its a good idea. I can add that.

I think we want to perform the same action (registerExecutorTimeout) actually. 
If it's going to be shared, I think we should rename so it's not executor 
registration specific.
The other part is to make sure the timeout is cancelled when launch is finished 
before issuing the next delay call.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29437/#review66557
-----------------------------------------------------------


On Dec. 31, 2014, 11:57 p.m., Nishant Suneja wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29437/
> -----------------------------------------------------------
> 
> (Updated Dec. 31, 2014, 11:57 p.m.)
> 
> 
> Review request for mesos and Timothy Chen.
> 
> 
> Bugs: MESOS-999
>     https://issues.apache.org/jira/browse/MESOS-999
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As part of this bug fix, I have trigerred the executor registration timeout 
> timer after the container's future object is set, instead of starting the 
> timer when the container launch is still pending
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 50b57819b55bdcdb9f49f20648199badc4d3f37b 
>   src/tests/composing_containerizer_tests.cpp 
> 5ab5a36cadb7f8622bad0c5814e9a5fb338753ad 
>   src/tests/containerizer.hpp 24b014f44d9eec56840e18cf39fbf9100f2c0711 
>   src/tests/slave_tests.cpp f2896a1fc4521452e29fd261a6f117372345dcfc 
> 
> Diff: https://reviews.apache.org/r/29437/diff/
> 
> 
> Testing
> -------
> 
> Added the unit test : SlaveTest::ExecutorRegistrationTimeoutTrigger
> make check succeeds.
> 
> 
> Thanks,
> 
> Nishant Suneja
> 
>

Reply via email to