[
https://issues.apache.org/jira/browse/MESOS-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312732#comment-14312732
]
Ian Downes commented on MESOS-1251:
-----------------------------------
This is still an issue. Specifically, the executor registration timeout (at
least) starts when the launch is initiated, not when the launch succeeds. This
incorrectly causes timeouts if, for example, fetching the executor or container
image exceeds the timeout, or leaves insufficient time for the executor to
start and register.
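
To make the difference concrete, here is a minimal sketch in C++ using
simulated time. It is not Mesos code: launch(), registersInTime() and
TimerStart are hypothetical names chosen for illustration, and the callback
merely stands in for the launch Future becoming ready. All times are relative
to the moment the launch is initiated.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

using std::chrono::milliseconds;

// When the executor registration timer is started (hypothetical policy switch).
enum class TimerStart { AtLaunchInitiated, AtLaunchSucceeded };

// Hypothetical stand-in for a containerizer launch. It "fetches" for
// fetchDuration and then invokes onReady with the simulated time at which
// the launch succeeded.
void launch(milliseconds fetchDuration,
            const std::function<void(milliseconds)>& onReady) {
  onReady(fetchDuration);
}

// Returns true if the executor registers before the registration timeout
// expires under the given timer policy.
bool registersInTime(milliseconds fetchDuration,
                     milliseconds startupDuration,
                     milliseconds timeout,
                     TimerStart policy) {
  assert(fetchDuration >= milliseconds(0));
  assert(startupDuration >= milliseconds(0));

  milliseconds timerStartedAt(0);  // The launch is initiated at time 0.
  milliseconds registeredAt(0);

  launch(fetchDuration, [&](milliseconds launchedAt) {
    // Only this policy defers the timer until the launch has succeeded.
    if (policy == TimerStart::AtLaunchSucceeded) {
      timerStartedAt = launchedAt;
    }
    // The executor can only start and register once the launch is done.
    registeredAt = launchedAt + startupDuration;
  });

  return registeredAt - timerStartedAt <= timeout;
}
```

With an assumed 60 second timeout, a 58 second fetch and a 5 second executor
startup, registersInTime() returns false for TimerStart::AtLaunchInitiated and
true for TimerStart::AtLaunchSucceeded.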
> Slave should make sure that the containerizer::launch returned Future is ready
> ------------------------------------------------------------------------------
>
> Key: MESOS-1251
> URL: https://issues.apache.org/jira/browse/MESOS-1251
> Project: Mesos
> Issue Type: Improvement
> Reporter: Till Toenshoff
> Assignee: Ian Downes
> Priority: Minor
> Labels: concurrency, containerizer, order, slave, twitter
>
> Currently the slave does not await the {{Future<Nothing>}} returned by
> {{Containerizer::Launch}} before sending out more command events.
> Is there a reason for this behavior?
> This issue becomes apparent only with a launch command implementation
> that is relatively "expensive".
> So what I can see here is the following chain of events along a vertical time
> axis:
> {noformat}
> Launch
> |
> | Wait
> | |
> | | Update
> | | |
> ----------Launch Future<Nothing> became ready
> {noformat}
> What I would like to see is:
> {noformat}
> Launch
> |
> |
> |
> |
> |
> ----------Launch Future<Nothing> became ready
> Wait
> |
> | Update
> | |
> {noformat}
> As we are currently pushing the handling of the former behavior into the
> implementation of the containerizer, things quickly get rather complicated
> on that side. Hence I would like to understand whether that is something we
> really want or need, or whether we should fix this within the slave in the
> longer run.
> So far, I have only observed this to be a challenge for {{Launch}}, but it
> may be worth considering enforced chaining instead of concurrent invocations
> for other events as well.