[ 
https://issues.apache.org/jira/browse/MESOS-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919879#comment-16919879
 ] 

Greg Mann commented on MESOS-9957:
----------------------------------

One possible approach is to eliminate the per-executor 
{{Sequence}}s in the {{taskLaunchSequences}} map and instead add a single 
{{Sequence operationSequence}} member to the {{Framework}} struct. The 
{{taskLaunch}} futures from the {{run()}} code path could likely be added to 
that sequence as-is, and the {{applyOperation()}} code path would add its own 
futures to the same sequence.
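
To illustrate the ordering guarantee a single per-framework sequence would 
give, here is a minimal standalone sketch. It is not the agent code: 
{{OperationSequence}} is a hypothetical stand-in built on {{std::async}} rather 
than the libprocess {{Sequence}}, the {{Framework}} struct is reduced to the one 
proposed member, and {{runExample()}} is an invented helper. The RESERVE step is 
deliberately slow to mimic the new asynchronous GC-unschedule step.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a sequence: each added step starts only after
// the previously added step has finished.
class OperationSequence
{
public:
  std::shared_future<void> add(std::function<void()> step)
  {
    assert(step != nullptr);

    std::lock_guard<std::mutex> lock(mutex_);
    std::shared_future<void> previous = last_;

    last_ = std::async(
        std::launch::async,
        [previous, step]() {
          // Wait for the step that was added before this one, if any.
          if (previous.valid()) {
            previous.wait();
          }
          step();
        }).share();

    return last_;
  }

private:
  std::mutex mutex_;
  std::shared_future<void> last_;
};

// Heavily simplified, hypothetical framework struct holding the single
// per-framework sequence proposed above.
struct Framework
{
  OperationSequence operationSequence;
};

// Adds a slow RESERVE followed by a fast LAUNCH and returns the order in
// which the two steps actually ran.
std::vector<std::string> runExample()
{
  Framework framework;
  std::vector<std::string> log;
  std::mutex logMutex;

  // Stands in for the applyOperation() path, including its async delay.
  framework.operationSequence.add([&]() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    std::lock_guard<std::mutex> lock(logMutex);
    log.push_back("RESERVE");
  });

  // Stands in for the run() path launching the dependent task.
  std::shared_future<void> launch = framework.operationSequence.add([&]() {
    std::lock_guard<std::mutex> lock(logMutex);
    log.push_back("LAUNCH");
  });

  launch.wait();
  return log;
}
```

Calling {{runExample()}} from a {{main()}} yields RESERVE followed by LAUNCH 
even though the LAUNCH step itself is faster, because both steps pass through 
the same sequence. With separate per-executor sequences, nothing would tie the 
launch to the preceding operation.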

> Sequence all operations on the agent
> ------------------------------------
>
>                 Key: MESOS-9957
>                 URL: https://issues.apache.org/jira/browse/MESOS-9957
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Greg Mann
>            Priority: Major
>              Labels: foundations, mesosphere
>
> The resolution of MESOS-8582 requires that an asynchronous step be added to 
> the code path which applies speculative operations like RESERVE and CREATE on 
> the agent. In order to ensure that the {{FrameworkInfo}} associated with an 
> incoming operation will be successfully retained, we must first unschedule GC 
> on the framework meta directory if the framework struct does not exist but 
> that directory does. By introducing this asynchronous step, we allow the 
> possibility that an operation may be executed out-of-order with respect to an 
> incoming dependent LAUNCH or LAUNCH_GROUP.
> For example, if a scheduler issues an ACCEPT call containing both a RESERVE 
> operation and a LAUNCH operation whose task consumes the newly reserved 
> resources, it's possible that this task will be launched on the 
> agent before the reserved resources exist.
> While we already [sequence task launches on a per-executor 
> basis|https://github.com/apache/mesos/blob/9297e2d3b0d44b553fc89bcf5f6109c76cc53668/src/slave/slave.cpp#L2337-L2408],
>  the aforementioned corner case requires that we sequence _all_ offer 
> operations on a per-framework basis.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
