Alexander Rukletsov created MESOS-9277:
------------------------------------------

             Summary: UNRESERVE scheduler call can be dropped if it loses the race 
with TEARDOWN. 
                 Key: MESOS-9277
                 URL: https://issues.apache.org/jira/browse/MESOS-9277
             Project: Mesos
          Issue Type: Bug
          Components: scheduler api
    Affects Versions: 1.7.0, 1.6.1, 1.5.1
            Reporter: Alexander Rukletsov


A typical usage pattern for a framework scheduler is to remove its reservations 
before tearing itself down. However, this pattern is racy: {{UNRESERVE}} is a 
multi-stage action which aborts if the framework is removed between stages.

*Solution 1*
Let schedulers use operation feedback and expect them to wait for an ack for 
{{UNRESERVE}} before they send {{TEARDOWN}}. This is kind of science fiction, 
with a timeline of {{O(months)}}, and the race is still possible if a scheduler 
does not comply.
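
A minimal sketch of the scheduler-side discipline that Solution 1 expects, written in plain C++ with no Mesos dependencies. The {{TeardownGate}} class, its method names, and the string operation ids are hypothetical and not part of the Mesos API; the sketch assumes the scheduler tags each {{UNRESERVE}} with an id and is told when that operation reaches a terminal status.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper, not a Mesos class: holds back TEARDOWN until every
// UNRESERVE the scheduler has sent has received terminal feedback.
class TeardownGate {
public:
  // Call when an UNRESERVE tagged with `operationId` is sent.
  void unreserveSent(const std::string& operationId) {
    outstanding_.insert(operationId);
  }

  // Call when terminal feedback for `operationId` arrives.
  void terminalStatusReceived(const std::string& operationId) {
    outstanding_.erase(operationId);
  }

  // Call when the scheduler decides to shut down.
  void requestTeardown() { teardownRequested_ = true; }

  // True only if teardown was requested and no UNRESERVE awaits feedback.
  bool readyToSendTeardown() const {
    return teardownRequested_ && outstanding_.empty();
  }

private:
  std::set<std::string> outstanding_;
  bool teardownRequested_ = false;
};

// Example flow: TEARDOWN is held back until feedback for "op-1" arrives.
void exampleTeardownFlow() {
  TeardownGate gate;
  gate.unreserveSent("op-1");
  gate.requestTeardown();
  assert(!gate.readyToSendTeardown());
  gate.terminalStatusReceived("op-1");
  assert(gate.readyToSendTeardown());
}
```

The gate only helps schedulers that adopt it, which is the compliance gap noted above.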

*Solution 2*
Serialize calls for schedulers. For example, we can chain [handlers 
here|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/src/master/http.cpp#L640-L711]
 onto per-{{Master::Framework}} 
[{{process::Sequence}}|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/3rdparty/libprocess/include/process/sequence.hpp].
 For that, however, handlers must provide futures indicating when the processing 
of the call is finished. Note that most [handlers 
here|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/src/master/http.cpp#L640-L711]
 return void.
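
To illustrate why completion signals matter for Solution 2, here is a toy model in plain C++. It does not use libprocess or any master code: {{CallSequence}} is a hypothetical stand-in for a per-framework {{process::Sequence}}, completion callbacks stand in for futures, and {{ToyMaster}} is an invented simplification in which {{UNRESERVE}} has a deferred second stage.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a per-framework process::Sequence:
// a queued call starts only after the previous one reports completion.
class CallSequence {
public:
  using Done = std::function<void()>;
  using Call = std::function<void(Done)>;

  void add(Call call) {
    pending_.push_back(std::move(call));
    if (!running_) startNext();
  }

private:
  void startNext() {
    if (pending_.empty()) { running_ = false; return; }
    running_ = true;
    Call call = std::move(pending_.front());
    pending_.pop_front();
    call([this]() { startNext(); });  // Completion triggers the next call.
  }

  std::deque<Call> pending_;
  bool running_ = false;
};

// Invented toy model of the master; not actual Mesos code.
struct ToyMaster {
  bool frameworkPresent = true;
  bool reserved = true;
  std::deque<std::function<void()>> deferred;  // Simulated async stages.

  // Stage 1 runs now; stage 2 runs later and aborts if the framework is gone.
  void unreserve(std::function<void()> done) {
    deferred.push_back([this, done]() {
      if (frameworkPresent) reserved = false;
      done();
    });
  }

  void teardown(std::function<void()> done) {
    frameworkPresent = false;
    done();
  }

  // Run all deferred stages, including any queued while draining.
  void drain() {
    while (!deferred.empty()) {
      std::function<void()> stage = std::move(deferred.front());
      deferred.pop_front();
      stage();
    }
  }
};

// Returns true if the reservation is still present after both calls.
bool reservationLeaks(bool serialize) {
  ToyMaster master;
  CallSequence sequence;
  if (serialize) {
    sequence.add([&](CallSequence::Done done) { master.unreserve(done); });
    sequence.add([&](CallSequence::Done done) { master.teardown(done); });
  } else {
    master.unreserve([] {});
    master.teardown([] {});
  }
  master.drain();
  return master.reserved;
}

// Without serialization the UNRESERVE is dropped; with it, it is applied.
void checkToyModel() {
  assert(reservationLeaks(false));
  assert(!reservationLeaks(true));
}
```

In this model the sequence can only delay {{TEARDOWN}} correctly because {{unreserve}} reports when its last stage has run; a handler returning void gives the sequence nothing to wait on.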



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
