Alexander Rukletsov created MESOS-9277:
------------------------------------------
Summary: UNRESERVE scheduler call can be dropped if it loses the race with TEARDOWN.
Key: MESOS-9277
URL: https://issues.apache.org/jira/browse/MESOS-9277
Project: Mesos
Issue Type: Bug
Components: scheduler api
Affects Versions: 1.7.0, 1.6.1, 1.5.1
Reporter: Alexander Rukletsov

A typical use pattern for a framework scheduler is to remove its reservations before tearing itself down. However, this pattern is racy: {{UNRESERVE}} is a multi-stage action that aborts if the framework is removed between stages.

*Solution 1*
Let schedulers use operation feedback and expect them to wait for an ack for {{UNRESERVE}} before they send {{TEARDOWN}}. This is kind of science fiction, with a timeline of {{O(months)}}, and the race remains possible if a scheduler does not comply.

*Solution 2*
Serialize calls for schedulers. For example, we can chain [handlers here|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/src/master/http.cpp#L640-L711] onto a per-{{Master::Framework}} [{{process::Sequence}}|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/3rdparty/libprocess/include/process/sequence.hpp]. For that, however, handlers must provide futures indicating when the processing of the call is finished; note that most [handlers here|https://github.com/apache/mesos/blob/6e21e94ddca5b776d44636fe3eba8500bf88dc25/src/master/http.cpp#L640-L711] return void.
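
To illustrate the idea behind Solution 2, here is a minimal, self-contained sketch of per-framework call serialization. It does not use libprocess or any Mesos code: {{CallSequence}} and {{runDemo}} are hypothetical names, and {{std::future}} stands in for libprocess futures. The only assumption carried over from the description is that each handler returns a future that becomes ready when processing of the call is finished, so the next call is not started before then.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-framework sequence (not process::Sequence):
// handlers run one at a time, and the next handler is not started until the
// future returned by the previous handler is ready.
class CallSequence {
public:
  using Handler = std::function<std::future<void>()>;

  CallSequence() : worker_([this] { run(); }) {}

  ~CallSequence() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();  // Drains any queued handlers before returning.
  }

  // Enqueues a handler. The returned future becomes ready once the handler's
  // own future has completed, i.e. the call has been fully processed.
  std::future<void> add(Handler handler) {
    auto finished = std::make_shared<std::promise<void>>();
    std::future<void> result = finished->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push([handler, finished] {
        try {
          handler().get();  // Wait until processing of this call is finished.
          finished->set_value();
        } catch (...) {
          finished->set_exception(std::current_exception());
        }
      });
    }
    cv_.notify_one();
    return result;
  }

private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
        if (queue_.empty()) return;  // Shut down only when nothing is left.
        task = std::move(queue_.front());
        queue_.pop();
      }
      task();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool done_ = false;
  std::thread worker_;  // Declared last so the other members exist first.
};

// Simulates a scheduler sending UNRESERVE immediately followed by TEARDOWN
// and returns the order in which the processing steps were observed.
std::vector<std::string> runDemo() {
  std::mutex logMutex;
  std::vector<std::string> log;
  auto record = [&](const std::string& entry) {
    std::lock_guard<std::mutex> lock(logMutex);
    log.push_back(entry);
  };
  {
    CallSequence sequence;  // One sequence per framework.
    // Simulated multi-stage UNRESERVE: its second stage completes later.
    sequence.add([&] {
      record("UNRESERVE started");
      return std::async(std::launch::async, [&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        record("UNRESERVE finished");
      });
    });
    // TEARDOWN is submitted right away but runs only after UNRESERVE is done.
    sequence.add([&] {
      record("TEARDOWN");
      std::promise<void> ready;
      ready.set_value();
      return ready.get_future();
    }).get();
  }
  return log;
}
```

Calling {{runDemo()}} returns the entries "UNRESERVE started", "UNRESERVE finished", "TEARDOWN" in that order, even though {{TEARDOWN}} is submitted while {{UNRESERVE}} is still in flight. The sketch also shows why handlers returning void are a problem: without a future there is nothing for the sequence to wait on before starting the next call.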