Meng Zhu created MESOS-9777:
-------------------------------

             Summary: Consider doing an internal retry if reservation and etc. operations fail due to 409 conflict.
                 Key: MESOS-9777
                 URL: https://issues.apache.org/jira/browse/MESOS-9777
             Project: Mesos
          Issue Type: Improvement
          Components: master
            Reporter: Meng Zhu

A reservation request may return 409 Conflict:
https://github.com/apache/mesos/blob/261d6ef497383795557aaca5dce426b4482eabea/src/master/http.cpp#L4026

This is due to the inherent race between the master and the allocator actors, as illustrated here:
https://github.com/apache/mesos/blob/261d6ef497383795557aaca5dce426b4482eabea/src/master/allocator/mesos/hierarchical.cpp#L992-L1008

This is not ideal, though it should be rare. However, the error is hard for users to understand. It would be beneficial for Mesos to retry the reservation operation internally on the user's behalf.
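
To make the proposal concrete, here is a minimal sketch of a bounded retry on 409 Conflict. It is written in Python for brevity and is not Mesos code: `retry_on_conflict`, `FlakyReserve`, the attempt limit, and the backoff are all hypothetical names and values chosen for illustration, and the 202 success status is an assumption about what a successful reservation returns. A real fix would live in the master and would have to decide how many attempts are safe.

```python
import time
from typing import Callable

HTTP_ACCEPTED = 202  # assumed status for a successful reservation
HTTP_CONFLICT = 409  # status returned when the operation loses the race


def retry_on_conflict(attempt: Callable[[], int],
                      max_attempts: int = 3,
                      backoff_secs: float = 0.0) -> int:
    """Call `attempt` until it returns a non-409 status or attempts run out.

    Returns the last status seen, so the caller still gets 409 if every
    attempt conflicted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status = HTTP_CONFLICT
    for i in range(max_attempts):
        status = attempt()
        if status != HTTP_CONFLICT:
            return status
        # Sleep only between attempts, doubling the delay each time.
        if i + 1 < max_attempts:
            time.sleep(backoff_secs * (2 ** i))
    return status


class FlakyReserve:
    """Hypothetical stand-in for a reservation that conflicts N times."""

    def __init__(self, conflicts: int) -> None:
        self.conflicts = conflicts
        self.calls = 0

    def __call__(self) -> int:
        self.calls += 1
        if self.calls <= self.conflicts:
            return HTTP_CONFLICT
        return HTTP_ACCEPTED


if __name__ == "__main__":
    op = FlakyReserve(conflicts=2)
    print(retry_on_conflict(op, max_attempts=3), op.calls)
```

The attempt limit matters: a 409 can also mean the resources are genuinely unavailable, in which case retrying never succeeds and the conflict should still be surfaced to the user.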