Anindya Sinha created MESOS-6444:
------------------------------------

             Summary: Ensure single copy of shared count of total resources in role sorter.
                 Key: MESOS-6444
                 URL: https://issues.apache.org/jira/browse/MESOS-6444
             Project: Mesos
          Issue Type: Bug
          Components: general
            Reporter: Anindya Sinha
            Assignee: Anindya Sinha


We maintain a single copy of each shared resource in the role and quota 
sorters' total resources. So, when we update these totals, we need to ensure 
that we only count a single copy, even though the framework sorter may return 
multiple copies of a shared resource.

Otherwise, the following check in void DRFSorter::remove(const SlaveID& slaveId, 
const Resources& resources) may fail:
    CHECK(total_.resources[slaveId].contains(resources));

Two scenarios where this can happen:
(1) A framework does a RESERVE, a CREATE of a shared volume, and a LAUNCH of a 
task using that shared volume in a single ACCEPT. On a subsequent offer, it does 
another set of RESERVE, CREATE and LAUNCH, which hits this condition.

(2) Say we have a framework of a certain role which has been offered 2 
persistent volumes: PV1 (a regular persistent volume) and PV2 (a shared 
persistent volume).
The framework launches a long-lived task using PV2.
The framework launches a short-lived task using PV1.
The task using PV1 terminates, and the framework then issues a DESTROY on 
PV1 => fails.
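
The invariant and the failing check described above can be illustrated with a 
toy model. This is only a sketch under stated assumptions, not the Mesos 
implementation: ToyResource, ToyTotal and dedupeShared are hypothetical names 
invented for this example, and resources are reduced to an identifier plus a 
"shared" flag. The total keeps one copy of each shared resource, so removing a 
set that carries several copies of the same shared resource trips the 
contains() check unless the copies are collapsed first.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource: an identifier plus a "shared" flag.
struct ToyResource {
  std::string id;
  bool shared;
};

// Hypothetical stand-in for a sorter's per-agent total (id -> copy count).
class ToyTotal {
public:
  // Adds resources, keeping at most one copy of each shared resource.
  void add(const std::vector<ToyResource>& resources) {
    for (const ToyResource& r : resources) {
      if (r.shared && counts_.count(r.id) > 0) {
        continue; // A copy of this shared resource is already counted.
      }
      ++counts_[r.id];
    }
  }

  // Returns true only if every requested copy is present in the total.
  bool contains(const std::vector<ToyResource>& resources) const {
    std::map<std::string, int> needed;
    for (const ToyResource& r : resources) {
      ++needed[r.id];
    }
    for (const auto& entry : needed) {
      auto it = counts_.find(entry.first);
      if (it == counts_.end() || it->second < entry.second) {
        return false;
      }
    }
    return true;
  }

  // Mirrors the shape of the check quoted above: removal requires containment.
  void remove(const std::vector<ToyResource>& resources) {
    assert(contains(resources));
    for (const ToyResource& r : resources) {
      auto it = counts_.find(r.id);
      if (--(it->second) == 0) {
        counts_.erase(it);
      }
    }
  }

private:
  std::map<std::string, int> counts_;
};

// Collapses multiple copies of the same shared resource into a single copy.
std::vector<ToyResource> dedupeShared(const std::vector<ToyResource>& resources) {
  std::set<std::string> seen;
  std::vector<ToyResource> result;
  for (const ToyResource& r : resources) {
    if (r.shared && !seen.insert(r.id).second) {
      continue; // Skip additional copies of an already seen shared resource.
    }
    result.push_back(r);
  }
  return result;
}
```

In this model, a total built from two copies of a shared volume holds only one 
copy, so contains() rejects a request naming both copies but accepts the same 
request after dedupeShared(). Calling remove() with the deduplicated set then 
passes the assertion.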



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
