Anindya Sinha created MESOS-6444:
------------------------------------

             Summary: Ensure single copy of shared count of total resources in role sorter.
                 Key: MESOS-6444
                 URL: https://issues.apache.org/jira/browse/MESOS-6444
             Project: Mesos
          Issue Type: Bug
          Components: general
            Reporter: Anindya Sinha
            Assignee: Anindya Sinha

We maintain a single copy of each shared resource in the total resources of the role and quota sorters. So, when we update these resources, we need to ensure that we only count a single copy, even though multiple copies of a shared resource may be returned to the framework sorter. If we do not, we may fail the following check in void DRFSorter::remove(const SlaveID& slaveId, const Resources& resources):

    CHECK(total_.resources[slaveId].contains(resources));

There are 2 scenarios where this can happen:

(1) A framework does a RESERVE, a CREATE of a shared volume, and a LAUNCH of a task using that shared volume in a single ACCEPT. On a subsequent offer, it does another set of RESERVE, CREATE and LAUNCH, which hits this condition.

(2) Say we have a framework of a certain role which has been offered 2 persistent volumes: PV1 (a regular persistent volume) and PV2 (a shared persistent volume). Launch a long-lived task using PV2. Launch a short-lived task using PV1. Once the short-lived task using PV1 terminates, issue a DESTROY on PV1 => fails.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
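
The single-copy invariant described in this issue can be illustrated with a small, self-contained sketch. This is not Mesos code and not the actual fix: ToyTotal and ToySharedTracker are hypothetical names, resources are reduced to string ids, and the CHECK is modelled as a boolean return value. The sketch only shows why removing a shared resource once per copy fails against a total that holds one copy, and one possible way a caller could guard against it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a sorter's per-agent total; not Mesos code.
// A shared resource is stored once, no matter how many times it is added.
class ToyTotal {
public:
  void add(const std::string& id, bool shared) {
    int& copies = copies_[id];
    if (shared && copies > 0) {
      return;  // Already counted: keep a single copy of a shared resource.
    }
    ++copies;
  }

  bool contains(const std::string& id) const {
    auto it = copies_.find(id);
    return it != copies_.end() && it->second > 0;
  }

  // Returns false in the situation where the real sorter would fail its CHECK.
  bool remove(const std::string& id) {
    if (!contains(id)) {
      return false;
    }
    if (--copies_[id] == 0) {
      copies_.erase(id);
    }
    return true;
  }

private:
  std::map<std::string, int> copies_;
};

// Hypothetical caller-side guard (an assumption, not the Mesos patch):
// track outstanding copies of a shared resource and touch the total only
// on the first allocation and on the last release.
class ToySharedTracker {
public:
  explicit ToySharedTracker(ToyTotal* total) : total_(total) {}

  void allocate(const std::string& id) {
    if (outstanding_[id]++ == 0) {
      total_->add(id, true);
    }
  }

  void release(const std::string& id) {
    assert(outstanding_[id] > 0);
    if (--outstanding_[id] == 0) {
      outstanding_.erase(id);
      bool removed = total_->remove(id);
      assert(removed);
      (void)removed;  // Avoid an unused-variable warning under NDEBUG.
    }
  }

private:
  ToyTotal* total_;
  std::map<std::string, int> outstanding_;
};
```

Adding the same shared id to a ToyTotal twice and then calling remove() twice makes the second call return false, which corresponds to the failing CHECK above. Routing the same two allocations through ToySharedTracker removes the resource from the total only when the last copy is released.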