[ https://issues.apache.org/jira/browse/MESOS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neil Conway updated MESOS-5698:
-------------------------------
    Description:
Consider this sequence of events:

1. Slave connects, with 128MB of disk.
2. Master offers resources at slave to framework.
3. Framework creates a dynamic reservation for 1MB and a persistent volume of the same size on the slave's resources.
   => This invokes {{Master::apply}}, which invokes {{allocator->updateAllocation}}, which invokes {{Sorter::update()}} on the framework sorter and role sorter. If the framework's role has a configured quota, it also invokes {{update}} on the quota role sorter -- in this case, the framework's role has no quota, so the quota role sorter is *not* updated.
   => {{DRFSorter::update}} updates the *total* resources at a given slave, among other state. New total resources will be 127MB of unreserved disk and 1MB of reserved disk with a volume. Note that the quota role sorter still thinks the slave has 128MB of unreserved disk.
4. The slave is removed from the cluster. {{HierarchicalAllocatorProcess::removeSlave}} invokes:
{code}
roleSorter->remove(slaveId, slaves[slaveId].total);
quotaRoleSorter->remove(slaveId, slaves[slaveId].total.nonRevocable());
{code}
{{slaves\[slaveId\].total.nonRevocable()}} is 127MB of unreserved disk and 1MB of reserved disk with a volume. When we remove this from the quota role sorter, we're left with total resources on the removed slave of 1MB of unreserved disk, since that is the result of subtracting <127MB unreserved, 1MB reserved+volume> from <128MB unreserved>.

The implications of this can't be good: at minimum, we're leaking resources for removed slaves in the quota role sorter. We're also introducing an inconsistency between {{total_.resources\[slaveId\]}} and {{total_.scalarQuantities}}, since the latter has already stripped out volume/reservation information.

  was:
Consider this sequence of events:

1. Slave connects, with 128MB of disk.
2. Master offers resources at slave to framework.
3. Framework creates a dynamic reservation for 1MB and a persistent volume of the same size on the slave's resources.
   => This invokes {{Master::apply}}, which invokes {{allocator->updateAllocation}}, which invokes {{Sorter::update()}} on the framework sorter and role sorter. If the role has a configured quota, it also invokes {{update}} on the quota role sorter.
   => {{DRFSorter::update}} updates the *total* resources at a given slave, among other state. New total resources will be 127MB of unreserved disk and 1MB of reserved disk with a volume. Note that the quota role sorter still thinks the slave has 128MB of unreserved disk.
4. The slave is removed from the cluster. {{HierarchicalAllocatorProcess::removeSlave}} invokes:
{code}
roleSorter->remove(slaveId, slaves[slaveId].total);
quotaRoleSorter->remove(slaveId, slaves[slaveId].total.nonRevocable());
{code}
{{slaves\[slaveId\].total.nonRevocable()}} is 127MB of unreserved disk and 1MB of reserved disk with a volume. When we remove this from the quota role sorter, we're left with total resources on the removed slave of 1MB of unreserved disk, since that is the result of subtracting <127MB unreserved, 1MB reserved+volume> from <128MB unreserved>.

The implications of this can't be good: at minimum, we're leaking resources for removed slaves in the quota role sorter. We're also introducing an inconsistency between {{total_.resources\[slaveId\]}} and {{total_.scalarQuantities}}, since the latter has already stripped out volume/reservation information.

> Quota sorter not updated for resource changes at agent
> ------------------------------------------------------
>
>                 Key: MESOS-5698
>                 URL: https://issues.apache.org/jira/browse/MESOS-5698
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: Neil Conway
>              Labels: mesosphere, quota
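
The removal arithmetic in step 4 of the description can be sketched with a toy model. This is only an illustration, not Mesos code: {{ToyResources}}, {{subtract}} and {{leftoverAfterRemoval}} are hypothetical names, disk is tracked as plain megabyte counts keyed by a label, and the rule that subtracting a label absent from the left-hand side is a no-op is an assumption chosen to match the result stated in the description.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a slave's resources: megabytes of disk keyed by a label
// such as "unreserved" or "reserved+volume". NOT the real Mesos Resources.
using ToyResources = std::map<std::string, long>;

// Subtract `rhs` from `lhs`. ASSUMPTION: an entry of `rhs` whose label is
// absent from `lhs` is ignored, and quantities never drop below zero.
ToyResources subtract(ToyResources lhs, const ToyResources& rhs) {
  for (const auto& entry : rhs) {
    assert(entry.second >= 0);
    auto it = lhs.find(entry.first);
    if (it == lhs.end()) {
      continue;  // Nothing with this label to subtract from.
    }
    it->second -= std::min(it->second, entry.second);
    if (it->second == 0) {
      lhs.erase(it);  // Drop empty entries so "fully removed" means empty.
    }
  }
  return lhs;
}

// Replays steps 1-4 and returns what the toy quota role sorter still holds
// for the slave after the removal.
ToyResources leftoverAfterRemoval(bool quotaSorterSawUpdate) {
  // Step 1: the slave registers with 128MB of unreserved disk.
  const ToyResources initial = {{"unreserved", 128}};

  // Step 3: after the reservation and volume creation, the allocator's view
  // is 127MB unreserved plus 1MB reserved with a volume.
  const ToyResources updated = {{"unreserved", 127}, {"reserved+volume", 1}};

  // The quota role sorter only has the updated view if it was told about it.
  ToyResources quotaSorterTotal = quotaSorterSawUpdate ? updated : initial;

  // Step 4: removal subtracts the allocator's current view of the slave.
  return subtract(quotaSorterTotal, updated);
}
```

With these toy inputs, {{leftoverAfterRemoval(false)}} leaves a single entry of 1MB of unreserved disk, which is the leak described in step 4, while {{leftoverAfterRemoval(true)}} leaves nothing behind.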