[
https://issues.apache.org/jira/browse/MESOS-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neil Conway updated MESOS-5698:
-------------------------------
Description:
Consider this sequence of events:
1. Slave connects, with 128MB of disk.
2. Master offers the slave's resources to a framework.
3. Framework creates a dynamic reservation for 1MB and a persistent volume of
the same size on the slave's resources.
=> This invokes {{Master::apply}}, which invokes
{{allocator->updateAllocation}}, which invokes {{Sorter::update()}} on the
framework sorter and role sorter. If the framework's role has a configured
quota, it also invokes {{update}} on the quota role sorter -- in this case, the
framework's role has no quota, so the quota role sorter is *not* updated.
=> {{DRFSorter::update}} updates the *total* resources at a given slave,
among other state. The new total resources will be 127MB of unreserved
disk and 1MB of reserved disk with a volume. Note that the quota role sorter
still thinks the slave has 128MB of unreserved disk.
4. The slave is removed from the cluster.
{{HierarchicalAllocatorProcess::removeSlave}} invokes:
{code}
roleSorter->remove(slaveId, slaves[slaveId].total);
quotaRoleSorter->remove(slaveId, slaves[slaveId].total.nonRevocable());
{code}
{{slaves\[slaveId\].total.nonRevocable()}} is 127MB of unreserved disk and 1MB
of reserved disk with a volume. When we remove this from the quota role sorter,
we're left with total resources on the removed slave of 1MB of unreserved
disk, since that is the result of subtracting <127MB unreserved, 1MB
reserved+volume> from <128MB unreserved>.
The implications of this can't be good: at minimum, we're leaking resources for
removed slaves in the quota role sorter. We're also introducing an
inconsistency between {{total_.resources\[slaveId\]}} and
{{total_.scalarQuantities}}, since the latter has already stripped out
volume/reservation information.
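To make the arithmetic concrete, here is a minimal toy model of the removal step. It is not the Mesos {{Resources}} or {{DRFSorter}} code: {{ToyResources}}, {{subtract}}, {{strippedQuantity}}, {{ToyQuotaSorterTotal}} and {{removeStaleSlave}} are hypothetical names, and the sketch assumes that subtraction only reduces entries with an identical reservation/volume label and that the scalar total is decremented by the label-stripped quantity.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model: a label such as "disk" or
// "disk(reserved+volume)" mapped to a quantity in MB.
using ToyResources = std::map<std::string, double>;

// Subtract `rhs` from `lhs`. An entry in `rhs` only reduces the entry in
// `lhs` with the same label; entries without a match are ignored
// (assumption made for this sketch).
ToyResources subtract(ToyResources lhs, const ToyResources& rhs) {
  for (const auto& entry : rhs) {
    auto it = lhs.find(entry.first);
    if (it == lhs.end()) {
      continue;
    }
    it->second = std::max(0.0, it->second - entry.second);
    if (it->second == 0.0) {
      lhs.erase(it);
    }
  }
  return lhs;
}

// Sum of all entries with the labels ignored; stands in for a quantity
// with volume/reservation information stripped out.
double strippedQuantity(const ToyResources& resources) {
  double sum = 0.0;
  for (const auto& entry : resources) {
    sum += entry.second;
  }
  return sum;
}

struct ToyQuotaSorterTotal {
  ToyResources resources;   // stands in for total_.resources[slaveId]
  double scalarQuantities;  // stands in for total_.scalarQuantities
};

// Replays step 4: the quota role sorter never saw the reservation, so it
// still holds 128MB of unreserved disk when the slave is removed.
ToyQuotaSorterTotal removeStaleSlave() {
  ToyQuotaSorterTotal total;
  total.resources["disk"] = 128.0;
  total.scalarQuantities = 128.0;

  // What the allocator believes the slave's total is at removal time.
  ToyResources current;
  current["disk"] = 127.0;
  current["disk(reserved+volume)"] = 1.0;

  total.resources = subtract(total.resources, current);
  total.scalarQuantities -= strippedQuantity(current);
  return total;
}
```

Under these assumptions, {{removeStaleSlave()}} leaves 1MB of unreserved disk in the per-slave entry while the stripped scalar total drops to zero, which is the leak and the inconsistency described above.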
was:
Consider this sequence of events:
1. Slave connects, with 128MB of disk.
2. Master offers resources at slave to framework
3. Framework creates a dynamic reservation for 1MB and a persistent volume of
the same size on the slave's resources.
=> This invokes {{Master::apply}}, which invokes
{{allocator->updateAllocation}}, which invokes {{Sorter::update()}} on the
framework sorter and role sorter. If the role has a configured quota, it also
invokes {{update}} on the quota role sorter.
=> {{DRFSorter::update}} updates the *total* resources at a given slave,
among updating other state. New total resources will be 127MB of unreserved
disk and 1MB of reserved disk with a volume. Note that the quota role sorter
still thinks the slave has 128MB of unreserved disk.
4. The slave is removed from the cluster.
{{HierarchicalAllocatorProcess::removeSlave}} invokes:
{code}
roleSorter->remove(slaveId, slaves[slaveId].total);
quotaRoleSorter->remove(slaveId, slaves[slaveId].total.nonRevocable());
{code}
{{slaves\[slaveId\].total.nonRevocable()}} is 127MB of unreserved disk and 1MB
of reserved disk with a volume. When we remove this from the quota role sorter,
we're left with total resources on the reserved slave of 1MB of unreserved
disk, since that is the result of subtracting <127MB unreserved, 1MB
reserved+volume> from <128MB unreserved>.
The implications of this can't be good: at minimum, we're leaking resources for
removed slaves in the quota role sorter. We're also introducing an
inconsistency between {{total_.resources\[slaveId\]}} and
{{total_.scalarQuantities}}, since the latter has already stripped-out
volume/reservation information.
> Quota sorter not updated for resource changes at agent
> ------------------------------------------------------
>
> Key: MESOS-5698
> URL: https://issues.apache.org/jira/browse/MESOS-5698
> Project: Mesos
> Issue Type: Bug
> Components: allocation
> Reporter: Neil Conway
> Labels: mesosphere, quota