[
https://issues.apache.org/jira/browse/MESOS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468185#comment-16468185
]
Benjamin Mahler commented on MESOS-3202:
----------------------------------------
Re-opening this ticket to address the case where quota guarantees are not in
use. When nobody has a quota guarantee, many users expect that available
resources will be allocated whenever some role/framework wants them. However,
this is not ensured because of the issue outlined above: we cycle through only
the lowest-share roles/frameworks, which don't want the resources, and starve
the high-share roles/frameworks.

An approach being discussed to address this is to introduce an optional offer
"exhaustiveness" behavior where we ensure each role/framework candidate gets an
offer from the agent before re-offering to those that have already had an offer
from the agent.
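
To make the "exhaustiveness" idea concrete, here is a minimal sketch in Python. It is not Mesos allocator code: the names next_candidate and run_exhaustive are hypothetical, all frameworks are assumed to have equal share (so the DRF order is simply the framework id order), and decline filters are ignored for simplicity.

```python
def next_candidate(candidates_by_share, already_offered):
    """Return the lowest-share candidate not yet offered this agent in the
    current round, starting a new round once everyone has had a turn."""
    for fw in candidates_by_share:
        if fw not in already_offered:
            return fw
    # Every candidate has had an offer from this agent: start a new round.
    already_offered.clear()
    return candidates_by_share[0] if candidates_by_share else None


def run_exhaustive(num_frameworks=10, wants=frozenset({4}), intervals=30):
    """Return the frameworks offered the agent's resources, one per interval.

    Assumption: equal shares, so the DRF order is the framework id order.
    """
    order = list(range(1, num_frameworks + 1))
    already_offered = set()
    offered = []
    for _ in range(intervals):
        fw = next_candidate(order, already_offered)
        if fw is None:
            break  # no candidates at all
        already_offered.add(fw)
        offered.append(fw)
        if fw in wants:
            break  # the framework accepts and the resources are allocated
    return offered


if __name__ == "__main__":
    print(run_exhaustive())
```

Under these assumptions every candidate is reached within one pass over the candidate list, so a framework that wants the resources cannot be skipped indefinitely by lower-share frameworks that keep declining.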
> Avoid frameworks starving in DRF allocator.
> -------------------------------------------
>
> Key: MESOS-3202
> URL: https://issues.apache.org/jira/browse/MESOS-3202
> Project: Mesos
> Issue Type: Bug
> Reporter: Jörg Schad
> Priority: Major
>
> We currently run into issues with the DRF allocator where frameworks do not
> receive offers (see https://github.com/mesosphere/marathon/issues/1931 for
> details).
> Imagine that we have 10 frameworks and unallocated resources from a single
> slave.
> Allocation interval is 1 sec, and refuse_seconds (i.e. the time for which a
> declined resource is filtered) is 3 sec across all frameworks.
> Allocator offers resources to framework 1 (according to DRF) which declines
> the offer immediately.
> In the next allocation interval framework 1 is skipped because of its earlier
> decline. Hence the next framework, framework 2, is offered the resources,
> which it also declines.
> The same happens in the following allocation interval (with framework 3).
> In the next allocation interval the refuse_seconds for framework 1 have
> expired, and as it still has the lowest DRF share it is offered the resource
> again, which it again declines. And the cycle begins again...
> Framework 4 (which is actually waiting for this resource) is never offered
> this resource.
>
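
The scenario in the description can be reproduced with a small simulation. This is a sketch under stated assumptions, not the allocator's actual logic: the function name simulate is hypothetical, all frameworks are assumed to have equal DRF share (so the order is the framework id order), time advances in 1 sec allocation intervals, and a decline at time t filters the framework until t + refuse_seconds.

```python
def simulate(num_frameworks=10, refuse_seconds=3, wants=frozenset({4}),
             intervals=30):
    """Return the framework offered the resource at each allocation interval.

    Assumption: equal shares, so the DRF order is the framework id order.
    """
    # Time until which each framework's decline filter is active.
    filtered_until = {fw: 0 for fw in range(1, num_frameworks + 1)}
    offered = []
    for t in range(intervals):  # one allocation per 1 sec interval
        for fw in range(1, num_frameworks + 1):
            if filtered_until[fw] > t:
                continue  # still filtered after an earlier decline
            offered.append(fw)
            if fw in wants:
                return offered  # accepted, so the resource is allocated
            # Declined immediately: filter this framework for refuse_seconds.
            filtered_until[fw] = t + refuse_seconds
            break  # the single resource was offered once in this interval
    return offered


if __name__ == "__main__":
    offers = simulate()
    print(offers[:9])
    print(4 in offers)
```

With the values from the description (1 sec interval, 3 sec refuse_seconds) the offers cycle through frameworks 1, 2 and 3 and framework 4 never appears. In this simplified model, a refuse_seconds of 4 is already enough for the offer to reach framework 4, which suggests the starvation depends on the ratio between the filter duration and the allocation interval.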