[ https://issues.apache.org/jira/browse/MESOS-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler updated MESOS-6844:
-----------------------------------
Description:
The current allocation strategy is to make "coarse-grained" offers to the
frameworks, wherein each offer will contain all of the resources currently
available on the agent to the framework.
However, this "coarse-grained" invariant does not apply over time as resources
are freed and additional offers can be made, since we make another
"coarse-grained" offer without rescinding any existing outstanding offers.
This leads to fragmentation of the offers for an agent (i.e. it is possible for
there to be multiple offers to one or more frameworks for the available
resources on an agent). There are a number of issues with this:
(1) In the case where the fragmented offers have been sent to multiple
frameworks, it's possible for none of the frameworks to have sufficient
resources to run anything. As the schedulers decline or hold on to these
offers, it may take a long time to make progress.
(2) A simple scheduler may be implemented to operate without holding and
merging offers, since doing so adds complexity (e.g. how long to hold on to
offers? more complex offer management / matching). In this case there are
pathological cases where the framework might not receive the single
un-fragmented offer (each time the allocator makes an offer, it sees an
outstanding offer because the DECLINE has not yet been processed).
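The fragmentation described above can be illustrated with a toy model. This is
a minimal sketch, not the Mesos allocator: the `Agent` class, its
`offer_available` and `free` methods, and the single scalar `cpus` resource are
hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent with one scalar resource (cpus). Hypothetical model."""
    available: float
    # Outstanding offers as (framework, cpus) pairs.
    outstanding: list = field(default_factory=list)

    def offer_available(self, framework: str) -> None:
        """Offer everything currently available, without rescinding anything."""
        if self.available > 0:
            self.outstanding.append((framework, self.available))
            self.available = 0.0

    def free(self, cpus: float) -> None:
        """Resources return to the agent, e.g. when a task finishes."""
        self.available += cpus


agent = Agent(available=2.0)
agent.offer_available("framework-A")  # A is offered 2 cpus
agent.free(2.0)                       # a task finishes, 2 cpus are freed
agent.offer_available("framework-B")  # B is offered the freed 2 cpus

task_cpus = 3.0
# 4 cpus are on offer in total, yet no single offer can fit a 3-cpu task.
total_offered = sum(cpus for _, cpus in agent.outstanding)
can_run = any(cpus >= task_cpus for _, cpus in agent.outstanding)
print(agent.outstanding, total_offered, can_run)
```

Each offer is "coarse-grained" at the moment it is made, but the agent ends up
split across two offers, which is issue (1) above.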
The suggestion in this ticket is to explore imposing the "coarse-grained"
invariant by avoiding fragmenting the offers across multiple frameworks and
even for the same framework (we should look at these somewhat separately). This
can be achieved if the allocator has visibility into the offers and rescinds
outstanding offers for the agent prior to offering additionally freed resources
on the agent.
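A rescind-before-offer step of this kind could look roughly like the following.
Again this is only a sketch under stated assumptions: `Agent`, `rescind_all` and
`offer_unfragmented` are hypothetical names, resources are reduced to a single
cpus scalar, and the choice of which framework receives the new offer (which
the real allocator would base on fairness / quota) is simply passed in.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent that tracks outstanding offers. Hypothetical model."""
    available: float
    outstanding: dict = field(default_factory=dict)  # framework -> offered cpus

    def rescind_all(self) -> None:
        """Return every outstanding offer's resources to the available pool."""
        self.available += sum(self.outstanding.values())
        self.outstanding.clear()

    def offer_unfragmented(self, framework: str) -> None:
        """Rescind outstanding offers first, then make one coarse-grained offer."""
        self.rescind_all()
        if self.available > 0:
            self.outstanding[framework] = self.available
            self.available = 0.0


agent = Agent(available=2.0)
agent.offer_unfragmented("framework-A")  # A is offered 2 cpus
agent.available += 2.0                   # a task finishes, 2 cpus are freed
agent.offer_unfragmented("framework-A")  # old offer rescinded, one new offer made
print(agent.outstanding)
```

At any point in time there is at most one outstanding offer for the agent, and
it contains all of the agent's available resources.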
Note, however, that this also has some negative implications for scheduling
throughput. Consider the case where there is a high degree of churn on an agent
due to a large number of small, short-lived tasks. In this case, the framework
would experience a lot of scheduling interference as it tries to accept offers
but the offers are rescinded frequently as the allocator attempts to
un-fragment the offers. There may be ways to mitigate this: if we had a
mechanism for "swapping" an offer to a scheduler, then we could still allow
operations that were sent before the scheduler saw its offer swapped for one
with more resources. We would have to try to stick to the same scheduler for an
agent so that we swap the offer for a single scheduler rather than rescinding
from one scheduler and sending a new offer to a different scheduler. It may be
that different frameworks desire different behavior here.
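One way to picture the "swapping" idea is the sketch below. No such mechanism
exists in this ticket's description beyond the idea itself, so everything here
is an assumption: the `OfferTable` class, its `offer`, `swap` and `accept`
methods, and the integer offer ids are hypothetical. As a further
simplification, an accepted operation just subtracts cpus and leaves the
remainder on offer.

```python
class OfferTable:
    """Hypothetical allocator-side bookkeeping for one framework on one agent."""

    def __init__(self) -> None:
        self.next_id = 0
        self.current_id = None  # id of the live offer, if any
        self.current_cpus = 0.0
        self.swapped = {}       # old offer id -> id of the offer that replaced it

    def offer(self, cpus: float) -> int:
        """Make a new live offer and return its id."""
        self.next_id += 1
        self.current_id, self.current_cpus = self.next_id, cpus
        return self.current_id

    def swap(self, extra_cpus: float) -> int:
        """Replace the live offer with a larger one instead of rescinding it."""
        old_id = self.current_id
        if old_id is None:
            raise ValueError("no live offer to swap")
        new_id = self.offer(self.current_cpus + extra_cpus)
        self.swapped[old_id] = new_id
        return new_id

    def accept(self, offer_id: int, cpus: float) -> bool:
        """Apply an operation, following any swaps made after it was sent."""
        while offer_id in self.swapped:
            offer_id = self.swapped[offer_id]
        if offer_id != self.current_id or cpus > self.current_cpus:
            return False
        self.current_cpus -= cpus  # toy simplification: remainder stays on offer
        return True


table = OfferTable()
first = table.offer(2.0)
second = table.swap(2.0)       # 2 more cpus are freed; the offer grows to 4
ok = table.accept(first, 1.5)  # sent before the scheduler saw the swap
print(ok, table.current_cpus)
```

Because the replacement offer is a superset of the one it replaces, an
operation that was valid against the old offer remains valid against the new
one, so it does not need to be rejected the way it would be after a rescind.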
This problem should also be examined in the context of optimistic resource
allocation.
was:
The current allocation strategy is to make "coarse-grained" offers to the
frameworks, wherein each offer will contain all of the resources currently
available on the agent to the framework.
However, this "coarse-grained" invariant does not apply over time as resources
are freed and additional offers can be made, since we make another
"coarse-grained" offer without rescinding any existing outstanding offers.
This leads to fragmentation of the offers for an agent (i.e. it is possible for
there to be multiple offers to one or more frameworks for the available
resources on an agent). There are a number of issues with this:
(1) In the case where the fragmented offers have been sent to multiple
frameworks, it's possible for none of the frameworks to have sufficient
resources to run anything. As the schedulers decline or hold on to these
offers, it may take a long time to make progress.
(2) A simple scheduler may be implemented to only operate without holding and
merging offers since this is more complex (e.g. how long to hold on to offers?
more complex offer management / matching). In this case there are some
pathological cases where the framework might not receive the single
un-fragmented offer (when each time the allocator makes an offer, it sees an
outstanding offer already as the DECLINE has not yet been processed).
The suggestion in this ticket is to explore imposing the "coarse-grained"
invariant by avoiding fragmenting the offers across multiple frameworks and
even for the same framework (we should look at these somewhat separately). This
can be achieved if the allocator has visibility into the offers and rescinds
outstanding offers for the agent prior to offering additionally freed resources
on the agent.
Note however, that this also has some negative implications for scheduling
throughput. Consider the case where there is a high degree of churn on an agent
due to a large number of small, short-lived tasks. In this case, the framework
would experience a lot of scheduling interference as it tries to accept offers
but the offers are rescinded frequently as the allocator attempts to
un-fragment the offers. There may be ways to mitigate this; for example, we
could allow operations on the rescinded offers so long as the operation can
still be applied and the allocation constraints (fairness / quota) are not
violated, but this needs more exploration. It may be that different frameworks
desire different behavior here.
This problem should also be examined in the context of optimistic resource
allocation.
> Avoid offer fragmentation between multiple frameworks / within a single
> framework.
> ----------------------------------------------------------------------------------
>
> Key: MESOS-6844
> URL: https://issues.apache.org/jira/browse/MESOS-6844
> Project: Mesos
> Issue Type: Epic
> Components: allocation
> Reporter: Benjamin Mahler