Hi Tom,

I suspect you may be tripping the following issue:
https://issues.apache.org/jira/browse/MESOS-4302

Please have a read through this and see if it applies here. You may also be
able to apply the fix to your cluster to see if that helps things.
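
In case it helps while you read, here's a rough sketch of what the DRF
sorter is doing. This is illustrative Python only, not the actual Mesos
allocator code, and the framework names and resource totals below are made
up: the sorter tracks each client's dominant share (its largest fraction of
any single resource type in the cluster) and offers resources to the client
with the lowest dominant share first.

  # Illustrative sketch of DRF-style sorting -- not the Mesos allocator
  # source. Framework names and resource totals are hypothetical.

  # Total resources in the cluster.
  TOTAL = {"cpus": 100.0, "mem": 400.0}

  # Current allocation per framework.
  allocations = {
      "framework-1": {"cpus": 13.0, "mem": 40.0},
      "framework-2": {"cpus": 22.0, "mem": 90.0},
      "framework-4": {"cpus": 0.5, "mem": 2.0},
  }

  def dominant_share(alloc):
      # A client's dominant share is its largest fraction of any one
      # resource type.
      return max(alloc.get(r, 0.0) / TOTAL[r] for r in TOTAL)

  def allocation_order(allocations):
      # Offers go out in ascending order of dominant share, so the client
      # using the least (framework-4 here) is considered first. If it
      # declines and its decline filter is still in effect, the allocator
      # moves on to the next client.
      return sorted(allocations,
                    key=lambda name: dominant_share(allocations[name]))

  print(allocation_order(allocations))
  # ['framework-4', 'framework-1', 'framework-2']

Note that in Mesos the sorting is hierarchical: roles are sorted by dominant
share first, and then the frameworks within each role, so which user/role a
framework registers under does affect how offers are distributed.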

Ben

On Wed, Jan 20, 2016 at 10:19 AM, Tom Arnfeld <t...@duedil.com> wrote:

> Hey,
>
> I've noticed some interesting behaviour recently when we have lots of
> different frameworks connected to our Mesos cluster at once, each holding
> a different share of the cluster. Some of the frameworks don't get offered
> more resources for long periods of time (hours, even), leaving the cluster
> under-utilised.
>
> Here's an example state where we see this happen:
>
> Framework 1 - 13% (user A)
> Framework 2 - 22% (user B)
> Framework 3 - 4% (user C)
> Framework 4 - 0.5% (user C)
> Framework 5 - 1% (user C)
> Framework 6 - 1% (user C)
> Framework 7 - 1% (user C)
> Framework 8 - 0.8% (user C)
> Framework 9 - 11% (user D)
> Framework 10 - 7% (user C)
> Framework 11 - 1% (user C)
> Framework 12 - 1% (user C)
> Framework 13 - 6% (user E)
>
> In this example, there's another ~30% of the cluster that is unallocated,
> and it stays like this for a significant amount of time until something
> changes, perhaps when another user joins and allocates the rest. Chunks of
> this spare resource are offered to some of the frameworks, but not all of
> them.
>
> I had always assumed that when lots of frameworks were involved, the
> frameworks that keep accepting resources indefinitely would eventually
> consume the remaining resources, since every other framework had rejected
> the offers.
>
> Could someone elaborate a little on how the DRF allocator / sorter handles
> this situation? Is it likely to be related to the different users involved?
> Is there a way to mitigate this?
>
> We're running version 0.23.1.
>
> Cheers,
>
> Tom.
>
