Andrei Sekretenko created MESOS-10015:
-----------------------------------------

             Summary: HierarchicalAllocatorProcess::updateAvailable() can stall 
the allocator with a huge number of reservations on an agent.
                 Key: MESOS-10015
                 URL: https://issues.apache.org/jira/browse/MESOS-10015
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.9.0, 1.8.1, 1.7.2, 1.6.2, 1.5.3
            Reporter: Andrei Sekretenko
            Assignee: Andrei Sekretenko


Currently, a single updateAvailable() call for a Resources containing one 
object, for one framework on one agent, requires `(total number of frameworks) 
* (number of resource objects on this agent)^2` calls to `Resource::addable()`.

In a cluster with a large number of frameworks, this results in severe 
degradation of allocator performance when many RESERVE/UNRESERVE operations 
occur for an agent with hundreds of unique resources.

On our testing cluster, we observed task scheduling delays of up to 30 minutes 
due to the allocator being occupied with processing UNRESERVE operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
