[ 
https://issues.apache.org/jira/browse/MESOS-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593660#comment-14593660
 ] 

Jie Yu commented on MESOS-2891:
-------------------------------

According to the flame graph [~bmahler] attached, it seems that the most 
expensive calculation is here:
https://github.com/apache/mesos/blob/master/src/master/allocator/sorter/drf/sorter.cpp#L286

In a cluster with tens of thousands of slaves, summing the entire hashmap 
returned from 'allocation[name]' is definitely expensive.
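
To illustrate the cost, here is a minimal sketch of one possible mitigation:
keeping a running total next to the per-slave map so that the sum does not
have to be recomputed on every call. 'Res' and 'ClientAllocation' are
hypothetical stand-ins invented for this example. They are not the types used
in sorter.cpp, and this is not necessarily how the sorter should be changed.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a resource vector; the real Mesos 'Resources'
// type is far richer than two doubles.
struct Res {
  double cpus;
  double mem;

  Res(double c = 0.0, double m = 0.0) : cpus(c), mem(m) {}

  Res& operator+=(const Res& r) { cpus += r.cpus; mem += r.mem; return *this; }
  Res& operator-=(const Res& r) { cpus -= r.cpus; mem -= r.mem; return *this; }
};

// Hypothetical per-client allocation that updates a cached total whenever
// the per-slave map changes.
class ClientAllocation {
public:
  void add(const std::string& slaveId, const Res& r) {
    perSlave[slaveId] += r;
    total += r;
  }

  void subtract(const std::string& slaveId, const Res& r) {
    perSlave[slaveId] -= r;
    total -= r;
  }

  // O(number of slaves): walks the whole map, which is the expensive
  // pattern when there are tens of thousands of entries.
  Res sumAll() const {
    Res sum;
    for (const auto& entry : perSlave) {
      sum += entry.second;
    }
    return sum;
  }

  // O(1): returns the total maintained incrementally by add/subtract.
  const Res& cachedTotal() const { return total; }

private:
  std::unordered_map<std::string, Res> perSlave;
  Res total;
};
```

The trade-off is a small amount of extra bookkeeping on every update in
exchange for a constant-time read wherever the sum is needed.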

We could also try to optimize Resources::operator +=. Currently, if the 
right-hand side of the operator is a single 'resource', we call 'validate' on 
the resource before performing the operation. The validation is expensive, and 
most of the time it is unnecessary because the 'resource' has already been 
validated.
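
A rough sketch of the idea, assuming a heavily simplified model: validate a
single resource when it first enters a collection, and skip the check when
adding resources that already live in a validated collection. The 'Resource'
and 'Resources' types, 'addUnchecked', and the 'validateCalls' counter are
all hypothetical and exist only for this illustration. The real Mesos types
are protobuf-based and behave differently.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified resource; not the Mesos protobuf message.
struct Resource {
  std::string name;
  double value;
};

// Counts validation calls so the saving can be observed.
int validateCalls = 0;

// Stand-in for an expensive validity check.
bool validate(const Resource& r) {
  ++validateCalls;
  return !r.name.empty() && r.value >= 0.0;
}

class Resources {
public:
  // Checked path: a single resource of unknown origin is validated first,
  // and silently ignored if it is invalid.
  Resources& operator+=(const Resource& r) {
    if (validate(r)) {
      addUnchecked(r);
    }
    return *this;
  }

  // Unchecked path: every member of 'that' was validated when it was
  // added, so the check is skipped here.
  Resources& operator+=(const Resources& that) {
    for (const Resource& r : that.items) {
      addUnchecked(r);
    }
    return *this;
  }

  // Returns the value stored under 'name', or 0.0 if absent.
  double get(const std::string& name) const {
    for (const Resource& r : items) {
      if (r.name == name) {
        return r.value;
      }
    }
    return 0.0;
  }

private:
  // Merges 'r' into an existing entry with the same name, or appends it.
  void addUnchecked(const Resource& r) {
    for (Resource& existing : items) {
      if (existing.name == r.name) {
        existing.value += r.value;
        return;
      }
    }
    items.push_back(r);
  }

  std::vector<Resource> items;
};
```

With this split, summing many already-validated collections never calls
'validate', while input of unknown origin is still checked once on entry.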

> Performance regression in hierarchical allocator.
> -------------------------------------------------
>
>                 Key: MESOS-2891
>                 URL: https://issues.apache.org/jira/browse/MESOS-2891
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation, master
>            Reporter: Benjamin Mahler
>            Priority: Blocker
>              Labels: twitter
>         Attachments: Screen Shot 2015-06-18 at 5.02.26 PM.png, perf-kernel.svg
>
>
> For large clusters, the 0.23.0 allocator cannot keep up with the volume of 
> slaves. After the following slave was re-registered, it took the allocator a 
> long time to work through the backlog of slaves to add:
> {noformat:title=45 minute delay}
> I0618 18:55:40.738399 10172 master.cpp:3419] Re-registered slave 
> 20150422-211121-2148346890-5050-3253-S4695
> I0618 19:40:14.960636 10164 hierarchical.hpp:496] Added slave 
> 20150422-211121-2148346890-5050-3253-S4695
> {noformat}
> Empirically, 
> [addSlave|https://github.com/apache/mesos/blob/dda49e688c7ece603ac7a04a977fc7085c713dd1/src/master/allocator/mesos/hierarchical.hpp#L462]
>  and 
> [updateSlave|https://github.com/apache/mesos/blob/dda49e688c7ece603ac7a04a977fc7085c713dd1/src/master/allocator/mesos/hierarchical.hpp#L533]
>  have become expensive.
> Some timings from a production cluster reveal that the allocator spends in 
> the low tens of milliseconds on each call to {{addSlave}} and 
> {{updateSlave}}; with tens of thousands of slaves, this amounts to 
> the large delay seen above.
> We also saw a slow steady increase in memory consumption, hinting further at 
> a queue backup in the allocator.
> A synthetic benchmark like the one we did for the registrar would be prudent 
> here, along with visibility into the allocator's queue size.


