[ https://issues.apache.org/jira/browse/MESOS-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989200#comment-14989200 ]
Felix Bechstein commented on MESOS-2353:
----------------------------------------
We are experiencing severe issues too. The master is using most of its CPU
cycles answering the master/state.json and metrics/snapshot requests. It
takes up to 30s to fetch the state.
As a result, the master is slow to send offers.
We noticed that restarting the leader to force a re-election makes the
problem go away for some time.
> Improve performance of the master's state.json endpoint for large clusters.
> ---------------------------------------------------------------------------
>
> Key: MESOS-2353
> URL: https://issues.apache.org/jira/browse/MESOS-2353
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Benjamin Mahler
> Labels: newbie, scalability, twitter
>
> The master's state.json endpoint consistently takes a long time to compute
> the JSON result, for large clusters:
> {noformat}
> $ time curl -s -o /dev/null localhost:5050/master/state.json
> Mon Jan 26 22:38:50 UTC 2015
> real 0m13.174s
> user 0m0.003s
> sys 0m0.022s
> {noformat}
> This can cause the master to get backlogged if there are many state.json
> requests in flight.
> Looking at {{perf}} data, it seems most of the time is spent doing memory
> allocation / de-allocation. This ticket will try to capture any low-hanging
> fruit to speed this up. Possibly we can leverage moves if they are not
> already being used by the compiler.
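
As an illustration of the "leverage moves" idea in the description, here is a minimal sketch of how moving elements into a container avoids per-element copies. It is not taken from the Mesos code base: the `Counted` type and the `buildByCopy` / `buildByMove` helpers are hypothetical names made up for this example, and the copy/move counters exist only to make the difference observable.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a value that owns heap memory (for example a
// JSON string). It counts how often it is copied or moved.
struct Counted {
  static int copies;
  static int moves;

  std::string payload;

  explicit Counted(std::string p) : payload(std::move(p)) {}

  // Copying duplicates the payload, which typically means a heap allocation.
  Counted(const Counted& other) : payload(other.payload) { ++copies; }

  // Moving hands over the existing buffer instead of duplicating it.
  Counted(Counted&& other) noexcept : payload(std::move(other.payload)) {
    ++moves;
  }
};

int Counted::copies = 0;
int Counted::moves = 0;

// Appends each element by copy: one payload copy per element.
std::vector<Counted> buildByCopy(std::size_t n) {
  std::vector<Counted> out;
  out.reserve(n);  // Avoid reallocation so only push_back is counted.
  for (std::size_t i = 0; i < n; ++i) {
    Counted item(std::string(64, 'x'));
    out.push_back(item);  // Lvalue argument: copy constructor.
  }
  return out;
}

// Appends each element by move: the payload buffer is transferred.
std::vector<Counted> buildByMove(std::size_t n) {
  std::vector<Counted> out;
  out.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    Counted item(std::string(64, 'x'));
    out.push_back(std::move(item));  // Rvalue argument: move constructor.
  }
  return out;
}
```

Calling `buildByCopy(n)` increments `Counted::copies` once per element, while `buildByMove(n)` increments only `Counted::moves`. Whether the real state.json code path has copies like this is something the {{perf}} data would have to confirm.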