[
https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627102#comment-14627102
]
Stephan Erb commented on MESOS-2940:
------------------------------------
The code appears to re-initialize the random generator on every call to
UUID::random
(https://github.com/apache/mesos/blob/c75d887f9ad4ce0806882c2541c3dc3eff443f37/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp#L31).
That does not seem right. AFAIK, the generator should be seeded only once and
a single instance reused across calls.
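
For illustration, here is a minimal sketch of the "seed once, reuse" pattern.
It uses only the C++ standard library, not the generator that uuid.hpp
actually uses. The function name randomUuidBytes and the byte-array return
type are hypothetical and are not part of stout's UUID API. The generator is
thread_local, so it is constructed and seeded on the first call in each thread
and reused afterwards. Seeding from a single std::random_device value is a
simplification made for brevity.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper (not stout's API): returns 16 random bytes with the
// version and variant bits of a version 4 UUID set.
std::array<std::uint8_t, 16> randomUuidBytes()
{
  // Constructed and seeded once per thread, on the first call; every later
  // call in the same thread reuses this generator instead of re-seeding.
  thread_local std::mt19937_64 generator(std::random_device{}());

  std::array<std::uint8_t, 16> bytes{};

  // Fill the 16 bytes from two 64-bit draws.
  for (std::size_t i = 0; i < bytes.size(); i += 8) {
    const std::uint64_t word = generator();
    for (std::size_t j = 0; j < 8; ++j) {
      bytes[i + j] = static_cast<std::uint8_t>(word >> (8 * j));
    }
  }

  // Mark as version 4 (random) with the RFC 4122 variant.
  bytes[6] = static_cast<std::uint8_t>((bytes[6] & 0x0F) | 0x40);
  bytes[8] = static_cast<std::uint8_t>((bytes[8] & 0x3F) | 0x80);

  return bytes;
}
```

Making the generator thread_local avoids both the per-call seeding cost and
the need for a lock around shared generator state. Whether that trade-off
suits stout is an assumption that the benchmark should confirm.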
> Reconciliation is expensive for large numbers of tasks.
> -------------------------------------------------------
>
> Key: MESOS-2940
> URL: https://issues.apache.org/jira/browse/MESOS-2940
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
> Priority: Critical
> Labels: twitter
> Fix For: 0.23.0
>
> Attachments: perf-kernel.svg
>
>
> We've observed that both implicit and explicit reconciliation are expensive
> for large numbers of tasks:
> {noformat: title=Explicit O(100,000) tasks: 70secs}
> I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state
> reconciliation for N tasks of framework F (NAME) at S@IP:PORT
> I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R
> of framework F on slave S at slave(1)@IP:PORT (HOST)
> {noformat}
> {noformat: title=Implicit with O(100,000) tasks: 60secs}
> I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state
> reconciliation for framework F (NAME) at S@IP:PORT
> I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S
> due to health check timeout
> {noformat}
> Let's add a benchmark to see if there are any bottlenecks here, and to guide
> improvements.