[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.

Kevin Sweeney (JIRA) Tue, 14 Jul 2015 14:52:23 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627106#comment-14627106
 ]


Kevin Sweeney commented on MESOS-2940:
--------------------------------------

It's creating and seeding a new Mersenne Twister RNG on every call to 
UUID::random(), so you're not getting any benefit from the pRNG: every call 
pays for expensive entropy gathering instead.

From https://github.com/3rdparty/stout/blob/master/include/stout/uuid.hpp#L18
{code}
    return UUID(boost::uuids::random_generator()());
{code}

This calls the boost::uuids::basic_random_generator<mt19937>() default 
constructor, which allocates its own RNG and always seeds it:
{code}
    {
        // seed the random number generator
        detail::seed(*pURNG);
    }
{code}

This uses http://www.boost.org/doc/libs/1_53_0/boost/uuid/seed_rng.hpp which, 
according to its own documentation, is very slow:
{noformat}
// It creates random numbers from a sha1 hash of data from a variary of sources,
// all of which are standard function calls.  It produces random numbers slowly.
// Peter Dimov provided the details of sha1_random_digest_().
// see http://archives.free.net.ph/message/20070507.175609.4c4f503a.en.html
{noformat}

To improve performance you'd want to use one of the explicit constructors, 
{{explicit basic_random_generator(UniformRandomNumberGenerator* pGen)}}, and 
provide an instance of a UniformRandomNumberGenerator that is seeded once and 
then reused. You probably want this to be a thread-local, as generator engines 
are stateful and not safe to share across threads without synchronization.

{code}
    // keep a pointer to a random number generator
    // don't seed a given random number generator
    explicit basic_random_generator(UniformRandomNumberGenerator* pGen)
{code}
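
A minimal sketch of the seed-once, thread-local pattern, written against the 
standard library only so that it compiles without Boost. The names Uuid, 
thread_engine and random_uuid are made up for illustration and are not stout 
or Boost APIs. With Boost you would instead pass a pointer to a thread-local 
engine like this one to the explicit constructor quoted above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical stand-in for a 128-bit UUID value.
using Uuid = std::array<std::uint8_t, 16>;

// One engine per thread. The initializer runs only on the first call made
// by each thread, so OS entropy is consumed once per thread, not once per
// UUID. (A single 32-bit seed is a simplification for this sketch.)
inline std::mt19937& thread_engine() {
    thread_local std::mt19937 engine{std::random_device{}()};
    return engine;
}

// Fills 16 bytes from the cached engine and stamps the version and
// variant bits of a random (version 4) UUID.
inline Uuid random_uuid() {
    std::mt19937& engine = thread_engine();
    Uuid id{};
    for (std::size_t i = 0; i < id.size(); i += 4) {
        // mt19937 yields 32 bits per call; split them into four bytes.
        const std::uint32_t word = static_cast<std::uint32_t>(engine());
        for (std::size_t j = 0; j < 4; ++j) {
            id[i + j] = static_cast<std::uint8_t>((word >> (8 * j)) & 0xFF);
        }
    }
    id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | 0x40);  // version 4
    id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);  // variant 10xx
    return id;
}
```

Calling random_uuid() repeatedly on one thread reuses the same engine, so only 
the first call on that thread touches the entropy source.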

> Reconciliation is expensive for large numbers of tasks.
> -------------------------------------------------------
>
>                 Key: MESOS-2940
>                 URL: https://issues.apache.org/jira/browse/MESOS-2940
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: twitter
>             Fix For: 0.23.0
>
>         Attachments: perf-kernel.svg
>
>
> We've observed that both implicit and explicit reconciliation are expensive 
> for large numbers of tasks:
> {noformat: title=Explicit O(100,000) tasks: 70secs}
> I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state 
> reconciliation for N tasks of framework F (NAME) at S@IP:PORT
> I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R 
> of framework F on slave S at slave(1)@IP:PORT (HOST)
> {noformat}
> {noformat: title=Implicit with O(100,000) tasks: 60secs}
> I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state 
> reconciliation for framework F (NAME) at S@IP:PORT
> I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S 
> due to health check timeout
> {noformat}
> Let's add a benchmark to see if there are any bottlenecks here, and to guide 
> improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
