[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627102#comment-14627102 ] Stephan Erb commented on MESOS-2940: The code seems to re-initialize the generator for each call to UUID::random (https://github.com/apache/mesos/blob/c75d887f9ad4ce0806882c2541c3dc3eff443f37/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp#L31). That does not seem right. AFAIK, one should seed it only once and reuse a single generator across calls.

Reconciliation is expensive for large numbers of tasks.
---
Key: MESOS-2940
URL: https://issues.apache.org/jira/browse/MESOS-2940
Project: Mesos
Issue Type: Improvement
Components: master
Reporter: Benjamin Mahler
Assignee: Benjamin Mahler
Priority: Critical
Labels: twitter
Fix For: 0.23.0
Attachments: perf-kernel.svg

We've observed that both implicit and explicit reconciliation are expensive for large numbers of tasks:

{noformat: title=Explicit O(100,000) tasks: 70secs}
I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state reconciliation for N tasks of framework F (NAME) at S@IP:PORT
I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R of framework F on slave S at slave(1)@IP:PORT (HOST)
{noformat}

{noformat: title=Implicit with O(100,000) tasks: 60secs}
I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state reconciliation for framework F (NAME) at S@IP:PORT
I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S due to health check timeout
{noformat}

Let's add a benchmark to see if there are any bottlenecks here, and to guide improvements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627115#comment-14627115 ] Stephan Erb commented on MESOS-2940: Is using a thread local generator sufficient to guarantee that those UUIDs will truly be unique?
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627141#comment-14627141 ] Kevin Sweeney commented on MESOS-2940: [~bmahler] Since this change resulted in some wire-format backwards-incompatibility would it make sense to back it out in favor of the fast pRNG solution? Or do you think that ship has already sailed?
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627106#comment-14627106 ] Kevin Sweeney commented on MESOS-2940: It's creating and seeding a new Mersenne Twister RNG on every call to UUID::random(), so you're not using the pRNG at all, you're using expensive entropy.

From https://github.com/3rdparty/stout/blob/master/include/stout/uuid.hpp#L18

{code}
return UUID(boost::uuids::random_generator()());
{code}

This calls the boost::uuids::basic_random_generator<mt19937>() default constructor, which will always seed the provided RNG:

{code}
{
    // seed the random number generator
    detail::seed(*pURNG);
}
{code}

This uses http://www.boost.org/doc/libs/1_53_0/boost/uuid/seed_rng.hpp which, according to the comments in that header, is slow:

{noformat}
// It creates random numbers from a sha1 hash of data from a variary of sources,
// all of which are standard function calls. It produces random numbers slowly.
// Peter Dimov provided the details of sha1_random_digest_().
// see http://archives.free.net.ph/message/20070507.175609.4c4f503a.en.html
{noformat}

To improve performance you'd want to use one of the explicit constructors, {{explicit basic_random_generator(UniformRandomNumberGenerator* pGen)}}, and provide an instance of a UniformRandomNumberGenerator. You probably want this to be a thread-local, as generator engines are stateful.

{code}
// keep a pointer to a random number generator
// don't seed a given random number generator
explicit basic_random_generator(UniformRandomNumberGenerator* pGen)
{code}
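The cost difference described in the comment above can be sketched with the C++ standard library alone. This is only an illustration under stated assumptions: std::mt19937 and std::random_device stand in for the boost generator and its seed_rng, and the helper names draw_reseeding, draw_reusing and time_us are hypothetical, not part of stout or boost.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Slow path (assumed analogue of the default-constructed boost generator):
// a fresh engine is constructed and seeded from an entropy source per call.
inline std::uint32_t draw_reseeding() {
  std::random_device entropy;
  std::mt19937 engine(entropy());
  return static_cast<std::uint32_t>(engine());
}

// Fast path: the caller owns one engine, seeded once, and passes it in.
// This mirrors the idea of the explicit constructor that takes a pointer
// to an existing generator and does not seed it again.
inline std::uint32_t draw_reusing(std::mt19937& engine) {
  return static_cast<std::uint32_t>(engine());
}

// Run `f` `calls` times and return the elapsed wall time in microseconds.
template <typename F>
long long time_us(int calls, F f) {
  assert(calls >= 0);
  volatile std::uint32_t sink = 0;  // keeps the calls from being elided
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < calls; ++i) {
    sink = f();
  }
  const auto stop = std::chrono::steady_clock::now();
  (void)sink;
  return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
      .count();
}
```

A caller could compare time_us(100000, draw_reseeding) with time_us(100000, [&engine] { return draw_reusing(engine); }) for an engine seeded once up front. The absolute numbers depend on the platform's entropy source, so no figures are claimed here.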
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627118#comment-14627118 ] Benjamin Mahler commented on MESOS-2940: Filed: MESOS-3046
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627075#comment-14627075 ] Benjamin Mahler commented on MESOS-2940: We wrap boost's 'uuid' which uses the Mersenne Twister algorithm for PRNG, from what I can tell: http://www.boost.org/doc/libs/1_53_0/boost/uuid/random_generator.hpp http://www.boost.org/doc/libs/1_53_0/boost/random/mersenne_twister.hpp
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627116#comment-14627116 ] Benjamin Mahler commented on MESOS-2940: Thanks for spotting this [~StephanErb] and [~kevints]!
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627192#comment-14627192 ] Kevin Sweeney commented on MESOS-2940: Aha, sounds like it's already taken care of (apologies for not seeing that, still catching up on email after vacation).
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627203#comment-14627203 ] Benjamin Mahler commented on MESOS-2940: No worries, glad you guys took a look here and found the UUID seeding issue I missed! Hope you had a good vacation.. :)
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627182#comment-14627182 ] Benjamin Mahler commented on MESOS-2940: The backwards incompatibility was a bug on my part and was fixed in MESOS-3025, which is getting cherry-picked for 0.23.0. Is there still a backwards incompatibility? We've also been planning to stop setting the UUID for StatusUpdateMessages that do not need acknowledgments, similar to what is being done for [TaskStatus.uuid|https://github.com/apache/mesos/blob/0.23.0-rc3/include/mesos/mesos.proto#L930].
[jira] [Commented] (MESOS-2940) Reconciliation is expensive for large numbers of tasks.
[ https://issues.apache.org/jira/browse/MESOS-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627169#comment-14627169 ] Kevin Sweeney commented on MESOS-2940: Yes, as long as each is seeded on its own thread. The stout implementation could be changed to something like:

{code}
public:
  static UUID random()
  {
    return UUID(uuid_generator());
  }
  //...
private:
  // Must be static to be thread_local as a class member;
  // it also needs an out-of-class definition.
  static thread_local boost::uuids::random_generator uuid_generator;
  //...
{code}

Then each new thread would seed a new random engine once and use the engine for all future uuid generation.
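The thread-local idea from the comment above can be sketched without boost. This is an illustration under assumptions, not the stout implementation: std::mt19937_64 stands in for the boost engine, a pair of 64-bit words stands in for a UUID payload, and the names Id, seed_count, make_seeded_engine, random_id and check_distinct are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>

// Hypothetical stand-in for a 128-bit UUID payload.
using Id = std::array<std::uint64_t, 2>;

// Counts how often an engine gets seeded (for illustration only).
std::atomic<int> seed_count{0};

// Builds an engine seeded from std::random_device. With the function-local
// thread_local below, this runs once per thread that calls random_id().
inline std::mt19937_64 make_seeded_engine() {
  std::random_device entropy;
  std::seed_seq seq{entropy(), entropy(), entropy(), entropy()};
  ++seed_count;
  return std::mt19937_64(seq);
}

// Each thread lazily constructs its own engine on first use and then reuses
// it, so the seeding cost is paid once per thread and no locking is needed.
inline Id random_id() {
  static thread_local std::mt19937_64 engine = make_seeded_engine();
  return Id{engine(), engine()};
}

// Usage sketch: draw n IDs on the calling thread and check they differ.
inline void check_distinct(std::size_t n) {
  std::set<Id> seen;
  for (std::size_t i = 0; i < n; ++i) {
    seen.insert(random_id());
  }
  assert(seen.size() == n);
}
```

A function-local static thread_local sidesteps the out-of-class definition that a static data member would need. Uniqueness across threads still rests on each thread's engine receiving an independent seed, which is the condition stated in the comment.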