> On April 18, 2017, 12:21 a.m., Benjamin Mahler wrote: > > Thanks for taking this on Ilya! Just a few suggestions below per our > > offline discussion. It would be great to update the description to include > > the context, mainly that we avoid the `protobuf::State` wrapper in favor of > > performing manual serialization/deserialization in the registrar, in order > > to avoid expensive protobuf copying. Also that, we should explore > > performance improvements to `protobuf::State` in order to make it usable in > > the registrar without regressing on the performance of this approach.
Thanks for the review! Done. > On April 18, 2017, 12:21 a.m., Benjamin Mahler wrote: > > src/master/registrar.cpp > > Line 215 (original), 217-218 (patched) > > <https://reviews.apache.org/r/58355/diff/1/?file=1688435#file1688435line218> > > > > It would be nice to store the `State` class alongside these and put the > > TODO here: > > > > ``` > > // TODO(<username>): We use the "untyped" `State` class here and > > perform > > // the protobuf (de)serialization manually within the Registrar, > > because > > // the use of `protobuf::State` incurs a dramatic peformance cost from > > // protobuf copying. We should explore using `protobuf::State`, which > > will > > // require move support and other copy elimination to maintain the > > // performance of the current approach. > > State* state; > > > > // Per the TODO above, we store both serialized and deserialized > > versions > > // of the `Registry` protobuf. If we're able to move to > > `protobuf::State`, > > // we could just store a single `protobuf::state::Variable<Registry>`. > > Option<Variable> variable; > > Option<Registry> registry; > > ``` > > > > By the way, would be nice if these both use or don't use underscores > > consistently. Done. `registry` without underscore conflicted with private `registry()` method. Renamed the method. > On April 18, 2017, 12:21 a.m., Benjamin Mahler wrote: > > src/master/registrar.cpp > > Line 334 (original), 337 (patched) > > <https://reviews.apache.org/r/58355/diff/1/?file=1688435#file1688435line338> > > > > Chatted with Ilya offline about using the raw untyped `State` class > > rather than using the protobuf class and bypassing the overridden method > > here. > > > > We can then frame this patch as being an avoidance of the protobuf > > `State` wrapper due to the extra expense incurred from protobuf copying, > > and leave a TODO in the code that captures this for posterity. Avoided `protobuf::State` in favor of "untyped" `State`. > On April 18, 2017, 12:21 a.m., Benjamin Mahler wrote: > > src/master/registrar.cpp > > Lines 449-450 (original), 464-465 (patched) > > <https://reviews.apache.org/r/58355/diff/1/?file=1688435#file1688435line465> > > > > Should we say we use an Owned here to avoid copying, since protobuf > > doesn't support move construction? Added a comment. - Ilya ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/58355/#review172147 ----------------------------------------------------------- On April 18, 2017, 10:11 p.m., Ilya Pronin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/58355/ > ----------------------------------------------------------- > > (Updated April 18, 2017, 10:11 p.m.) > > > Review request for mesos and Benjamin Mahler. > > > Bugs: MESOS-7376 > https://issues.apache.org/jira/browse/MESOS-7376 > > > Repository: mesos > > > Description > ------- > > When the number of agents is large every `Registry` copy operation takes > a lot of time (~0.4 sec with 55k agents), because it involves deep > copying a big object tree. Because of that, the use of `protobuf::State` > in `Registrar` incurs a dramatic performance cost from multiple protobuf > copying. > > This patch drops the use of `protobuf::State` in `Registrar` in favor of > "untyped" `State` and manual serialization/deserialization in order to > minimize `Registry` copying and keep registry update timings at > acceptable values. > > Performance improvements to `protobuf::State` should be explored in > order to make it usable in the registrar without regressing on the > performance of this approach. > > > Diffs > ----- > > src/master/main.cpp 90d159e248924f6806271dda1ced6d37dfa5f5c9 > src/master/registrar.hpp a70132b0901aedf2c562a561a9d140b58fdfb4a0 > src/master/registrar.cpp 0029cc77628b5bb2a7b1ff551fb42b3eaf7b4fb1 > src/tests/cluster.hpp 250b12fcffd035834817ff6060eb80c2cf3e0246 > src/tests/cluster.cpp 02590a27aadd0583177f21a57ec3d05fe1542f42 > src/tests/mock_registrar.hpp cdcc6990a183e8c4911d3353584ed808ee88dd72 > src/tests/mock_registrar.cpp 8643e4cb06cef44fa0e342df0aa0f00823e1e268 > src/tests/registrar_tests.cpp 5c6fb565ee48378224bab091ccdf9d5d8f3c7614 > > > Diff: https://reviews.apache.org/r/58355/diff/2/ > > > Testing > ------- > > Benchmark > ========= > Ran `MESOS_VERBOSE=1 ./src/mesos-tests --benchmark > --gtest_filter='*Registrar_BENCHMARK_Test.Performance/3'`. > > Before: > ``` > I0411 10:04:11.726016 11802 registrar.cpp:508] Successfully updated the > registry in 89.478144ms > I0411 10:04:13.860827 11803 registrar.cpp:508] Successfully updated the > registry in 216.688896ms > I0411 10:04:15.167768 11803 registrar.cpp:508] Successfully updated the > registry in 1.29364096secs > I0411 10:04:18.967394 11803 registrar.cpp:508] Successfully updated the > registry in 3.696552192secs > I0411 10:04:25.631009 11803 registrar.cpp:508] Successfully updated the > registry in 6.267425024secs > I0411 10:04:42.625507 11803 registrar.cpp:508] Successfully updated the > registry in 15.876419072secs > I0411 10:04:44.209377 11787 registrar_tests.cpp:1262] Admitted 50000 agents > in 30.479743816secs > I0411 10:05:04.446650 11820 registrar.cpp:508] Successfully updated the > registry in 18.338545152secs > I0411 10:05:21.171001 11820 registrar.cpp:508] Successfully updated the > registry in 15.31903872secs > I0411 10:05:37.592319 11820 registrar.cpp:508] Successfully updated the > registry in 14.863101952secs > I0411 10:05:39.099174 11787 registrar_tests.cpp:1276] Marked 50000 agents > reachable in 53.593596102secs > ../../src/tests/registrar_tests.cpp:1287: Failure > Failed to wait 15secs for registry > ``` > After: > ``` > I0411 15:19:12.228904 40643 registrar.cpp:524] Successfully updated the > registry in 91.262208ms > I0411 15:19:14.543190 40660 registrar.cpp:524] Successfully updated the > registry in 377.45408ms > I0411 15:19:15.707006 40660 registrar.cpp:524] Successfully updated the > registry in 1.138724096secs > I0411 15:19:18.267305 40660 registrar.cpp:524] Successfully updated the > registry in 2.466145792secs > I0411 15:19:19.092073 40660 registrar.cpp:524] Successfully updated the > registry in 523.11296ms > I0411 15:19:20.809330 40648 registrar.cpp:524] Successfully updated the > registry in 892.141824ms > I0411 15:19:21.194135 40622 registrar_tests.cpp:1262] Admitted 50000 agents > in 6.938952085secs > I0411 15:19:26.973904 40637 registrar.cpp:524] Successfully updated the > registry in 3.938064128secs > I0411 15:19:28.631865 40637 registrar.cpp:524] Successfully updated the > registry in 1.116326144secs > I0411 15:19:30.222944 40660 registrar.cpp:524] Successfully updated the > registry in 911.86688ms > I0411 15:19:30.678509 40622 registrar_tests.cpp:1276] Marked 50000 agents > reachable in 8.249523305secs > I0411 15:19:35.138797 40645 registrar.cpp:524] Successfully updated the > registry in 815.439104ms > I0411 15:19:41.783651 40622 registrar_tests.cpp:1288] Recovered 50000 agents > (8238915B) in 10.963297677secs > I0411 15:19:47.431670 40657 registrar.cpp:524] Successfully updated the > registry in 3.960920064secs > I0411 15:20:13.769872 40657 registrar.cpp:524] Successfully updated the > registry in 1.169234944secs > I0411 15:21:49.685801 40657 registrar.cpp:524] Successfully updated the > registry in 264.850688ms > Removed 50000 agents in 2.12256788111667mins > ``` > Agents removal seems to be O(n^2) because of linear lookup, hence 2 mins. > > Unit > ==== > Ran `make check`. > > > Thanks, > > Ilya Pronin > >
