> On May 23, 2017, 4:56 p.m., James Peach wrote: > > src/master/allocator/sorter/drf/sorter.hpp > > Lines 316 (patched) > > <https://reviews.apache.org/r/59355/diff/3/?file=1727589#file1727589line316> > > > > Did you consider keeping separate vectors for active and inactive > > children? That would avoid the copying due to the insert() here, and I > > think it would simplify the sorting code later on.
Yeah, I considered that -- it is definitely a bit awkward to maintain the ordering invariant over a single `children` vector. The downside to keeping two vectors is that the code for inserting, removing, and updating clients becomes more complicated, because we need to search two vectors, check for duplicates in both vectors, etc. My initial preference was for a single vector, but you could definitely argue for two vectors instead. - Neil ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59355/#review175812 ----------------------------------------------------------- On May 23, 2017, 6:30 a.m., Neil Conway wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59355/ > ----------------------------------------------------------- > > (Updated May 23, 2017, 6:30 a.m.) > > > Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and > Jiang Yan Xu. > > > Bugs: MESOS-7521 > https://issues.apache.org/jira/browse/MESOS-7521 > > > Repository: mesos > > > Description > ------- > > Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the > same data structure used to store active clients. This resulted in a > significant performance regression when the vast majority of sorter > clients are inactive: when sorting clients and producing the sorted > order, we iterate over ALL clients (active and inactive), which could be > much slower than the old active-only implementation. > > This commit revises the sorter to ensure that inactive leaf nodes are > always stored at the end of their parent's list of child nodes. This > allows the sorter to stop early (at the first inactive leaf) when > iterating over a node's children, if we're only interested in applying > an operation to each active leaf or internal node. This change fixes the > observed performance regression relative to Mesos 1.2.0. > > > Diffs > ----- > > src/master/allocator/sorter/drf/sorter.hpp > fee58d6d1f08163e2a06a4a20c891fe535c3dcff > src/master/allocator/sorter/drf/sorter.cpp > 26b77f578f3235a8792c72d4575d607cdb2c7de7 > > > Diff: https://reviews.apache.org/r/59355/diff/3/ > > > Testing > ------- > > Initial perf testing: > > MESOS 1.2.0: > =================== > ``` > [ RUN ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > Using 5000 agents and 6000 frameworks > Added 6000 frameworks in 90.61248ms > Added 5000 agents in 38.788639509secs > allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > [ OK ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > (48234 ms) > [----------] 1 test from > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total) > ``` > > MESOS in master branch: > =================== > ``` > [ RUN ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > Using 5000 agents and 6000 frameworks > Added 6000 frameworks in 295.603058ms > Added 5000 agents in 16.676030553secs > allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > [ OK ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > (102353 ms) > [----------] 1 test from > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total) > ``` > > MESOS in master branch + this patch: > =================== > ``` > [ RUN ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > Using 5000 agents and 6000 frameworks > Added 6000 frameworks in 325.582494ms > Added 5000 agents in 16.311040922secs > allocate() took 720.776578ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 822.521925ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 846.866828ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 857.232458ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > allocate() took 880.696907ms to make 5000 offers with 5940 out of 6000 > frameworks suppressing offers > [ OK ] > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 > (24664 ms) > ``` > > > Thanks, > > Neil Conway > >
