-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65303/
-----------------------------------------------------------

Review request for Aurora and Jordan Ly.


Repository: aurora


Description
-------

Use `ArrayDeque` rather than `HashSet` for fetchTasks, and use imperative style rather than functional. I arrived at this result after running benchmarks with some of the other usual suspects (`ArrayList`, `LinkedList`).

This patch also enables stack and heap profilers in jmh (more details [here](http://hg.openjdk.java.net/codetools/jmh/file/25d8b2695bac/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_35_Profilers.java)), providing insight into the heap impact of changes. I started this change with a heap profiler as the primary motivation, and ended up using it to guide this improvement.


Diffs
-----

  build.gradle 64af7ae
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java b59999c


Diff: https://reviews.apache.org/r/65303/diff/1/


Testing
-------

Full benchmark summary for `TaskStoreBenchmarks.MemFetchTasksBenchmark` is at the bottom, but here is an abridged version. It shows that task fetch throughput improves by at least 2x at every task count tested, and heap allocation per operation drops by about 2x or more.

Overall GC time increases slightly as captured here, but the stddev was anecdotally high across runs. I chose to present this output as a caveat and a discussion point.

If you scroll to the full output at the bottom, you will see some more granular allocation data. Please note that the `norm` stats are normalized for the number of operations, which I find to be the most useful measure for validating a change. Quoting the jmh sample link above:

```quote
It is often useful to look into non-normalized counters to see if the test is allocation/GC-bound (figure the allocation pressure "ceiling" for your configuration!), and normalized counters to see the more precise benchmark behavior.
```

Prior to this patch:

```console
Benchmark            (numTasks)         Score         Error  Units
                          10000      1066.632 ±     266.924  ops/s
·gc.alloc.rate.norm       10000    289227.205 ±    8888.051  B/op
·gc.count                 10000        24.000                counts
·gc.time                  10000       103.000                ms
                          50000        84.444 ±      32.620  ops/s
·gc.alloc.rate.norm       50000   3831210.967 ±  840844.713  B/op
·gc.count                 50000        21.000                counts
·gc.time                  50000      1407.000                ms
                         100000        38.645 ±      20.557  ops/s
·gc.alloc.rate.norm      100000  13555430.931 ± 6787344.701  B/op
·gc.count                100000        52.000                counts
·gc.time                 100000      3304.000                ms
```

With this patch:

```console
Benchmark            (numTasks)         Score         Error  Units
                          10000      2851.288 ±     481.472  ops/s
·gc.alloc.rate.norm       10000    145281.908 ±    2223.621  B/op
·gc.count                 10000        39.000                counts
·gc.time                  10000       130.000                ms
                          50000       297.380 ±      35.681  ops/s
·gc.alloc.rate.norm       50000   1183791.866 ±   77487.278  B/op
·gc.count                 50000        25.000                counts
·gc.time                  50000      1821.000                ms
                         100000       122.211 ±      81.618  ops/s
·gc.alloc.rate.norm      100000   4364450.973 ± 2856586.882  B/op
·gc.count                100000        52.000                counts
·gc.time                 100000      3698.000                ms
```

**Full benchmark output**

Prior to this patch:

```console
Benchmark                                                                        (numTasks)   Mode  Cnt         Score         Error  Units
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                        10000  thrpt    5      1066.632 ±     266.924  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                         10000  thrpt    5       286.647 ±      62.371  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                    10000  thrpt    5    289227.205 ±    8888.051  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space                10000  thrpt    5       291.263 ±     159.266  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm           10000  thrpt    5    294277.617 ±  166069.041  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space            10000  thrpt    5         1.218 ±       1.029  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm       10000  thrpt    5      1220.540 ±     708.455  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                              10000  thrpt    5        24.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                               10000  thrpt    5       103.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                 10000  thrpt                NaN                ---
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                        50000  thrpt    5        84.444 ±      32.620  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                         50000  thrpt    5       267.018 ±      27.389  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                    50000  thrpt    5   3831210.967 ±  840844.713  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space                50000  thrpt    5       258.565 ±     149.845  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm           50000  thrpt    5   3707563.530 ± 2262218.319  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen                   50000  thrpt    5         4.487 ±      18.053  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm              50000  thrpt    5     63848.757 ±  264487.651  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space            50000  thrpt    5         6.034 ±       3.651  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm       50000  thrpt    5     87385.381 ±   75159.508  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                              50000  thrpt    5        21.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                               50000  thrpt    5      1407.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                 50000  thrpt                NaN                ---
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                       100000  thrpt    5        38.645 ±      20.557  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                        100000  thrpt    5       381.453 ±      63.491  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                   100000  thrpt    5  13555430.931 ± 6787344.701  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space               100000  thrpt    5       389.816 ±     123.320  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm          100000  thrpt    5  13823571.735 ± 6642604.600  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen                  100000  thrpt    5         1.947 ±      16.766  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm             100000  thrpt    5     92330.241 ±  794991.221  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space           100000  thrpt    5        11.934 ±      18.565  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm      100000  thrpt    5    414896.926 ±  551658.959  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                             100000  thrpt    5        52.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                              100000  thrpt    5      3304.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                100000  thrpt                NaN                ---
```

With this patch:

```console
Benchmark                                                                        (numTasks)   Mode  Cnt         Score         Error  Units
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                        10000  thrpt    5      2851.288 ±     481.472  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                         10000  thrpt    5       384.383 ±      58.697  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                    10000  thrpt    5    145281.908 ±    2223.621  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space                10000  thrpt    5       388.851 ±     114.120  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm           10000  thrpt    5    147171.915 ±   50430.527  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space            10000  thrpt    5         1.264 ±       0.980  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm       10000  thrpt    5       479.848 ±     420.881  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                              10000  thrpt    5        39.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                               10000  thrpt    5       130.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                 10000  thrpt                NaN                ---
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                        50000  thrpt    5       297.380 ±      35.681  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                         50000  thrpt    5       288.839 ±      19.035  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                    50000  thrpt    5   1183791.866 ±   77487.278  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space                50000  thrpt    5       296.587 ±     125.148  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm           50000  thrpt    5   1214497.578 ±  457975.153  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen                   50000  thrpt    5         6.942 ±      23.492  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm              50000  thrpt    5     28880.733 ±   99593.659  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space            50000  thrpt    5         6.440 ±       3.887  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm       50000  thrpt    5     26354.762 ±   14876.857  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                              50000  thrpt    5        25.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                               50000  thrpt    5      1821.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                 50000  thrpt                NaN                ---
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                       100000  thrpt    5       122.211 ±      81.618  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                        100000  thrpt    5       377.099 ±      77.146  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm                   100000  thrpt    5   4364450.973 ± 2856586.882  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space               100000  thrpt    5       381.570 ±     119.260  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm          100000  thrpt    5   4415115.428 ± 3000198.792  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen                  100000  thrpt    5         1.914 ±      16.479  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm             100000  thrpt    5     31833.830 ±  274098.881  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space           100000  thrpt    5        12.117 ±      20.931  MB/sec
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm      100000  thrpt    5    136001.918 ±  196459.666  B/op
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                             100000  thrpt    5        52.000                counts
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                              100000  thrpt    5      3698.000                ms
TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                                100000  thrpt                NaN                ---
```

Thanks,

Bill Farner
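
For readers without the diff at hand, the kind of change summarized in the description (a stream collected into a `HashSet` versus a plain loop appending to an `ArrayDeque`) can be sketched roughly as below. This is only an illustration under assumptions: `FetchSketch`, `fetchFunctional`, and `fetchImperative` are hypothetical names, and the code is not taken from `MemTaskStore.java`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch; names and signatures are assumptions, not Aurora's API.
final class FetchSketch {
  // Functional style: a stream pipeline collected into a HashSet,
  // which hashes every element and allocates a table node per entry.
  static <T> Set<T> fetchFunctional(Collection<T> tasks, Predicate<? super T> filter) {
    return tasks.stream().filter(filter).collect(Collectors.toCollection(HashSet::new));
  }

  // Imperative style: a plain loop appending matches to an ArrayDeque,
  // which stores references in a resizable array without hashing.
  static <T> Collection<T> fetchImperative(Collection<T> tasks, Predicate<? super T> filter) {
    ArrayDeque<T> result = new ArrayDeque<>();
    for (T task : tasks) {
      if (filter.test(task)) {
        result.add(task);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> tasks = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      tasks.add("task-" + i);
    }
    Predicate<String> endsInEvenDigit = t -> (t.charAt(t.length() - 1) - '0') % 2 == 0;
    System.out.println(fetchFunctional(tasks, endsInEvenDigit).size());
    System.out.println(fetchImperative(tasks, endsInEvenDigit).size());
  }
}
```

Note that an `ArrayDeque` neither de-duplicates nor accepts `null` elements, so a swap like this only preserves behavior when the source elements are already unique and non-null.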
