Re: Review Request 67871: Optimized the generation of metrics snapshots.

Greg Mann Tue, 17 Jul 2018 13:37:42 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67871/
-----------------------------------------------------------


(Updated July 17, 2018, 8:36 p.m.)


Review request for mesos, Benjamin Mahler, Gastón Kleiman, and James Peach.


Bugs: MESOS-9072
    https://issues.apache.org/jira/browse/MESOS-9072


Repository: mesos


Description
-------

Profiling of metrics generation revealed a large amount of time spent
in map operations. This patch does three things to mitigate this:

 * Stores the metrics in an ordered map so that we only pay the price
   of sorting when a metric is first added.
 * Makes use of vectors instead of maps for intermediate objects,
   which eliminates the need for another intermediate object.
 * Hints when inserting into the returned map, reducing the cost of
   insertion into that ordered container.
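
As an illustration of the third point, here is a minimal sketch of hinted
insertion. It is not the code from this patch. It assumes the entries
arrive already sorted by key, as they would when iterating an ordered
map. The helper name `buildSnapshot` and the key/value types are
hypothetical. With `std::map::emplace_hint`, passing `end()` as the hint
makes each insertion amortized constant time when the new key sorts after
every key already in the map.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not taken from the patch: builds an ordered map
// from key/value pairs that are assumed to be sorted by key already.
std::map<std::string, double> buildSnapshot(
    const std::vector<std::pair<std::string, double>>& sorted)
{
  std::map<std::string, double> result;

  for (const auto& entry : sorted) {
    // Hinting with end() avoids the O(log n) search when the new key
    // belongs at the end of the map. If the hint is wrong, the insertion
    // still succeeds, just without the speedup.
    result.emplace_hint(result.end(), entry.first, entry.second);
  }

  return result;
}
```

If the input is not sorted, the result is still a correctly ordered map;
only the performance benefit of the hint is lost.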


Diffs
-----

  3rdparty/libprocess/include/process/metrics/metrics.hpp f9b72029b2c85826c91b1d7656b0af94dc87010c
  3rdparty/libprocess/src/metrics/metrics.cpp 4883c9acaa0cc568e27944661a8208f7b2a356a1


Diff: https://reviews.apache.org/r/67871/diff/5/


Testing
-------

WITH per-framework metrics, BEFORE optimizations:
```
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
Test setup: 1 agents with a total of 105 frameworks
unversioned /metrics/snapshot' response took 157.1449ms
v1 'master::call::GetMetrics' application/x-protobuf response took 152.599692ms
v1 'master::call::GetMetrics' application/json response took 198.918334ms
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0 (835 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
Test setup: 1 agents with a total of 1005 frameworks
unversioned /metrics/snapshot' response took 1.319444199secs
v1 'master::call::GetMetrics' application/x-protobuf response took 1.257644596secs
v1 'master::call::GetMetrics' application/json response took 1.527225235secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1 (6553 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
Test setup: 1 agents with a total of 10005 frameworks
unversioned /metrics/snapshot' response took 15.479365874secs
v1 'master::call::GetMetrics' application/x-protobuf response took 14.542866983secs
v1 'master::call::GetMetrics' application/json response took 18.05492789secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2 (75455 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
Test setup: 1 agents with a total of 20005 frameworks
unversioned /metrics/snapshot' response took 31.908301664secs
v1 'master::call::GetMetrics' application/x-protobuf response took 32.128689785secs
v1 'master::call::GetMetrics' application/json response took 33.669376185secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3 (150440 ms)
```

WITH per-framework metrics, AFTER optimizations:
```
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0
Test setup: 1 agents with a total of 105 frameworks
unversioned /metrics/snapshot' response took 104.577895ms
v1 'master::call::GetMetrics' application/x-protobuf response took 74.262533ms
v1 'master::call::GetMetrics' application/json response took 100.218618ms
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/0 (562 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1
Test setup: 1 agents with a total of 1005 frameworks
unversioned /metrics/snapshot' response took 921.175877ms
v1 'master::call::GetMetrics' application/x-protobuf response took 780.277639ms
v1 'master::call::GetMetrics' application/json response took 1.168651111secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/1 (5424 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2
Test setup: 1 agents with a total of 10005 frameworks
unversioned /metrics/snapshot' response took 10.2413387secs
v1 'master::call::GetMetrics' application/x-protobuf response took 9.407992945secs
v1 'master::call::GetMetrics' application/json response took 10.582584848secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/2 (57206 ms)
[ RUN      ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3
Test setup: 1 agents with a total of 20005 frameworks
unversioned /metrics/snapshot' response took 19.930542079secs
v1 'master::call::GetMetrics' application/x-protobuf response took 20.318739763secs
v1 'master::call::GetMetrics' application/json response took 22.853630899secs
[       OK ] AgentFrameworkTaskCountContentType/MasterMetricsQuery_BENCHMARK_Test.GetMetrics/3 (116363 ms)
```


Thanks,

Greg Mann
