-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68132/
-----------------------------------------------------------

(Updated Aug. 11, 2018, 6:09 p.m.)


Review request for mesos, Benno Evers and Benjamin Mahler.


Bugs: MESOS-9122
    https://issues.apache.org/jira/browse/MESOS-9122


Repository: mesos


Description
-------

With this patch, handlers for '/state' requests are no longer scheduled
directly after authorization; instead, they are accumulated and then
scheduled for later parallel processing.

If there are N '/state' requests in the Master's mailbox and T is the
response time of a single request, this approach blocks the Master
actor only once, for time O(T), instead of for time N*T as it did prior
to this patch.

This batching technique reduces both the time the Master spends
answering '/state' requests and the average request response time
when multiple requests are in the Master's mailbox. However, for
infrequent '/state' requests that don't accumulate in the Master's
mailbox, the response time might increase due to an added trip
through the mailbox.

The change preserves the read-your-writes consistency model.
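
The accumulate-then-process idea described above can be sketched roughly
as follows. This is only an illustration, not the code from the patch:
`State`, `StateRequestBatcher`, `enqueue` and `processBatch` are
hypothetical names, and `std::async` stands in for whatever mechanism the
Master actually uses to run handlers in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Master's state; not the real Mesos type.
struct State {
  int agents = 0;
};

// Hypothetical batcher; the names here do not come from the patch.
class StateRequestBatcher {
public:
  using Handler = std::function<std::string(const State&)>;

  // Called once a request has been authorized. Instead of running the
  // handler right away, store it and hand back a future for the response.
  std::future<std::string> enqueue(Handler handler) {
    pending_.push_back(
        Request{std::move(handler), std::promise<std::string>()});
    return pending_.back().response.get_future();
  }

  // Called once per batch. Runs all accumulated handlers in parallel and
  // waits for them, so the caller is blocked once for roughly the time of
  // the slowest handler rather than for the sum of all handler times.
  // Returns the number of requests processed.
  std::size_t processBatch(const State& state) {
    std::vector<Request> batch;
    batch.swap(pending_);

    std::vector<std::future<void>> workers;
    for (Request& request : batch) {
      // The state is passed by const reference and is not modified while
      // the batch runs, so handlers can read it concurrently.
      workers.push_back(std::async(std::launch::async, [&request, &state] {
        try {
          request.response.set_value(request.handler(state));
        } catch (...) {
          request.response.set_exception(std::current_exception());
        }
      }));
    }

    // Wait for every handler before returning control to the caller.
    for (std::future<void>& worker : workers) {
      worker.get();
    }
    return batch.size();
  }

private:
  struct Request {
    Handler handler;
    std::promise<std::string> response;
  };

  std::vector<Request> pending_;
};
```

In this sketch a caller would `enqueue` one handler per authorized
request and invoke `processBatch` a single time for everything that has
accumulated; each returned future then yields that request's response.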


Diffs
-----

  src/master/http.cpp d43fbd689598612ec5946b46e2fa5e7f5e22cfa8 
  src/master/master.hpp 209b998db8d2bad7a3812df44f0939458f48eb11 


Diff: https://reviews.apache.org/r/68132/diff/2/


Testing
-------

`make check` on Mac OS 10.13.5 and various Linux distros.

Ran the `MasterStateQueryLoad_BENCHMARK_Test.v0State` and
`MasterStateQuery_BENCHMARK_Test.GetState` benchmarks; see results below.

**Setup**
Processor: Intel i7-4980HQ 2.8 GHz with 6 MB on-chip L3 cache and 128 MB L4 cache (Crystalwell)
Total Number of Cores: 4
Total Number of Threads: 8
L2 Cache (per Core): 256 KB  

Compiler: Apple LLVM version 9.1.0 (clang-902.0.39.2)
Optimization: -O2

**MasterStateQuery_BENCHMARK_Test.GetState, v0 '/state' response time**

setup                                                    | no batching | batching
---------------------------------------------------------|-------------|----------
 1000 agents,  10000 running, and  10000 completed tasks | 146.496ms   | 158.319ms
10000 agents, 100000 running, and 100000 completed tasks | 1.795s      | 1.899s
20000 agents, 200000 running, and 200000 completed tasks | 3.742s      | 4.427s
40000 agents, 400000 running, and 400000 completed tasks | 10.946s     | 11.096s

**MasterStateQueryLoad_BENCHMARK_Test.v0State, setup 1**
Test setup 1: 100 agents with a total of 10000 running tasks and 10000
completed tasks; 50 '/state' and 50 '/flags' requests are sent in parallel
at 200ms intervals, i.e., **50 measurements** in total per endpoint.

/flags | no batching | batching       /state | no batching | batching
-------------------------------   *   --------------------------------
   min |  1.598ms    | 1.475ms           min | 100.627ms   | 105.383ms
   p25 |  2.370ms    | 2.452ms           p25 | 102.206ms   | 107.184ms
   p50 |  2.520ms    | 2.562ms           p50 | 103.213ms   | 108.468ms
   p75 |  2.623ms    | 2.665ms           p75 | 104.100ms   | 109.808ms
   p90 |  2.803ms    | 2.731ms           p90 | 106.079ms   | 111.043ms
   max | 84.957ms    | 2.934ms           max | 153.438ms   | 154.636ms

**MasterStateQueryLoad_BENCHMARK_Test.v0State, setup 2**
Test setup 2: 1000 agents with a total of 100000 running tasks and 100000
completed tasks; 10 '/state' and 10 '/flags' requests are sent in parallel
at 200ms intervals, i.e., **10 measurements** in total per endpoint.

/flags | no batching | batching       /state | no batching | batching
--------------------------------  *   -------------------------------
   min | 2.309ms     |   1.579ms         min | 1.512s      | 2.820s
   p25 | 1.547s      | 373.609ms         p25 | 3.262s      | 3.588s
   p50 | 3.189s      | 831.261ms         p50 | 5.052s      | 4.253s
   p75 | 5.346s      |   2.215s          p75 | 6.846s      | 4.510s
   p90 | 5.854s      |   2.351s          p90 | 7.883s      | 4.705s
   max | 7.237s      |   2.444s          max | 8.517s      | 4.934s


Thanks,

Alexander Rukletsov
