[
https://issues.apache.org/jira/browse/STORM-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952050#comment-14952050
]
ASF GitHub Bot commented on STORM-855:
--------------------------------------
Github user revans2 commented on the pull request:
https://github.com/apache/storm/pull/765#issuecomment-147124664
I have some new test results. I did a comparison of several different
branches. I looked at this branch, the upgraded-disruptor branch #750,
STORM-855 #694, and apache-master 0.11.0-SNAPSHOT
(04cf3f6162ce6fdd1ec13b758222d889dafd5749). I had to make a few modifications
to get my test to work. I applied the following patch
https://gist.github.com/revans2/84301ef0fde0dc4fbe44 to each of the branches.
For STORM-855 I had to modify the test a bit so it would optionally do
batching. In that case batching was enabled on all streams and all spouts and
bolts.
I then ran the test at various throughputs (100, 200, 400, 800, 1600, 3200,
6400, 10000, 12800, 25600, and possibly a few others when looking for the point
where a branch hit its maximum throughput) and at different batch sizes.
Each test ran for 5 minutes. Here are the results of that test, excluding the
tests where the worker could not keep up with the rate.
| 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
|---|---|---|---|---|---|---|
| 2,613,247 | 4,673,535 | 100 | STORM-855-0 | 2,006,347.25 | 1.26 | 2,675,778.36 |
| 2,617,343 | 4,423,679 | 200 | STORM-855-0 | 1,991,238.45 | 1.29 | 2,024,687.45 |
| 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
| 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
| 2,635,775 | 8,560,639 | 800 | STORM-855-0 | 2,010,286.65 | 1.35 | 2,134,795.12 |
| 2,654,207 | 302,252,031 | 3200 | STORM-855-0 | 2,942,360.75 | 2.13 | 16,676,136.60 |
| 2,684,927 | 124,190,719 | 3200 | batch-v2-1 | 2,154,234.45 | 1.41 | 6,219,057.66 |
| 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
| 2,715,647 | 7,356,415 | 100 | storm-base-1 | 2,092,991.53 | 1.30 | 2,447,956.21 |
| 2,723,839 | 4,587,519 | 400 | storm-base-1 | 2,082,835.21 | 1.31 | 1,978,424.49 |
| 2,723,839 | 6,049,791 | 100 | dist-upgrade-1 | 2,091,407.68 | 1.31 | 2,222,977.89 |
| 2,725,887 | 10,403,839 | 1600 | batch-v2-1 | 2,010,694.30 | 1.27 | 2,095,223.90 |
| 2,725,887 | 4,607,999 | 200 | storm-base-1 | 2,074,784.50 | 1.30 | 1,951,564.93 |
| 2,727,935 | 4,513,791 | 200 | dist-upgrade-1 | 2,082,025.31 | 1.33 | 2,057,591.08 |
| 2,729,983 | 4,182,015 | 400 | dist-upgrade-1 | 2,056,282.29 | 1.43 | 862,428.67 |
| 2,732,031 | 4,632,575 | 800 | storm-base-1 | 2,092,514.39 | 1.27 | 2,231,550.66 |
| 2,734,079 | 4,472,831 | 800 | dist-upgrade-1 | 2,095,994.08 | 1.28 | 1,870,953.62 |
| 2,740,223 | 4,192,255 | 200 | batch-v2-1 | 2,011,025.19 | 1.21 | 911,556.19 |
| 2,742,271 | 4,726,783 | 1600 | storm-base-1 | 2,089,581.40 | 1.35 | 2,410,668.79 |
| 2,748,415 | 4,444,159 | 400 | batch-v2-1 | 2,055,600.78 | 1.34 | 1,729,257.92 |
| 2,748,415 | 4,575,231 | 100 | batch-v2-1 | 2,035,920.21 | 1.31 | 1,213,874.52 |
| 2,754,559 | 16,875,519 | 1600 | dist-upgrade-1 | 2,098,441.13 | 1.35 | 2,279,870.41 |
| 2,754,559 | 3,969,023 | 800 | batch-v2-1 | 2,026,222.88 | 1.29 | 767,491.71 |
| 2,793,471 | 53,477,375 | 3200 | storm-base-1 | 2,147,360.05 | 1.42 | 3,668,366.37 |
| 2,801,663 | 147,062,783 | 3200 | dist-upgrade-1 | 2,358,863.31 | 1.59 | 7,574,577.81 |
| 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
| 13,369,343 | 15,122,431 | 3200 | batch-v2-100 | 10,699,832.23 | 10.02 | 1,623,949.38 |
| 13,418,495 | 15,392,767 | 800 | batch-v2-100 | 10,589,813.17 | 9.86 | 2,439,134.80 |
| 13,426,687 | 14,680,063 | 400 | batch-v2-100 | 10,738,973.68 | 10.03 | 2,298,229.99 |
| 13,484,031 | 14,368,767 | 200 | batch-v2-100 | 10,941,653.28 | 10.20 | 2,471,899.43 |
| 13,508,607 | 14,262,271 | 100 | batch-v2-100 | 11,099,257.68 | 10.35 | 1,658,054.66 |
| 13,524,991 | 14,376,959 | 1600 | batch-v2-100 | 10,723,471.83 | 10.00 | 1,477,621.07 |
| 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
| 710,934,527 | 827,326,463 | 4000 | STORM-855-100 | 351,305,653.90 | 339.28 | 141,283,307.30 |
| 783,286,271 | 1,268,776,959 | 5000 | STORM-855-100 | 332,417,358.65 | 312.07 | 139,760,316.82 |
| 888,668,159 | 1,022,361,599 | 3200 | STORM-855-100 | 445,646,342.60 | 431.55 | 179,065,279.65 |
| 940,048,383 | 1,363,148,799 | 6400 | storm-base-1 | 20,225,300.17 | 17.17 | 134,848,974.52 |
| 1,043,333,119 | 1,409,286,143 | 10000 | batch-v2-1 | 22,750,840.18 | 6.13 | 146,235,076.73 |
| 1,209,008,127 | 1,786,773,503 | 6400 | dist-upgrade-1 | 28,588,397.01 | 24.70 | 181,801,409.69 |
| 1,747,976,191 | 1,946,157,055 | 1600 | STORM-855-100 | 738,741,774.85 | 734.75 | 374,194,675.56 |
| 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
| 3,374,317,567 | 3,892,314,111 | 10000 | dist-upgrade-1 | 141,866,760.39 | 69.39 | 589,014,777.73 |
| 3,447,717,887 | 3,869,245,439 | 10000 | storm-base-1 | 139,149,514.03 | 56.45 | 609,509,456.98 |
| 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
| 3,512,729,599 | 3,898,605,567 | 800 | STORM-855-100 | 1,354,193,514.47 | 1,361.58 | 779,667,263.64 |
| 3,963,617,279 | 4,416,602,111 | 5500 | STORM-855-100 | 450,364,286.22 | 415.96 | 575,017,536.40 |
| 4,185,915,391 | 5,347,737,599 | 4500 | STORM-855-0 | 366,268,233.66 | 259.94 | 995,928,429.75 |
| 4,919,918,591 | 5,582,618,623 | 6000 | STORM-855-100 | 534,520,242.96 | 497.47 | 758,754,139.61 |
| 7,071,596,543 | 7,843,348,479 | 400 | STORM-855-100 | 2,652,137,010.52 | 2,630.51 | 1,589,666,333.78 |
| 14,159,970,303 | 15,653,142,527 | 200 | STORM-855-100 | 5,202,877,719.25 | 5,206.33 | 3,199,275,795.66 |
| 27,648,851,967 | 31,205,621,759 | 100 | STORM-855-100 | 10,201,124,134.76 | 10,169.37 | 6,289,786,882.10 |
I then filtered the list to show the maximum throughput for a given latency,
using several different latency measures.
99th percentile:
| 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
|---|---|---|---|---|---|---|
| 2,613,247 | 4,673,535 | 100 | STORM-855-0 | 2,006,347.25 | 1.26 | 2,675,778.36 |
| 2,617,343 | 4,423,679 | 200 | STORM-855-0 | 1,991,238.45 | 1.29 | 2,024,687.45 |
| 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
| 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
| 2,654,207 | 302,252,031 | 3200 | STORM-855-0 | 2,942,360.75 | 2.13 | 16,676,136.60 |
| 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
| 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
| 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
| 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
| 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
99.9th percentile:
| 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
|---|---|---|---|---|---|---|
| 2,754,559 | 3,969,023 | 800 | batch-v2-1 | 2,026,222.88 | 1.29 | 767,491.71 |
| 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
| 13,369,343 | 15,122,431 | 3200 | batch-v2-100 | 10,699,832.23 | 10.02 | 1,623,949.38 |
| 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
| 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
| 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
| 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
mean latency:
| 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
|---|---|---|---|---|---|---|
| 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
| 2,793,471 | 53,477,375 | 3200 | storm-base-1 | 2,147,360.05 | 1.42 | 3,668,366.37 |
| 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
| 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
| 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
| 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
| 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
service latency ms (Storm's complete latency):
| 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
|---|---|---|---|---|---|---|
| 2,740,223 | 4,192,255 | 200 | batch-v2-1 | 2,011,025.19 | 1.21 | 911,556.19 |
| 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
| 2,725,887 | 10,403,839 | 1600 | batch-v2-1 | 2,010,694.30 | 1.27 | 2,095,223.90 |
| 2,684,927 | 124,190,719 | 3200 | batch-v2-1 | 2,154,234.45 | 1.41 | 6,219,057.66 |
| 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
| 1,043,333,119 | 1,409,286,143 | 10000 | batch-v2-1 | 22,750,840.18 | 6.13 | 146,235,076.73 |
| 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
| 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
| 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
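The filtering behind the tables above can be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes a run is kept only when no run with lower latency already reached the same or a higher throughput. The `Run` type and `max_throughput_frontier` function are names made up for this example.

```python
from typing import NamedTuple


class Run(NamedTuple):
    branch: str        # branch-batch label, e.g. "batch-v2-100"
    throughput: int    # target rate used for the run
    latency_ns: float  # whichever latency column is being filtered on


def max_throughput_frontier(runs):
    """Keep a run only if it beats the throughput of every lower-latency run."""
    frontier = []
    best = 0
    # Walk the runs from lowest to highest latency; on a latency tie,
    # look at the higher-throughput run first.
    for run in sorted(runs, key=lambda r: (r.latency_ns, -r.throughput)):
        if run.throughput > best:
            frontier.append(run)
            best = run.throughput
    return frontier


# A few 99%-ile rows copied from the full results table.
runs = [
    Run("STORM-855-0", 1600, 2_627_583),
    Run("STORM-855-0", 800, 2_635_775),
    Run("STORM-855-0", 3200, 2_654_207),
    Run("batch-v2-1", 3200, 2_684_927),
    Run("batch-v2-1", 5000, 2_701_311),
    Run("storm-base-1", 100, 2_715_647),
]

for run in max_throughput_frontier(runs):
    print(run.throughput, run.branch, run.latency_ns)
```

On this sample the 800, second 3200, and 100 rows are dropped, which matches the corresponding stretch of the 99th percentile table.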
I also looked at approximately the maximum throughput each branch could handle.
| branch-batch | throughput | mean latency ns | 99%-ile latency ns |
|---|---|---|---|
| STORM-855-0 | 4,500 | 366,268,233.66 | 4,185,915,391 |
| STORM-855-100 | 5,500 | 450,364,286.22 | 3,963,617,279 |
| storm-base-1 | 10,000 | 139,149,514.03 | 3,447,717,887 |
| dist-upgrade-1 | 10,000 | 141,866,760.39 | 3,374,317,567 |
| batch-v2-1 | 10,000 | 22,750,840.18 | 1,043,333,119 |
| batch-v2-100 | 22,000 | 274,785,584.11 | 3,456,106,495 |
I would really like some feedback here, because these numbers seem to
contradict the STORM-855 results from my original speed-of-light test. I don't
really like that test, even though I wrote it, because the throughput is limited
only by Storm. With acking disabled it measures what the latency is when we
hit the wall and cannot provide any more throughput. No one should run in
production that way. When acking is enabled and we are using max spout pending
for flow control, the throughput is directly related to the end-to-end latency.
This too shouldn't be the common case in production, because it means we cannot
keep up with the incoming rate and are falling behind.
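The relationship between max spout pending and latency can be made concrete with Little's law (tuples in flight = throughput x latency). The sketch below is only an approximation under that assumption; the function name and the pending value of 500 are hypothetical and are not settings from the test above.

```python
def max_throughput_per_sec(max_spout_pending, complete_latency_ms, spout_tasks=1):
    """Upper bound on acked tuples/s when the pending limit is the bottleneck.

    Little's law: in_flight = throughput * latency, so
    throughput = in_flight / latency.
    """
    in_flight = max_spout_pending * spout_tasks
    return in_flight / (complete_latency_ms / 1000.0)


# Hypothetical setting: 500 tuples pending on a single spout task.
for latency_ms in (1.3, 10.0, 50.0):
    bound = max_throughput_per_sec(500, latency_ms)
    print(f"{latency_ms} ms complete latency -> at most {bound:.0f} tuples/s")
```

With a fixed pending limit, any increase in complete latency lowers the achievable throughput proportionally, which is why a test run in that mode measures latency and throughput as one coupled quantity.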
This seems to indicate that the only time STORM-855 makes sense is when
looking at the 99%-ile latency at a very low throughput, and even then it only
seems to save about 1/20th of a ms over the others. In other cases it looks like
the throughput per host it can support is about 1/2 of that without the change.
This branch, however, has a weakness on the low end: when batching is enabled it
is about 12 ms slower. On the high end it can handle more than 2x the
throughput with little change to the latency. If that 12 ms is important, I
think we can mitigate it by allowing the batch size to self-adjust on a
per-queue basis.
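One possible shape for such self-adjustment, purely as a hypothetical sketch and not something this branch implements: pick the largest batch that is expected to fill within a latency budget at the queue's observed arrival rate. The function name, the budget, and the cap of 100 are all assumptions made for illustration.

```python
def pick_batch_size(arrival_rate_per_sec, wait_budget_ms, max_batch=100):
    """Largest batch expected to fill within the budget, clamped to [1, max_batch]."""
    expected_arrivals = int(arrival_rate_per_sec * wait_budget_ms / 1000.0)
    return max(1, min(max_batch, expected_arrivals))


# Hypothetical 5 ms budget, evaluated per queue from its observed rate.
for rate in (100, 6400, 20000):
    print(rate, "tuples/s ->", "batch size", pick_batch_size(rate, 5.0))
```

Under a policy like this, a slow queue falls back to a batch size of 1 and avoids waiting for a batch to fill, while a busy queue still batches up to the cap.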
I would really like others to look at my numbers and my test to see if
there are issues that I am missing because, as I said, they seem to
contradict the numbers from STORM-855. The only thing I can think of is that
the messaging layer is the bottleneck in the speed-of-light test, which is what
it was intended to stress test, and STORM-855 is giving a significant batching
advantage there. If that is the case, then we should look at what STORM-855 is
doing around that and try to combine it with the batching we are doing here.
@ptgoetz @d2r @rfarivar @mjsax @kishorvpatil @knusbaum please let me know
what you think.
> Add tuple batching
> ------------------
>
> Key: STORM-855
> URL: https://issues.apache.org/jira/browse/STORM-855
> Project: Apache Storm
> Issue Type: New Feature
> Components: storm-core
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Minor
>
> In order to increase Storm's throughput, multiple tuples can be grouped
> together in a batch of tuples (i.e., a fat tuple) and transferred from producer
> to consumer at once.
> The initial idea is taken from https://github.com/mjsax/aeolus. However, we
> aim to integrate this feature deep into the system (in contrast to building
> it on top), which has multiple advantages:
> - batching can be even more transparent to the user (eg, no extra
> direct-streams needed to mimic Storm's data distribution patterns)
> - fault-tolerance (anchoring/acking) can be done on a tuple granularity
> (not on a batch granularity, which leads to many more replayed tuples -- and
> result duplicates -- in case of failure)
> The aim is to extend the TopologyBuilder interface with an additional parameter
> 'batch_size' to expose this feature to the user. By default, batching will
> be disabled.
> This batching feature is purely a tuple transport mechanism, i.e., tuple-by-tuple
> processing semantics are preserved. An output batch is assembled at the
> producer and completely disassembled at the consumer. The consumer output can
> be batched again, however, independent of batched or non-batched input. Thus,
> batches can be of different size for each producer-consumer pair.
> Furthermore, consumers can receive batches of different size from different
> producers (including regular non batched input).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)