[ https://issues.apache.org/jira/browse/STORM-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711675#comment-14711675 ]
ASF GitHub Bot commented on STORM-855:
--------------------------------------
Github user mjsax commented on the pull request:
https://github.com/apache/storm/pull/694#issuecomment-134682336
Here are some additional benchmark results with larger `--messageSize` values (i.e.,
100 and 250). These benchmarks were run on a 12-node cluster (with
`nimbus.thrift.max_buffer_size: 33554432` and `worker.childopts: "-Xmx2g"`) as
follows:
`storm jar target/storm_perf_test-1.0.0-SNAPSHOT-jar-with-dependencies.jar com.yahoo.storm.perftest.Main --name test -l 1 -n 1 --messageSize 100 --workers 24 --spout 1 --bolt 10 --testTimeSec 300`
As you can observe, the output rate is slightly lower for the larger tuple
sizes in all cases (which is expected due to higher serialization costs). The
batching case with batch size 100 and `--messageSize 250` does not improve the
output rate as significantly as before, because it hits the underlying 1 Gbit
Ethernet limit. It starts out promising, but after about 2 minutes performance
drops. My assumption: because tuples cannot be transferred over the network
fast enough, they get buffered in main memory until the buffers reach their
limit, and Storm then throttles the spout. I am not sure, though, why throughput
keeps decreasing from report to report. Any ideas?
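For reference, a back-of-envelope check of the Ethernet-limit claim (my own numbers, not part of the benchmark output): the reported "throughput (MB/s)" values match transferred × messageSize / 30 s / 2^20, i.e., they appear to be MiB/s of tuple payload (e.g., 15,851,220 tuples × 250 B / 30 s ≈ 126 MiB/s, the peak of the batch-100 run below). A 1 Gbit/s link carries at most 125 MB/s ≈ 119 MiB/s of payload before framing and serialization overhead, so that peak sits essentially at wire speed.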
`--messageSize 100`
**master branch**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440523358523 0 0 0.0
WAITING 1 144 11 11 11 1440523388523 30000 6336040 20.14172871907552
WAITING 1 144 11 11 11 1440523418523 30000 10554200 33.55089823404948
RUNNING 1 144 11 11 11 1440523448523 30000 10005840 31.807708740234375
RUNNING 1 144 11 11 11 1440523478523 30000 10514120 33.4234873453776
RUNNING 1 144 11 11 11 1440523508523 30000 10430700 33.158302307128906
RUNNING 1 144 11 11 11 1440523538523 30000 7608960 24.188232421875
RUNNING 1 144 11 11 11 1440523568523 30000 10279460 32.67752329508463
RUNNING 1 144 11 11 11 1440523598524 30001 10496260 33.36559974774929
RUNNING 1 144 11 11 11 1440523628523 29999 10382860 33.00732328691556
RUNNING 1 144 11 11 11 1440523658523 30000 10138280 32.22872416178385
RUNNING 1 144 11 11 11 1440523688523 30000 10072940 32.02101389567057
RUNNING 1 144 11 11 11 1440523718523 30000 10095820 32.09374745686849
```
**batching branch (no batching)**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440520763917 0 0 0.0
WAITING 1 144 11 11 11 1440520793917 30000 4467900 14.203071594238281
WAITING 1 144 11 11 11 1440520823917 30000 10616160 33.74786376953125
RUNNING 1 144 11 11 11 1440520853917 30000 10473700 33.294995625813804
RUNNING 1 144 11 11 11 1440520883917 30000 10556860 33.55935414632162
RUNNING 1 144 11 11 11 1440520913917 30000 10580760 33.63533020019531
RUNNING 1 144 11 11 11 1440520943917 30000 10367580 32.95764923095703
RUNNING 1 144 11 11 11 1440520973917 30000 10646760 33.84513854980469
RUNNING 1 144 11 11 11 1440521003917 30000 10750300 34.17428334554037
RUNNING 1 144 11 11 11 1440521033917 30000 10607220 33.719444274902344
RUNNING 1 144 11 11 11 1440521063917 30000 10456920 33.24165344238281
RUNNING 1 144 11 11 11 1440521093917 30000 10108000 32.132466634114586
RUNNING 1 144 11 11 11 1440521123917 30000 10576120 33.6205800374349
```
**batching branch (batch size: 100)**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440521937049 0 0 0.0
WAITING 1 144 11 11 11 1440521967049 30000 11346480 36.069488525390625
WAITING 1 144 11 11 11 1440521997050 30001 17333500 55.0998758822297
RUNNING 1 144 11 11 11 1440522027049 29999 17815260 56.635074176137906
RUNNING 1 144 11 11 11 1440522057049 30000 17993660 57.200304667154946
RUNNING 1 144 11 11 11 1440522087049 30000 17720880 56.333160400390625
RUNNING 1 144 11 11 11 1440522117050 30001 17957200 57.08249869861108
RUNNING 1 144 11 11 11 1440522147049 29999 18286500 58.13315572840058
RUNNING 1 144 11 11 11 1440522177050 30001 18027820 57.306986149778076
RUNNING 1 144 11 11 11 1440522207049 29999 12470520 39.64403692199896
RUNNING 1 144 11 11 11 1440522237049 30000 17256760 54.8577626546224
RUNNING 1 144 11 11 11 1440522267049 30000 17288300 54.95802561442057
RUNNING 1 144 11 11 11 1440522297049 30000 17077300 54.28727467854818
```
`--messageSize 250`
**master branch**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440523772498 0 0 0.0
WAITING 1 144 11 11 11 1440523802498 30000 6047420 48.06057612101237
WAITING 1 144 11 11 11 1440523832498 30000 9909100 78.7504514058431
RUNNING 1 144 11 11 11 1440523862498 30000 9846400 78.25215657552083
RUNNING 1 144 11 11 11 1440523892498 30000 7100940 56.43320083618164
RUNNING 1 144 11 11 11 1440523922498 30000 9870500 78.4436861673991
RUNNING 1 144 11 11 11 1440523952498 30000 9856460 78.33210627237956
RUNNING 1 144 11 11 11 1440523982498 30000 9662740 76.79255803426106
RUNNING 1 144 11 11 11 1440524012498 30000 10060860 79.9565315246582
RUNNING 1 144 11 11 11 1440524042499 30001 10049660 79.86485975980162
RUNNING 1 144 11 11 11 1440524072498 29999 9937680 78.98021751278428
RUNNING 1 144 11 11 11 1440524102498 30000 10061600 79.96241251627605
RUNNING 1 144 11 11 11 1440524132498 30000 9804960 77.92282104492188
```
**batching branch (no batching)**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440521276880 0 0 0.0
WAITING 1 144 11 11 11 1440521306880 30000 4983420 39.60466384887695
WAITING 1 144 11 11 11 1440521336880 30000 9760600 77.57027943929036
RUNNING 1 144 11 11 11 1440521366881 30001 9344540 74.26125626338171
RUNNING 1 144 11 11 11 1440521396880 29999 9323920 74.10232867951068
RUNNING 1 144 11 11 11 1440521426880 30000 9354460 74.3425687154134
RUNNING 1 144 11 11 11 1440521456880 30000 9601440 76.30538940429688
RUNNING 1 144 11 11 11 1440521486880 30000 9476900 75.3156344095866
RUNNING 1 144 11 11 11 1440521516880 30000 9581560 76.14739735921223
RUNNING 1 144 11 11 11 1440521546881 30001 9494500 75.4529915429414
RUNNING 1 144 11 11 11 1440521576880 29999 9260020 73.59448017774095
RUNNING 1 144 11 11 11 1440521606880 30000 9054000 71.95472717285156
RUNNING 1 144 11 11 11 1440521636880 30000 9278340 73.73762130737305
RUNNING 1 144 11 11 11 1440521666880 30000 9153180 72.74293899536133
```
**batching branch (batch size: 100)**
```
status topologies totalSlots slotsUsed totalExecutors executorsWithMetrics time time-diff(ms) transferred throughput(MB/s)
WAITING 1 144 0 11 0 1440522380492 0 0 0.0
WAITING 1 144 11 11 11 1440522410492 30000 10616760 84.37442779541016
WAITING 1 144 11 11 11 1440522440492 30000 15851220 125.97417831420898
RUNNING 1 144 11 11 11 1440522470492 30000 12884860 102.39966710408528
RUNNING 1 144 11 11 11 1440522500492 30000 8743860 69.48995590209961
RUNNING 1 144 11 11 11 1440522530492 30000 6445660 51.225503285725914
RUNNING 1 144 11 11 11 1440522560492 30000 5974480 47.48090108235677
RUNNING 1 144 11 11 11 1440522590493 30001 5198640 41.3137016119645
RUNNING 1 144 11 11 11 1440522620492 29999 5209800 41.405150618464624
RUNNING 1 144 11 11 11 1440522650492 30000 4585960 36.445935567220054
RUNNING 1 144 11 11 11 1440522680492 30000 3870140 30.75710932413737
RUNNING 1 144 11 11 11 1440522710492 30000 3751040 29.810587565104168
RUNNING 1 144 11 11 11 1440522740493 30001 3450600 27.421990901898322
```
> Add tuple batching
> ------------------
>
> Key: STORM-855
> URL: https://issues.apache.org/jira/browse/STORM-855
> Project: Apache Storm
> Issue Type: New Feature
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Minor
>
> In order to increase Storm's throughput, multiple tuples can be grouped
> together into a batch of tuples (i.e., a fat tuple) and transferred from
> producer to consumer at once.
> The initial idea is taken from https://github.com/mjsax/aeolus. However, we
> aim to integrate this feature deeply into the system (in contrast to building
> it on top), which has multiple advantages:
> - batching can be even more transparent to the user (e.g., no extra
> direct streams are needed to mimic Storm's data distribution patterns)
> - fault tolerance (anchoring/acking) can be done at tuple granularity
> (not at batch granularity, which leads to many more replayed tuples -- and
> duplicate results -- in case of failure)
> The aim is to extend the TopologyBuilder interface with an additional
> parameter 'batch_size' to expose this feature to the user. By default,
> batching will be disabled.
> This batching feature serves pure tuple-transport purposes, i.e.,
> tuple-by-tuple processing semantics are preserved. An output batch is
> assembled at the producer and completely disassembled at the consumer. The
> consumer's output can be batched again, independently of whether its input
> was batched or not. Thus, batches can be of a different size for each
> producer-consumer pair. Furthermore, consumers can receive batches of
> different sizes from different producers (including regular non-batched
> input).
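As a rough illustration of the proposed user-facing knob (my own sketch, not the PR's actual API): the builder calls below are plain current Storm API, and the comments only mark where a per-component 'batch_size' parameter, as described in the issue text above, would plug in.

```java
import backtype.storm.Config;
import backtype.storm.generated.StormTopology;
import backtype.storm.testing.TestWordCounter;
import backtype.storm.testing.TestWordSpout;
import backtype.storm.topology.TopologyBuilder;

public class BatchingSketch {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        // Today's API: declare components without any batching (the default).
        builder.setSpout("spout", new TestWordSpout(), 1);
        builder.setBolt("count", new TestWordCounter(), 10).shuffleGrouping("spout");

        // STORM-855 proposes an additional per-component 'batch_size' parameter here
        // (disabled by default); its exact signature is defined by the pull request,
        // not by this sketch. Batching only affects transport: each bolt's execute()
        // still receives individual tuples, and anchoring/acking stays per tuple.

        StormTopology topology = builder.createTopology();

        Config conf = new Config();
        conf.setNumWorkers(24);
        // submit 'topology' with StormSubmitter (or run it in a LocalCluster) as usual
    }
}
```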