Github user HeartSaVioR commented on the issue:
https://github.com/apache/storm/pull/1379
I've just finished performance tests on a 10-node cluster, 7 of which run
workers.
(The tests run on VMs, so results can be affected by the environment;
I therefore ran each configuration twice.)
Environment for each VM: 2 cores, 16 GB memory, RHEL 7 64-bit, Oracle JDK
1.8.0_60
I ran the performance tests via yahoo/storm-perf-test (SOL), which we used
for performance testing before ThroughputVsLatency. I don't have experience
running ThroughputVsLatency across multiple nodes and wasn't sure how to tune
it, so I picked SOL.
At first, I used 1 worker per VM and distributed all tasks evenly across
the workers, so that each worker had one task and one acker.
Test command line is here: `storm jar
storm_perf_test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
com.yahoo.storm.perftest.Main --ack --name test -l 1 -n 1 --workers 7 --spout 3
--bolt 4 --testTimeSec 900 -c topology.max.spout.pending=1092 --messageSize 10
-c topology.acker.executors=null`
Test result is here:
https://gist.github.com/HeartSaVioR/69078c3abb56561111288708d7dd6fab
After warming up, the patched version performs more stably and faster.
I was curious how performance would change if we put more pressure on the
ackers, so I quadrupled the task count and ran the test again.
Test command line is here: `storm jar
storm_perf_test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
com.yahoo.storm.perftest.Main --ack --name test -l 1 -n 1 --workers 7 --spout
12 --bolt 16 --testTimeSec 900 -c topology.max.spout.pending=1092 --messageSize
10 -c topology.acker.executors=null`
Test result is here:
https://gist.github.com/HeartSaVioR/9db168a2550abbf0d8f114269ec3aaa3
Similar results were observed.
We expected no performance impact, or even some degradation, but the patch
actually improves performance with SOL.
I guess this result comes from moving the System.currentTimeMillis() call
from the Spout to the Acker. It was called once for every 20 completed tuples
on the Spout loop thread, which is blocking. Even though the Acker calls
System.currentTimeMillis() for every completed tuple and carries a heavier
payload, it hurts performance less.
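To illustrate the effect being described, here is a minimal sketch of the time-sampling pattern on a hot path: the clock is read only once per 20 completed tuples, yet that read still sits on the blocking loop thread. The class and method names, and the sample rate of 20, are assumptions for illustration; this is not Storm's actual implementation.

```java
// Minimal sketch: sampling System.currentTimeMillis() once per
// SAMPLE_RATE completed tuples on a blocking loop thread.
// Illustrative only; not Storm's actual code.
public class TimeSampling {
    private static final int SAMPLE_RATE = 20;
    private long completedCount = 0;
    private long lastSampledMillis = 0;

    // Called once per completed tuple. The clock is only read every
    // SAMPLE_RATE calls, but even that periodic read blocks the loop.
    public long onTupleCompleted() {
        if (++completedCount % SAMPLE_RATE == 0) {
            lastSampledMillis = System.currentTimeMillis();
        }
        return lastSampledMillis;
    }

    public static void main(String[] args) {
        TimeSampling sampler = new TimeSampling();
        long before = System.currentTimeMillis();
        long t = 0;
        for (int i = 0; i < 40; i++) {
            t = sampler.onTupleCompleted();
        }
        // After 40 completions the clock has been read exactly twice
        // (at the 20th and 40th tuple), so t holds a recent timestamp.
        if (t < before) {
            throw new AssertionError("timestamp was not sampled");
        }
        System.out.println("sampled=" + t);
    }
}
```

Moving this kind of per-batch clock read off the Spout loop and onto the Acker (which already reads the clock per tuple anyway) would explain why the patched version shows no slowdown.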
@ptgoetz Could you check my test results and confirm they make sense?