Github user clockfly commented on the pull request:
https://github.com/apache/incubator-storm/pull/103#issuecomment-45072821
@Gvain,
There are two parts to your question:
1. In your test, why does throughput drop once the worker count increases
beyond a certain value (8 in your test case)?
This is because your CPU hits its limit at worker# = 8 (CPU usage: 89%).
Beyond that point, adding more workers just adds more threads and more
context switching, which hurts performance. In my case the CPU is more
powerful, so it can sustain more parallel workers.
2. Why is there a performance difference when scaling worker# from 4 to 8
in our two different environments?
I don't know the answer, but I guess it is caused by the difference in
hardware. Your env has bonded 1Gb network cards (2Gb), so your bandwidth is
twice mine, while your CPU is 24 cores, half of mine.
Suppose we can model the message transfer pipeline as three layers:
netty layer (throughput somewhat impacted by NIC bandwidth) ->
intermediate layer (the worker's intermediate receiving pipes: netty server
handler -> decoding -> receiver thread) -> task processing (throughput
impacted by CPU).
For your env, CPU is relatively scarce while effective network bandwidth is
rich (effective bandwidth = theory_bandwidth * network_efficiency_factor),
so performance is throttled by the last layer. For my environment, CPU is
rich but effective network bandwidth is not (since my theoretical bandwidth
is only half of yours), so performance is throttled by the first two layers.
The patch mainly addresses the first two layers:
1. Changing the netty API from async to sync, together with the messaging
API change, improves the network_efficiency_factor and thus increases the
effective network bandwidth.
2. Adding more receiver threads, plus optimizations in the netty server
handler, improves the second layer's throughput (a rough sketch of the
receiver-thread idea follows below).
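A minimal sketch of what the receiver-thread change means structurally,
assuming a handler/consumer split like the layers described above;
ReceiverPool, enqueue, and deliverToTask are hypothetical names for
illustration, not the actual classes in the patch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: several receiver threads drain decoded messages concurrently,
// so a single consumer no longer bottlenecks the intermediate layer.
public class ReceiverPool {
    private final BlockingQueue<byte[]> decoded = new LinkedBlockingQueue<>();

    // Called by the netty server handler after it decodes a frame.
    public void enqueue(byte[] message) {
        decoded.offer(message);
    }

    // Start N receiver threads feeding the task-processing layer.
    public void start(int receiverThreads) {
        for (int i = 0; i < receiverThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        deliverToTask(decoded.take());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "receiver-" + i);
            t.setDaemon(true);
            t.start();
        }
    }

    // Hypothetical hand-off to the destination task's queue.
    private void deliverToTask(byte[] message) {
        // route the message to task processing
    }
}
```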