Github user clockfly commented on the pull request:

    https://github.com/apache/incubator-storm/pull/103#issuecomment-45072821
  
    @Gvain,
    
    There are two parts to your question:
    
    1. In your test, why does throughput drop once the worker number increases 
past a certain value (8 in your test case)? 
      
       This is because your CPU reaches its limit at worker# = 8 (CPU 
usage: 89%). Past that point, adding more workers just adds more threads 
and context switches, which hurts performance. In my case, I have a more 
powerful CPU, which allows more parallel workers.
    
    2. Why is there a performance difference between the two environments 
when scaling worker# from 4 to 8?
    
      I don't know the answer, but I guess it may be caused by the difference 
in hardware. Your env has a "bonded 1Gb network card" (2Gb), so its bandwidth 
is twice mine, and its CPU has 24 cores, half of mine. 
    
      Suppose we model the message transferring pipeline as three layers:
    
      netty layer (throughput somewhat impacted by NIC bandwidth) -> 
intermediate layer (worker intermediate receiving pipes: netty server handler 
-> decoding -> receiver thread) -> task processing (throughput impacted by CPU).
    
      In your env, CPU is relatively scarce and effective network bandwidth is 
plentiful (effective bandwidth = theory_bandwidth * 
network_efficiency_factor), so performance is throttled by the last layer. 
In my environment, CPU is plentiful but effective network bandwidth is 
insufficient (theory_bandwidth is only half of yours), so performance is 
throttled by the first two layers. 
    
      The patch mainly addresses the first two layers.
    
      1. Changing the netty API from async to sync, together with the messaging 
API change, improves the network_efficiency_factor, thus increasing the 
effective network bandwidth.
    
      2. Adding more receiver threads and optimizing the netty server handler 
improves the second layer's throughput.
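
      The three-layer model above can be sketched in a few lines of Python. 
This is only an illustration of the reasoning, not anything taken from Storm 
or from the patch: the Env class, its field names other than theory_bandwidth 
and network_efficiency_factor, and every number below are made up for the 
example and are not measurements from either environment. The idea is simply 
that end-to-end throughput is capped by the slowest layer.

```python
from dataclasses import dataclass, replace


@dataclass
class Env:
    """Hypothetical per-layer capacities for one cluster (messages/second)."""
    theory_bandwidth: float           # what the NIC could carry at 100% efficiency
    network_efficiency_factor: float  # fraction of that bandwidth actually achieved
    intermediate_capacity: float      # netty server handler -> decoding -> receiver thread
    cpu_capacity: float               # task processing


def layer_throughputs(env):
    """Return the throughput each of the three layers can sustain."""
    return {
        "netty": env.theory_bandwidth * env.network_efficiency_factor,
        "intermediate": env.intermediate_capacity,
        "task processing": env.cpu_capacity,
    }


def bottleneck(env):
    """Return (layer name, throughput) of the layer that throttles the pipeline."""
    layers = layer_throughputs(env)
    name = min(layers, key=layers.get)
    return name, layers[name]


# Illustrative numbers only (assumptions, not benchmark results).
cpu_bound = Env(theory_bandwidth=2000, network_efficiency_factor=0.5,
                intermediate_capacity=900, cpu_capacity=600)
network_bound = Env(theory_bandwidth=1000, network_efficiency_factor=0.5,
                    intermediate_capacity=700, cpu_capacity=1200)

# Model the two effects attributed to the patch: a higher efficiency factor
# and a faster intermediate layer. The CPU layer is left unchanged.
network_bound_patched = replace(network_bound,
                                network_efficiency_factor=0.8,
                                intermediate_capacity=1000)

if __name__ == "__main__":
    for label, env in [("cpu_bound", cpu_bound),
                       ("network_bound", network_bound),
                       ("network_bound_patched", network_bound_patched)]:
        print(label, bottleneck(env))
```

      In this toy setup the CPU-bound environment stays limited by task 
processing regardless of the first two layers, while the network-bound 
environment gains throughput when the efficiency factor and the intermediate 
capacity go up, which is the behavior the model predicts.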



