Github user revans2 commented on the pull request:
https://github.com/apache/storm/pull/765#issuecomment-143872114
I am just rebasing the code now so you can test that out yourself. This
code has no issues with acking. But there are a few real issues.
First, the latency on low-throughput queues is much higher, because a tuple
has to wait for the batch to time out before it is emitted. That timeout is
set to 1 ms by default, so it is not that bad, but in a follow-on JIRA we
should be able to dynamically adjust the batch size for each queue on the fly
to compensate.
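To make the latency trade-off concrete, here is a minimal sketch of size-or-timeout batching. All names (`Batcher`, `targetSize`, `flushIntervalMs`) are hypothetical and are not Storm's actual disruptor code; it only illustrates why a low-throughput queue pays the flush timeout as latency.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Storm's implementation: a batch is flushed when it
// reaches targetSize, or when flushIntervalMs has elapsed since the first
// pending item. On a low-throughput queue the size threshold is rarely hit,
// so nearly every tuple waits out the timeout (1 ms by default, per above).
class Batcher<T> {
    private final int targetSize;
    private final long flushIntervalMs;
    private final List<T> pending = new ArrayList<>();
    private long firstPendingAt = -1;

    Batcher(int targetSize, long flushIntervalMs) {
        this.targetSize = targetSize;
        this.flushIntervalMs = flushIntervalMs;
    }

    /** Add an item; returns the flushed batch if it filled up, else null. */
    List<T> add(T item, long nowMs) {
        if (pending.isEmpty()) {
            firstPendingAt = nowMs;
        }
        pending.add(item);
        return pending.size() >= targetSize ? flush() : null;
    }

    /** Called periodically; flushes a partial batch once the timeout expires. */
    List<T> maybeFlushOnTimeout(long nowMs) {
        if (!pending.isEmpty() && nowMs - firstPendingAt >= flushIntervalMs) {
            return flush();
        }
        return null;
    }

    private List<T> flush() {
        List<T> out = new ArrayList<>(pending);
        pending.clear();
        firstPendingAt = -1;
        return out;
    }
}
```

The dynamic adjustment mentioned above would amount to shrinking `targetSize` (or the timeout) when a queue is observed to flush mostly on timeouts rather than on size.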
Second, the number of threads used is noticeably higher: one more per
disruptor queue. I expect to reduce the total number of disruptor queues once
I have optimized other parts of the code. I don't want to do that right now,
because the two-queues-per-bolt/spout design still improves performance in
many cases.
Third, in the worst-case situation it is possible to allocate many more
objects than previously. In practice it is not that many more relative to the
large number of objects we already allocate, but overall allocation needs to
be looked at in a separate JIRA at some point.
Also, I don't want to shove this code in without doing a real comparison
between the two approaches and their code. This is one way of doing batching,
but there are others that may have advantages over it, or may complement this
approach as well. I just want Storm to eventually be a lot closer to the 1
million sentences/second mark than it is now.