Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/704#issuecomment-142036022
  
    @mjsax Sorry about the confusion.  I have been playing around with batching 
and I was not referring to your code changes.  Disruptor offers two different 
ways of batching.  One is on the read side, which is what this patch is about.  
That did not improve the efficiency/throughput of the queue.  Instead, the 
batching has to happen on the enqueue side, to reduce the load placed on the 
OS by signaling/waking up the worker thread.
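    To illustrate the enqueue-side idea, here is a hypothetical sketch (class 
and method names are my own, and it uses a plain `BlockingQueue` rather than 
the Disruptor): the producer buffers tuples locally and enqueues them as a 
single batch, so the consumer is signaled/woken at most once per batch instead 
of once per tuple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of enqueue-side batching: amortize the per-item
// enqueue (and the OS-level wakeup it can trigger) across a whole batch.
public class BatchingProducer {
    private final BlockingQueue<List<Object>> queue;
    private final int batchSize;
    private List<Object> current = new ArrayList<>();

    public BatchingProducer(BlockingQueue<List<Object>> queue, int batchSize) {
        this.queue = queue;
        this.batchSize = batchSize;
    }

    // Buffer locally; only one enqueue (one potential wakeup) per batch.
    public void publish(Object tuple) throws InterruptedException {
        current.add(tuple);
        if (current.size() >= batchSize) {
            flush();
        }
    }

    // Must also be called periodically (e.g. on a timer) so that a slow
    // stream does not leave a partial batch stranded in the buffer.
    public void flush() throws InterruptedException {
        if (!current.isEmpty()) {
            queue.put(current);
            current = new ArrayList<>();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<List<Object>> q = new ArrayBlockingQueue<>(16);
        BatchingProducer p = new BatchingProducer(q, 100);
        for (int i = 0; i < 250; i++) {
            p.publish(i);
        }
        p.flush();  // 250 tuples become 3 enqueues instead of 250
        int tuples = 0, batches = 0;
        List<Object> batch;
        while ((batch = q.poll()) != null) {
            batches++;
            tuples += batch.size();
        }
        System.out.println(batches + " batches, " + tuples + " tuples");
    }
}
```

    The consumer side is unchanged: it dequeues once and then iterates the 
batch, which is why this stays isolated to the queue.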
    
    I have some micro benchmarks that show this, and how much of a performance 
improvement we can potentially expect to see.  I have been able to do over 
2,000,000 word count sentences per second on my MacBook Pro laptop, whereas 
Storm's throughput is much lower.  I have not finished collecting numbers, and 
I am working on some code to do this in Storm itself so we can see the impact 
it can have.
    
    The main reason I like this approach over batching at the spout is that 
the code is isolated to the queue itself, and should not have an impact on 
the rest of Storm, except potentially to remove some of the other code changes 
that were put into Storm initially to try to improve the throughput.
    
    But I really want to have a working prototype with performance numbers 
before I try to compare the two approaches.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
