Hi Everyone, I have a large fan-out that I've posted questions about before with the following new, updated info:
1. Each incoming tuple to Bolt A produces 15-20 tuples.
2. Bolt A emits to Bolt B via fieldsGrouping.
3. I cache outgoing tuples in bins within Bolt A and then emit anchored tuples to Bolt B with the OutputCollector emit(Collection<Tuple> anchors, List<Object> tuple) method <http://storm.apache.org/apidocs/backtype/storm/task/OutputCollector.html#emit(java.util.Collection, java.util.List)>.
4. I have throughput where I need it to be if I just receive tuples in Bolt B, ack, and drop. If I do actual processing in Bolt B, throughput degrades a bunch.
5. I profiled the Bolt B worker yesterday and see that over 90% of the time is spent in com.lmax.disruptor.BlockingWaitStrategy, irrespective of whether I drop the tuples or process them in Bolt B.

I am wondering if the acking of the anchor tuples is what's resulting in so much time spent in the LMAX messaging layer. What do y'all think? Any ideas appreciated as always.

Thanks! :)

--John
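
For reference, the binning described in point 3 can be sketched roughly as below. This is a simplified stand-alone model, not the actual bolt code: the class name BinningSketch, the Emitter interface (a stand-in for OutputCollector's emit(anchors, tuple)), the string anchor IDs (stand-ins for Storm Tuple objects), and the fixed bin size are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone model of per-key binning with anchored emits; no Storm dependency. */
final class BinningSketch {

    /** Hypothetical stand-in for OutputCollector.emit(anchors, tuple). */
    interface Emitter {
        void emit(List<String> anchors, List<Object> values);
    }

    /** One bin: the buffered values plus the input tuples they were derived from. */
    static final class Bin {
        final List<Object> values = new ArrayList<>();
        final List<String> anchors = new ArrayList<>();
    }

    private final Map<String, Bin> bins = new HashMap<>();
    private final int binSize;      // assumed flush threshold
    private final Emitter emitter;

    BinningSketch(int binSize, Emitter emitter) {
        this.binSize = binSize;
        this.emitter = emitter;
    }

    /** Buffers one value derived from input tuple anchorId; flushes the bin once it is full. */
    void add(String key, Object value, String anchorId) {
        Bin bin = bins.computeIfAbsent(key, k -> new Bin());
        bin.values.add(value);
        if (!bin.anchors.contains(anchorId)) {
            bin.anchors.add(anchorId);  // record each contributing input once
        }
        if (bin.values.size() >= binSize) {
            flush(key);
        }
    }

    /** Emits the bin for key as a single tuple anchored to every contributing input. */
    void flush(String key) {
        Bin bin = bins.remove(key);
        if (bin == null || bin.values.isEmpty()) {
            return;
        }
        List<Object> out = new ArrayList<>();
        out.add(key);          // field used by the fieldsGrouping
        out.add(bin.values);   // the batched payload
        emitter.emit(bin.anchors, out);
    }

    public static void main(String[] args) {
        BinningSketch boltA = new BinningSketch(3, (anchors, values) ->
                System.out.println("emit anchors=" + anchors + " values=" + values));
        boltA.add("k1", 10, "in-1");
        boltA.add("k1", 11, "in-1");
        boltA.add("k2", 20, "in-1");
        boltA.add("k1", 12, "in-2");  // third value for k1, so that bin is emitted
    }
}
```

Each emit carries every input tuple that contributed to the bin as an anchor, which is what ties a failure in Bolt B back to those inputs.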
