Hey Kashyap,

Thanks a bunch for adding your thoughts. It sounds like we are facing a very similar issue, where one tuple from the spout going into Bolt A generates 15-20 tuples in Bolt A's execute method. The difference appears to be that I bin the 15-20 tuples generated by Bolt A. I do that because Bolt A sends its tuples to Bolt B via fieldsGrouping, and the binning has helped my throughput quite a bit.
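For concreteness, the binning pattern described above can be sketched roughly as follows. This is a simplified illustration, not John's actual code: the `Collector` interface is a stand-in for Storm's `OutputCollector`, plain `Object` stands in for Storm's `Tuple`, and the bin size is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Storm's OutputCollector.emit(Collection<Tuple> anchors,
// List<Object> tuple) and ack(Tuple), so the sketch compiles on its own.
interface Collector {
    void emit(List<Object> anchors, List<Object> values);
    void ack(Object inputTuple);
}

// Sketch of the binning idea: buffer the values generated per input tuple,
// then emit the whole bin in one call, anchored on every input tuple that
// contributed to it, and ack those inputs after the batch emit.
class BinningSketch {
    private final int binSize;               // illustrative, e.g. 20
    private final Collector collector;
    private final List<Object> anchors = new ArrayList<>();
    private final List<Object> bin = new ArrayList<>();

    BinningSketch(int binSize, Collector collector) {
        this.binSize = binSize;
        this.collector = collector;
    }

    // Called once per incoming tuple with the 15-20 values it generated.
    void execute(Object inputTuple, List<Object> generated) {
        anchors.add(inputTuple);
        bin.addAll(generated);
        if (bin.size() >= binSize) {
            collector.emit(new ArrayList<>(anchors), new ArrayList<>(bin));
            for (Object anchor : anchors) {
                collector.ack(anchor);
            }
            anchors.clear();
            bin.clear();
        }
    }
}
```

One emit per bin, instead of one per generated tuple, is what reduces the number of hand-offs into the outgoing Disruptor queue.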
I also experimented with the buffers, the LMAX wait strategies, max spout pending, and the number of threads processing incoming/outgoing Netty messages (hoping the latter would speed up the Netty -> LMAX transfer), and none of these made a significant impact on my topology, either.

--John

On Sat, Jan 30, 2016 at 3:40 PM, Kashyap Mhaisekar <[email protected]> wrote:

> John, Nick,
> Thanks for broaching this topic. In my case, 1 tuple from the spout
> generates 200 more tuples. I too see the same class listed in VisualVM
> profiling, and I have tried bringing this down: I reduced parallelism
> hints, played with buffers, changed LMAX strategies, and changed max
> spout pending. Nothing seems to have an impact.
>
> Any ideas on what could be done about this?
>
> Thanks
> Kashyap
>
> Hello John,
>
> First off, let us agree on your definition of throughput. Do you define
> throughput as the average number of tuples each of your last bolts
> (sinks) emits per second? If yes, then OK. Otherwise, please provide us
> with more details.
>
> Going back to your BlockingWaitStrategy observation, it most probably
> means that because you are producing a large number of tuples (15-20 per
> input), the outgoing Disruptor queue gets full and the emit() call
> blocks. Also, since you are anchoring tuples (which might mean
> exactly-once semantics), it takes more time to place something in the
> queue in order to guarantee delivery of all tuples to a downstream bolt.
>
> Therefore, it makes sense to see so much time spent in the LMAX
> messaging layer. A good experiment to verify your hypothesis is to not
> anchor tuples and profile your topology again. However, I am not sure
> that you will see a much different percentage, since for every tuple you
> receive, you have at least one call to the Disruptor layer. Maybe in
> your case (if I got it correctly from your description), you should have
> one call every N tuples, where N is the size of your bin in tuples.
> Right?
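For readers trying the same knobs, these are the settings being discussed, roughly as they appear in storm.yaml for the 0.9.x/0.10.x (backtype.storm) line. The values shown are believed to be the usual defaults (max spout pending is normally unset), and are illustrative rather than recommendations:

```yaml
# Disruptor wait strategy; this default (BlockingWaitStrategy) is the class
# dominating the profiles discussed in this thread.
topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"

# Per-executor Disruptor ring-buffer sizes (must be powers of two).
topology.executor.receive.buffer.size: 1024
topology.executor.send.buffer.size: 1024
topology.transfer.buffer.size: 1024

# Cap on un-acked tuples in flight per spout task (unset by default).
topology.max.spout.pending: 1000

# Threads handling incoming/outgoing Netty messages.
storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
```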
> I hope I helped with my comments.
>
> Cheers,
> Nick
>
> On Sat, Jan 30, 2016 at 12:16 PM, John Yost <[email protected]> wrote:
>
>> Hi Everyone,
>>
>> I have a large fan-out that I've posted questions about before, with
>> the following new, updated info:
>>
>> 1. An incoming tuple to Bolt A produces 15-20 tuples.
>> 2. Bolt A emits to Bolt B via fieldsGrouping.
>> 3. I cache outgoing tuples in bins within Bolt A and then emit anchored
>> tuples to Bolt B with the OutputCollector method
>> emit(Collection<Tuple> anchors, List<Object> tuple)
>> <http://storm.apache.org/apidocs/backtype/storm/task/OutputCollector.html#emit(java.util.Collection,%20java.util.List)>.
>> 4. I have throughput where I need it to be if I just receive tuples in
>> Bolt B, ack, and drop them. If I do actual processing in Bolt B,
>> throughput degrades a bunch.
>> 5. I profiled the Bolt B worker yesterday and saw that over 90% of the
>> time is spent in com.lmax.disruptor.BlockingWaitStrategy, irrespective
>> of whether I drop the tuples or process them in Bolt B.
>>
>> I am wondering if the acking of the anchor tuples is what's resulting
>> in so much time spent in the LMAX messaging layer. What do y'all think?
>> Any ideas appreciated as always.
>>
>> Thanks! :)
>>
>> --John
>
>
> --
> Nick R. Katsipoulakis,
> Department of Computer Science
> University of Pittsburgh
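Nick's closing point, that binning should turn one Disruptor hand-off per generated tuple into one hand-off per N tuples, is easy to put numbers on. A toy calculation follows; the fan-out of 200 comes from Kashyap's message, while the bin size of 20 is purely illustrative:

```java
// Toy count of emit/hand-off calls with and without binning.
public class HandoffCount {
    // Unbinned: one emit per generated tuple.
    static long perTuple(long generated) {
        return generated;
    }

    // Binned: one emit per bin of N tuples (rounding up for a partial
    // final bin).
    static long binned(long generated, int binSize) {
        return (generated + binSize - 1) / binSize;
    }

    public static void main(String[] args) {
        // Kashyap's case: 1 spout tuple fans out to 200 tuples.
        System.out.println(perTuple(200));   // 200 hand-offs unbinned
        System.out.println(binned(200, 20)); // 10 hand-offs with bins of 20
    }
}
```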
