Kashyap,

It is expected to see the Disruptor dominating CPU time. It is the
component responsible for sending/receiving tuples (at least when tuples
are produced by one executor thread for another executor thread on the
same machine). Therefore, seeing the Disruptor account for something like
~80% of the CPU time is normal.

A nice experiment to check my statement above is to create a Bolt that,
for every tuple it receives, performs a CPU-bound task (like nested for
loops) and emits a tuple only after receiving X tuples, where X > 1.
Then I expect you will see the percentage of CPU time attributed to the
Disruptor drop.
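To make the experiment concrete, here is a minimal sketch of that logic in plain Java (deliberately without any Storm dependency, so it compiles standalone; in a real topology the same logic would live inside a Bolt's execute(Tuple) method, with collector.emit() in place of the list append). The class name, the burnCpu() helper, and the loop bounds are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the suggested experiment: do CPU-bound work per input
// "tuple", but only emit downstream once every X inputs (X > 1),
// so time shifts from the Disruptor layer into user code.
public class BatchingCpuBolt {
    private final int emitEveryX;               // X: inputs per output
    private int received = 0;
    private final List<String> emitted = new ArrayList<>();

    public BatchingCpuBolt(int emitEveryX) {
        this.emitEveryX = emitEveryX;
    }

    // Stand-in for a Storm Bolt's execute(Tuple) callback.
    public void execute(String tuple) {
        burnCpu();                              // the "random CPU task"
        received++;
        if (received % emitEveryX == 0) {
            // In Storm this would be collector.emit(...).
            emitted.add("batch-" + (received / emitEveryX));
        }
    }

    // Nested loops as a simple CPU burner (arbitrary bounds).
    private void burnCpu() {
        long acc = 0;
        for (int i = 0; i < 1000; i++)
            for (int j = 0; j < 100; j++)
                acc += (long) i * j;
        if (acc == Long.MIN_VALUE) System.out.println(acc); // defeat dead-code elimination
    }

    public List<String> getEmitted() { return emitted; }

    public static void main(String[] args) {
        BatchingCpuBolt bolt = new BatchingCpuBolt(10);
        for (int i = 0; i < 100; i++) bolt.execute("tuple-" + i);
        System.out.println(bolt.getEmitted().size()); // 10 emits for 100 inputs, X=10
    }
}
```

With X = 10, the downstream queue sees one tuple for every ten received, so the profiler should show proportionally less time in the Disruptor and more in burnCpu().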

Cheers,
Nick

On Sat, Jan 30, 2016 at 3:40 PM, Kashyap Mhaisekar <[email protected]>
wrote:

> John, Nick
> Thanks for broaching this topic. In my case, 1 tuple from spout gives out
> 200 more tuples. I too see the same class listed in VisualVM profiling...
> And tried bringing this down... I reduced parallelism hints, played with
> buffers, changed lmax strategies, changed max spout pending... Nothing
> seems to have an impact
>
> Any ideas on what could be done for this?
>
> Thanks
> Kashyap
> Hello John,
>
> First off, let us agree on your definition of throughput. Do you define
> throughput as the average number of tuples each of your last bolts (sinks)
> emit per second? If yes, then OK. Otherwise, please provide us with more
> details.
>
> Going back to your BlockingWaitStrategy observation, it (most
> probably) means that since you are producing a large number of tuples
> (15-20 tuples per input) the outgoing Disruptor queue gets full, and the
> emit() call blocks. Also, since you are anchoring tuples (which might mean
> exactly-once semantics), it takes more time to place something in the
> queue in order to guarantee delivery of all tuples to a downstream
> bolt.
>
> Therefore, it makes sense to see so much time spent in the LMAX messaging
> layer. A good experiment to verify your hypothesis is to not anchor
> tuples and profile your topology again. However, I am not sure that you
> will see a much different percentage, since for every tuple you are
> receiving, you have at least one call to the Disruptor layer. Maybe in your
> case (if I got it correctly from your description), you should have one
> call every N tuples, where N is the size of your bin in tuples. Right?
>
> I hope I helped with my comments.
>
> Cheers,
> Nick
>
> On Sat, Jan 30, 2016 at 12:16 PM, John Yost <[email protected]> wrote:
>
>> Hi Everyone,
>>
>> I have a large fan-out that I've posted questions about before with the
>> following new, updated info:
>>
>> 1. Incoming tuple to Bolt A produces 15-20 tuples
>> 2. Bolt A emits to Bolt B via fieldsGrouping
>> 3. I cache outgoing tuples in bins within Bolt A and then emit anchored
>> tuples to Bolt B via OutputCollector's emit(Collection<Tuple> anchors,
>> List<Object> tuple) method
>> <http://storm.apache.org/apidocs/backtype/storm/task/OutputCollector.html#emit(java.util.Collection,%20java.util.List)>
>> 4. I have throughput where I need it to be if I just receive tuples in
>> Bolt B, ack, and drop. If I do actual processing in Bolt B, throughput
>> degrades a bunch.
>> 5. I profiled the Bolt B worker yesterday and see that over 90% of the
>> time is spent in com.lmax.disruptor.BlockingWaitStrategy--irrespective of
>> whether I drop the tuples or process them in Bolt B
>>
>> I am wondering if the acking of the anchor tuples is what's resulting in
>> so much time spent in the LMAX messaging layer.  What do y'all think?  Any
>> ideas appreciated as always.
>>
>> Thanks! :)
>>
>> --John
>>
>
>
>
> --
> Nick R. Katsipoulakis,
> Department of Computer Science
> University of Pittsburgh
>



-- 
Nick R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh
