Thanks, that was really helpful.
As far as I understand from all of this:
Up to MAX_SPOUT_PENDING tuples emitted by the spout are kept pending, and
the complete-latency clock starts counting for each of them. As complete
latency grows for those pending tuples, Storm starts to replay them, which
throttles the cluster because the same items get processed again. So I will
first try decreasing MAX_SPOUT_PENDING at the expense of throughput and
observe the situation. Adding CPUs and increasing MAX_SPOUT_PENDING will be
my next shot.
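
A rough sketch of the first change I have in mind (the class name and the
exact numbers are only placeholders and starting guesses, not
recommendations):

    import backtype.storm.Config;

    public class SpoutPendingTuning {
        // Step 1: halve the in-flight limit from 32 and watch complete
        // latency and the failed/replayed counts in the Storm UI.
        // Step 2 (later): add CPUs and bolt executors, then raise the
        // limit again and compare throughput.
        public static Config firstAttempt() {
            Config conf = new Config();
            conf.setMaxSpoutPending(16);
            return conf;
        }
    }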




On Tue, 12 May 2015 at 22:44, Jeffery Maass <[email protected]> wrote:

> Ok, I see now.
>
> So, every time Storm asks your spout for another tuple, your spout
> doesn't necessarily emit one.  Which means that your topology is not
> necessarily being "maxed out".  Or, better said, you are not seeing the
> topology's behavior when MAX_SPOUT_PENDING has actually been reached and
> is therefore limiting the number of records being processed within the
> topology.
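>
> A minimal sketch of the kind of spout I mean (the in-memory queue below
> just stands in for the real Kestrel client; all names are made up):
>
>     import java.util.Map;
>     import java.util.Queue;
>     import java.util.concurrent.ConcurrentLinkedQueue;
>     import backtype.storm.spout.SpoutOutputCollector;
>     import backtype.storm.task.TopologyContext;
>     import backtype.storm.topology.OutputFieldsDeclarer;
>     import backtype.storm.topology.base.BaseRichSpout;
>     import backtype.storm.tuple.Fields;
>     import backtype.storm.tuple.Values;
>
>     public class QueueSpout extends BaseRichSpout {
>         // stand-in for the Kestrel MQ client
>         private final Queue<String> queue = new ConcurrentLinkedQueue<String>();
>         private SpoutOutputCollector collector;
>
>         public void open(Map conf, TopologyContext context,
>                          SpoutOutputCollector collector) {
>             this.collector = collector;
>         }
>
>         public void nextTuple() {
>             String item = queue.poll();
>             if (item == null) {
>                 // Nothing to emit: the spout, not MAX_SPOUT_PENDING,
>                 // is the limiter right now.
>                 return;
>             }
>             // Emitting with a message id makes the tuple "pending" until
>             // it is acked or failed - that is what MAX_SPOUT_PENDING counts.
>             collector.emit(new Values(item), item);
>         }
>
>         public void declareOutputFields(OutputFieldsDeclarer declarer) {
>             declarer.declare(new Fields("item"));
>         }
>     }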
>
> When you are seeing large numbers of tuples in Kestrel MQ, your spout is
> more likely to be limited by MAX_SPOUT_PENDING.
>
> When you look at your bolts and spouts within the Storm UI, what number do
> you see for capacity?  The number varies from 0 to 1.  The closer the
> number is to 1, the fewer additional in-process tuples you can add to the
> topology and still expect timely results.
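>
> For reference, my reading is that the UI derives capacity roughly as time
> spent in execute() divided by the wall-clock measurement window, something
> like the sketch below (the example numbers are made up):
>
>     public class CapacitySketch {
>         // capacity ~= tuples executed * avg execute latency (ms) / window (ms)
>         static double capacity(long executed, double executeLatencyMs, long windowMs) {
>             return executed * executeLatencyMs / windowMs;
>         }
>
>         public static void main(String[] args) {
>             // e.g. 500000 tuples at ~1.0 ms each over a 10-minute window:
>             System.out.println(capacity(500000, 1.0, 10 * 60 * 1000));  // ~0.83
>         }
>     }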
>
> Note that there are 3 latency metrics:
> * per spout - complete latency, in milliseconds
> * per bolt - process latency, in milliseconds
> * per bolt - execution latency, in milliseconds
>
> Complete latency - how long it takes a tuple to flow all the way through
> the topology and be acked back at the spout
> Process latency - how long a bolt takes to ack a tuple after receiving it
> Execution latency - how long a tuple spends inside a bolt's
> execute method
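>
> One practical tie-in (again, my reading of it): complete latency covers
> the same emit-to-ack window that the message timeout
> (TOPOLOGY_MESSAGE_TIMEOUT_SECS, 30 seconds by default) is checked against,
> so once complete latency creeps toward that timeout the spout starts
> failing and replaying tuples.  A sketch along these lines keeps the two in
> proportion; the 60 is only an example:
>
>     import backtype.storm.Config;
>
>     public class TimeoutSketch {
>         public static Config withHeadroom() {
>             Config conf = new Config();
>             conf.setMaxSpoutPending(32);     // cap on un-acked tuples per spout task
>             conf.setMessageTimeoutSecs(60);  // keep well above observed complete latency
>             return conf;
>         }
>     }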
>
> Complete latency, therefore, is made up of the process latency and
> execution latency of every bolt in the topology, plus latency due to
> something else... I think of this as the missing latency, or system
> latency.
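>
> A made-up illustration of that gap: if a tuple passes through three bolts
> whose process latencies are 2 ms, 5 ms and 3 ms, but the spout reports a
> complete latency of 40 ms, the remaining ~30 ms is that missing latency,
> largely time spent waiting in executor and transfer queues rather than in
> any bolt's execute().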
>
> I've noticed that as you increase the number of in-process tuples (via
> MAX_SPOUT_PENDING), the complete latency increases much more quickly than
> the execution and process latency of individual bolts.  In fact, what I
> have seen is that past a certain point of adding in-process tuples, the
> records processed per millisecond begins to drop.  And this appears to be
> solely related to the missing (aka system) latency.
>
> It sounds to me like what you are experiencing is this very thing.  I
> think the solution is to add bolt instances, which may then lead you to
> adding CPUs.
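>
> A sketch of what adding bolt instances looks like when wiring the topology
> (the component names, bolt classes and parallelism numbers below are all
> placeholders for whatever your topology actually uses):
>
>     import backtype.storm.topology.TopologyBuilder;
>
>     public class WiringSketch {
>         public static TopologyBuilder build() {
>             TopologyBuilder builder = new TopologyBuilder();
>             // QueueSpout, ParseBolt and DbBolt are placeholder classes.
>             builder.setSpout("kestrel-spout", new QueueSpout(), 1);
>             // Give the slow, DB-bound bolt more executors than the cheap one.
>             builder.setBolt("parse-bolt", new ParseBolt(), 4)
>                    .shuffleGrouping("kestrel-spout");
>             builder.setBolt("db-bolt", new DbBolt(), 16)
>                    .shuffleGrouping("parse-bolt");
>             return builder;
>         }
>     }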
>
>
> Thank you for your time!
>
> +++++++++++++++++++++
> Jeff Maass <[email protected]>
> linkedin.com/in/jeffmaass
> stackoverflow.com/users/373418/maassql
> +++++++++++++++++++++
>
>
> On Tue, May 12, 2015 at 9:08 AM, Kutlu Araslı <[email protected]> wrote:
>
>> I meant our tuple queues in Kestrel MQ which spout consumes.
>>
>>
>> On Tue, 12 May 2015 at 17:00, Jeffery Maass <[email protected]> wrote:
>>
>>> To what number / metric are you referring when you say, "when the number
>>> of tuples in the queue increases"?  What you are describing sounds like
>>> the beginning of queue explosion.  If so, increasing max spout pending
>>> will make the situation worse.
>>>
>>> Thank you for your time!
>>>
>>> +++++++++++++++++++++
>>> Jeff Maass <[email protected]>
>>> linkedin.com/in/jeffmaass
>>> stackoverflow.com/users/373418/maassql
>>> +++++++++++++++++++++
>>>
>>>
>>> On Tue, May 12, 2015 at 6:22 AM, Kutlu Araslı <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Our topology consumes tuples from a Kestrel MQ and runs a series of
>>>> bolts to process the items, including some DB connections. The Storm
>>>> version is 0.8.3 and the supervisors run on VMs.
>>>> When the number of tuples in the queue increases, we observe that a
>>>> single tuple's execution time also rises dramatically in parallel, which
>>>> ends up as throttling behaviour.
>>>> In the meantime, CPU and memory usage look comfortable. From the
>>>> database point of view, we have not observed a problem so far under stress.
>>>> Is there any configuration trick or advice for handling such a load?
>>>> There is already a limit on MAX_SPOUT_PENDING of 32.
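>>>>
>>>> In case it helps, the shape of our setup is roughly the following sketch
>>>> (class and component names are placeholders, not our real code; the
>>>> Kestrel spout class stands in for whatever spout implementation is used):
>>>>
>>>>     import backtype.storm.Config;
>>>>     import backtype.storm.StormSubmitter;
>>>>     import backtype.storm.topology.TopologyBuilder;
>>>>
>>>>     public class OurTopologySketch {
>>>>         public static void main(String[] args) throws Exception {
>>>>             TopologyBuilder builder = new TopologyBuilder();
>>>>             // KestrelSpoutPlaceholder / DbBoltPlaceholder are stand-ins.
>>>>             builder.setSpout("queue", new KestrelSpoutPlaceholder(), 1);
>>>>             builder.setBolt("process", new DbBoltPlaceholder(), 4)
>>>>                    .shuffleGrouping("queue");
>>>>
>>>>             Config conf = new Config();
>>>>             conf.setMaxSpoutPending(32);   // the limit mentioned above
>>>>             conf.setNumWorkers(2);         // example value
>>>>
>>>>             StormSubmitter.submitTopology("our-topology", conf,
>>>>                                           builder.createTopology());
>>>>         }
>>>>     }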
>>>>
>>>> Thanks,
>>>>
>>>>
>>>>
>>>
>
