Nathan, process and execute latency are growing. Does that mean we spend more time processing each tuple because it spends longer in the bolt's queue?
I thought that "Complete latency" and "Process latency" should be correlated. Am I right?

On Wed, May 20, 2015 at 2:10 PM, Nathan Leung <[email protected]> wrote:

> My point about increased throughput was that if you have items queued from the spout waiting to be processed, that time counts towards the spout's complete latency. If your bolts get through the tuples faster (and as you add more they do; you have a 6x speedup from more bolts), then you will see the complete latency drop.
>
> On May 20, 2015 4:01 AM, "Dima Dragan" <[email protected]> wrote:
>
>> Thank you, Jeffrey and Devang, for your answers.
>>
>> Jeffrey, since I use shuffle grouping, serialization still happens, but there are no network delays (to avoid serialization as well, there is the localOrShuffle grouping). I use only one worker for all experiments, so that does not explain why complete latency could decrease.
>>
>> But I think you are right about the definitions :)
>>
>> Devang, no, I set up 1 worker and 1 acker for all tests.
>>
>> Best regards,
>> Dmytro Dragan
>>
>> On May 20, 2015 05:03, "Devang Shah" <[email protected]> wrote:
>>
>>> Was the number of workers or number of ackers changed across your experiments? What numbers did you use?
>>>
>>> When you have many executors, increasing the ackers reduces the complete latency.
>>>
>>> Thanks and Regards,
>>> Devang
>>>
>>> On 20 May 2015 03:15, "Jeffery Maass" <[email protected]> wrote:
>>>
>>>> Maybe the difference has to do with where the executors were running. If your entire topology runs within the same worker, serialization for the worker-to-worker networking layer is left out of the picture. I suppose that would mean the complete latency could decrease. At the same time, process latency could very well increase, since all the work is being done within the same worker.
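Nathan's argument (tuples waiting in the queue count toward complete latency) can be illustrated with a toy M/M/c queueing model. This is not Storm code, and every number below (a tuple arriving every 4 ms, per-tuple service times of 5.6 ms and 6.3 ms) is invented for illustration: even if each executor's own process latency creeps up, adding executors slashes the queue wait, so the end-to-end (complete) latency drops.

```python
import math

def erlang_c(c, lam, mu):
    """Probability an arriving tuple must wait (Erlang C formula),
    for c executors, arrival rate lam, per-executor service rate mu."""
    a = lam / mu                                   # offered load
    rho = a / c                                    # utilization, must be < 1
    top = a ** c / math.factorial(c) / (1 - rho)
    bottom = sum(a ** k / math.factorial(k) for k in range(c)) + top
    return top / bottom

def complete_latency_ms(c, lam, service_ms):
    """Mean time in system (queue wait + service) for an M/M/c queue, in ms."""
    mu = 1.0 / service_ms
    wait = erlang_c(c, lam, mu) / (c * mu - lam)
    return wait + service_ms

lam = 0.25  # one tuple every 4 ms (made-up arrival rate)
print(round(complete_latency_ms(2, lam, 5.6), 1))  # 2 executors -> 11.0 ms
print(round(complete_latency_ms(4, lam, 6.3), 1))  # 4 slower executors -> 6.5 ms
```

So per-tuple service got worse (5.6 ms to 6.3 ms, like the rising process latency), yet the mean time a tuple spends in the system fell, which is the same direction the complete latency moved.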
>>>> My understanding is that process latency is measured from the time the tuple enters the executor until it leaves the executor. Or was it from the time the tuple enters the worker until it leaves the worker? I don't recall.
>>>>
>>>> I bet a firm definition of the latency terms would shed some light.
>>>>
>>>> Thank you for your time!
>>>>
>>>> +++++++++++++++++++++
>>>> Jeff Maass <[email protected]>
>>>> linkedin.com/in/jeffmaass
>>>> stackoverflow.com/users/373418/maassql
>>>> +++++++++++++++++++++
>>>>
>>>> On Tue, May 19, 2015 at 9:47 AM, Dima Dragan <[email protected]> wrote:
>>>>
>>>>> Thanks, Nathan, for your answer,
>>>>>
>>>>> but I'm afraid you've misunderstood me: with executors increased by 32x, each executor's throughput *increased* by 5x, and complete latency dropped.
>>>>>
>>>>> On Tue, May 19, 2015 at 5:16 PM, Nathan Leung <[email protected]> wrote:
>>>>>
>>>>>> It depends on your application and the characteristics of the IO. You increased executors by 32x and each executor's throughput dropped by 5x, so it makes sense that latency will drop.
>>>>>>
>>>>>> On May 19, 2015 9:54 AM, "Dima Dragan" <[email protected]> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I have found some strange behavior in topology metrics.
>>>>>>>
>>>>>>> Let's say we have a 1-node, 2-core machine and a simple Storm topology:
>>>>>>> Spout A -> Bolt B -> Bolt C
>>>>>>>
>>>>>>> Bolt B splits each message into 320 parts and emits each of them (shuffle grouping) to Bolt C. Bolts B and C also make some read/write operations to a db.
>>>>>>>
>>>>>>> The input flow is continuous and steady.
>>>>>>>
>>>>>>> In theory, setting a higher number of executors for Bolt C than the number of cores should be useless (most of the threads would just sleep). That is confirmed by the increasing execute and process latency.
>>>>>>> But I noticed that the complete latency started to decrease, and I do not understand why.
>>>>>>>
>>>>>>> For example, stats for Bolt C:
>>>>>>>
>>>>>>> Executors | Process latency (ms) | Complete latency (ms)
>>>>>>> 2         | 5.599                | 897.276
>>>>>>> 4         | 6.3                  | 526.364
>>>>>>> 2         | 8.43                 | 2345.454
>>>>>>>
>>>>>>> Is it a side effect of IO-bound tasks?
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Dmytro Dragan
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Dmytro Dragan

--
Best regards,
Dmytro Dragan
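On the closing question: yes, an IO-bound bolt can behave exactly this way. A back-of-the-envelope model (all numbers invented: say each tuple costs 1 ms of CPU plus 9 ms waiting on the db, on the 2-core box) shows why executors beyond the core count still raise throughput: extra threads overlap their db waits, while only the CPU portion contends for cores. Per-tuple process latency rises once the CPU saturates, yet the spout's backlog drains much faster, pulling complete latency down.

```python
def throughput(threads, cores, cpu_ms, io_ms):
    """Tuples/ms a bolt can sustain: db waits overlap across threads,
    but total CPU time cannot exceed the cores available."""
    return min(threads / (cpu_ms + io_ms), cores / cpu_ms)

def process_latency_ms(threads, cores, cpu_ms, io_ms):
    """Little's law over the executors: in-flight tuples / throughput."""
    return threads / throughput(threads, cores, cpu_ms, io_ms)

# Hypothetical IO-bound bolt: 1 ms CPU + 9 ms db wait per tuple, 2 cores.
for t in (2, 8, 32):
    print(t, throughput(t, 2, 1, 9), process_latency_ms(t, 2, 1, 9))
# 2 threads:  0.2 tuples/ms, 10 ms process latency
# 32 threads: 2.0 tuples/ms (10x throughput), 16 ms process latency
```

In this sketch, process latency worsens (10 ms to 16 ms) just as observed in the table, while throughput grows 10x, so a queue of pending tuples at the spout drains far sooner and the measured complete latency falls.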
