Thank you all for the valuable info. Unfortunately, I have to use direct grouping for my (research) prototype, so I have to go along with it.
Thank you again,
Nick

2015-07-16 16:33 GMT-04:00 Nathan Leung <[email protected]>:

Storm task ids don't change:
https://groups.google.com/forum/#!topic/storm-user/7P23beQIL4c

On Thu, Jul 16, 2015 at 4:28 PM, Andrew Xor <[email protected]> wrote:

Direct grouping, as shown in the Storm docs, means that you have to target a specific task id and use "direct streams". This is error prone, probably increases latency, and might introduce redundancy problems, since the producer of a tuple needs to know the id of the task the tuple has to go to; imagine a scenario where the receiving task fails for some reason and the producer cannot relay tuples until it receives the re-spawned task's id.

Hope this helps.

Kindly yours,

Andrew Grammenos

-- PGP PKey --
https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt
https://www.dropbox.com/s/ei2nqsen641daei/pgpsig.txt

On Thu, Jul 16, 2015 at 11:24 PM, Nick R. Katsipoulakis <[email protected]> wrote:

Hello again,

Nathan, I am using direct grouping because the application I am working on has to be able to send tuples directly to specific tasks and, in general, to control the data flow. Can you please explain why you would not recommend direct grouping? Is there a particular reason in the architecture of Storm?

Thanks,
Nick

2015-07-16 16:20 GMT-04:00 Nathan Leung <[email protected]>:

I would not recommend direct grouping unless you have a good reason for it. Shuffle grouping is essentially random with even distribution, which makes it easier to characterize its performance. Local-or-shuffle grouping stays in process, so it will generally be faster; however, you have to be careful in certain cases to avoid task starvation (e.g., a Kafka spout with 1 partition on the topic and 1 spout task, feeding 10 bolt "A" tasks in 10 worker processes). Direct grouping depends on your code (i.e., you can create hotspots), fields grouping depends on your key distribution, etc.

On Thu, Jul 16, 2015 at 3:50 PM, Nick R. Katsipoulakis <[email protected]> wrote:

Hello all,

I have two questions:

1) How exactly do you measure latency? I am doing the same thing and have a problem getting the exact milliseconds of latency (mainly because of clock drift).
2) (to Nathan) Is there a difference in speed among the different groupings? For instance, is shuffle faster than direct grouping?

Thanks,
Nick

2015-07-15 17:37 GMT-04:00 Nathan Leung <[email protected]>:

Two things. First, your math may be off depending on parallelism: one emit from A becomes 100 emits from C, and you are joining all of them.

Second, try the default number of ackers (one per worker); right now all your ack traffic is going to a single task.

Also, you can try local-or-shuffle grouping where possible to reduce network transfers.
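A minimal sketch of the direct-stream mechanics Andrew describes above. The names DirectProducerBolt, "consumer-bolt", and "direct-stream" are hypothetical; declareStream with direct=true, getComponentTasks(), and emitDirect() are the stock Storm 0.9.x (backtype.storm) API:

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;
    import java.util.List;
    import java.util.Map;

    public class DirectProducerBolt extends BaseRichBolt {
        private OutputCollector collector;
        private List<Integer> targetTasks;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            // The producer must discover the consumer's task ids itself; this
            // is the coupling Andrew warns about. (Per Nathan's link, a task id
            // survives the re-spawn of a failed task, which softens the worry.)
            this.targetTasks = context.getComponentTasks("consumer-bolt");
        }

        @Override
        public void execute(Tuple input) {
            // Route the tuple to an explicitly chosen task id.
            int i = (input.getValue(0).hashCode() & Integer.MAX_VALUE) % targetTasks.size();
            collector.emitDirect(targetTasks.get(i), "direct-stream", input,
                    new Values(input.getValue(0)));
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // direct=true declares a direct stream: tuples on it reach only
            // the task id named in emitDirect().
            declarer.declareStream("direct-stream", true, new Fields("value"));
        }
    }

The consuming bolt then has to subscribe with .directGrouping("direct-producer", "direct-stream") when the topology is wired, "direct-producer" being whatever component id the producer was registered under.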
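And for comparison, this is how the groupings Nathan ranks are declared when wiring a topology. MySpout and BoltA through BoltD are placeholders for your own components; the grouping methods are the real TopologyBuilder/InputDeclarer API:

    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.tuple.Fields;

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new MySpout(), 1);

    // Random with even distribution; the easiest to reason about.
    builder.setBolt("a", new BoltA(), 10).shuffleGrouping("spout");

    // Prefers tasks in the same worker process (no network hop), falling back
    // to shuffle; beware starvation when upstream parallelism is low.
    builder.setBolt("b", new BoltB(), 10).localOrShuffleGrouping("a");

    // Same key always reaches the same task; skewed keys create hotspots.
    builder.setBolt("c", new BoltC(), 10).fieldsGrouping("b", new Fields("key"));

    // The producer picks the exact task id via emitDirect() on a direct stream.
    builder.setBolt("d", new BoltD(), 10).directGrouping("c", "direct-stream");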
On Jul 15, 2015 12:45 PM, "Kashyap Mhaisekar" <[email protected]> wrote:

Hi,
We are attempting real-time distributed computing using Storm, and the solution has only one problem: inter-bolt latency on the same machine or across machines ranges between 2 and 250 ms, and I am not able to figure out why. Network latency is under 0.5 ms. By latency, I mean the time between an emit from one bolt/spout and the message arriving in execute() of the next bolt.

I have a topology like the below:
A (spout) [emits a number, say 1000] -> B (bolt) [receives the number and divides it into 10 emits of 100 each] -> C (bolt) [receives these emits and divides each into 10 emits of 10 numbers] -> D (bolt) [does some computation on the numbers and emits one message] -> E (bolt) [aggregates all the data and confirms that all 1000 messages are processed]

Every bolt takes under 3 msec to complete, so I estimated that the end-to-end processing of the 1000 numbers should take no more than 50 msec, including any latencies.

*Observations*
1. The end-to-end time from spout A to bolt E is 200 msec to 3 seconds. My estimate was under 50 msec, given that each bolt and spout takes under 3 msec to execute, including any latencies.
2. Most of the time is spent between the emit from a spout/bolt and the execute() of the consuming bolt.
3. Network latency is under 0.5 msec.

I am not able to figure out why it takes so much time between a spout/bolt and the next bolt. I understand that the spout/bolt buffers the data into a queue and the subsequent bolt consumes from there.

*Infrastructure*
1. 5 VMs with 4 CPUs and 8 GB RAM each. Workers get 1024 MB, and there are 20 workers overall.

*Test*
1. The test was done with 25 messages to the spout => 25 messages are sent to the spout in a span of 5 seconds.

*Config values*
Config config = new Config();
config.put(Config.TOPOLOGY_WORKERS, 20);
config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_ACKER_EXECUTORS, 1);
config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);

Please let me know if you have encountered similar issues and any steps you have taken to mitigate the time taken between a spout/bolt and the next bolt.

Thanks
Kashyap

--
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate
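On Nick's question about measuring latency: a common sketch is to stamp each tuple with the producer's wall clock and take the delta in the consumer's execute(). The bolt name and the "emitted_at" field below are hypothetical, and, as Nick notes, cross-machine deltas include clock drift, so either keep the clocks synced (e.g., via NTP) or only compare timestamps taken on the same host (a spout-to-sink round trip measured at the spout):

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;
    import java.util.Map;

    // Assumes the upstream component emitted a field "emitted_at" set to
    // System.currentTimeMillis() at emit time.
    public class LatencyProbeBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple input) {
            long latencyMs = System.currentTimeMillis() - input.getLongByField("emitted_at");
            System.err.println("emit -> execute latency: " + latencyMs + " ms");
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal probe; declares no output stream.
        }
    }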
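Nathan's acker suggestion maps onto Kashyap's config roughly like this (a sketch; 20 matches his worker count, and simply not setting the acker count has the same effect, since it defaults to one acker per worker):

    import backtype.storm.Config;

    Config config = new Config();
    config.setNumWorkers(20);
    // One acker per worker instead of a single acker task funneling
    // all of the topology's ack traffic:
    config.setNumAckers(20);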
