Also, I would like to ask: when the target application demands ultra-low
latency and cannot tolerate any messaging delay, do you think setting the
batch size to 1 can help? According to the source code, the queue is
flushed either when its size reaches a predefined batch size or after a
flush interval elapses. So, to reduce the queueing delay, I tried setting
the batch size to 1 and the flush interval to 1 ms. I also changed the
wait strategy to yield.
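
For reference, here is a minimal sketch of what that tuning could look like
in storm.yaml. The key names below are my reading of the Storm 1.x
defaults.yaml and they vary across versions (the wait-strategy key in
particular was renamed or removed in some releases), so please verify them
against your version's defaults.yaml before relying on this:

```yaml
# Assumed Storm 1.x key names -- verify against your version's defaults.yaml.
topology.disruptor.batch.size: 1            # flush after every tuple (default 100)
topology.disruptor.batch.timeout.millis: 1  # flush interval in milliseconds
```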

Any other suggestions to reduce the messaging delay introduced inside Storm?

On Wed, Jan 31, 2018 at 10:20 PM, Wuyang Zhang <[email protected]>
wrote:

> Hi Jungtaek and Roshan,
>
> Thank you for your replies and the suggestions.
>
> I just did a more detailed evaluation for this messaging delay issue.
>
> 1. Using ping and iperf, I measured an RTT of 0.193 ms and a bandwidth of
> 934 Mbps.
> 2. Using scp, transmitting the image m0.png (1.7 MB) took 0.072 s.
> 3. I concatenated two copies of m0.png into a single byte array of length
> 9331200 (8.89 MB).
> 4. Using a UDP socket with a 1024-byte buffer, transmitting the 8.89 MB
> array took 80 ms.
> 5. Sending the same 8.89 MB array from the spout to a bolt took 93 ms.
>
> So Storm messaging introduces roughly 10 ms of additional delay.
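
For anyone reproducing step 4, here is a rough send-side sketch (the port
number and payload are placeholders; with no receiver attached, UDP gives
no delivery guarantee, so this only measures how fast the sender can push
datagrams onto the wire):

```python
import socket
import time


def send_udp(data: bytes, host: str, port: int, chunk: int = 1024) -> float:
    """Send data in chunk-sized UDP datagrams; return send-side elapsed seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    start = time.perf_counter()
    for off in range(0, len(data), chunk):
        sock.sendto(data[off:off + chunk], (host, port))
    elapsed = time.perf_counter() - start
    sock.close()
    return elapsed


payload = bytes(9331200)  # same size as the two-image byte array (8.89 MB)
t = send_udp(payload, "127.0.0.1", 9999)
print(f"sent {len(payload)} bytes in {t * 1000:.1f} ms")
```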
>
> I probably got the earlier, much higher numbers because of the heavy
> computing overhead, which caused tuples to be buffered in the queue.
>
> Thanks a lot for helping me find this. I will look into encoding/decoding
> the images to reduce the transmission latency.
>
> Best Regards,
> Wuyang
>
>
> On Wed, Jan 31, 2018 at 7:18 PM, Roshan Naik <[email protected]>
> wrote:
>
>> Continuing with Jungtaek’s line of thinking… I would also like to know:
>>
>> - What latencies are you able to achieve when directly transmitting a
>> 5 MB image between two nodes (not using Storm)? And similarly, between
>> two processes on the same node?
>>
>> - And how are you measuring it?
>>
>> -roshan
>>
>>
>>
>> *From: *Jungtaek Lim <[email protected]>
>> *Reply-To: *"[email protected]" <[email protected]>
>> *Date: *Wednesday, January 31, 2018 at 3:38 PM
>> *To: *"[email protected]" <[email protected]>
>> *Subject: *Re: Apache Storm High Messaging Delay When Passing 5MB Images
>>
>>
>>
>> By "not easily seen" I meant "not exposed": overheads that are easy to
>> overlook.
>>
>>
>>
>> 2018년 2월 1일 (목) 오전 8:36, Jungtaek Lim <[email protected]>님이 작성:
>>
>> I'm not clear whether you're saying that the message transfer for each bolt
>> took 200 ms, or that the sum of 4 or 5 network-transfer latencies was 200 ms.
>>
>>
>>
>> The reason I ask is that if it's the latter, with 5 network transfers, that
>> is the ideal latency in theory: 1 Gbps is 125 MB/s (1000 Mbps, not 1024
>> Mbps), and 5 MB / 125 MB/s = 40 ms per transfer. (I'd rather suspect how it
>> was even possible in this case.)
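
The arithmetic above can be checked directly:

```python
# Back-of-envelope check of the ideal per-hop transfer time quoted above.
link_bps = 1_000_000_000         # 1 Gbps link, decimal (1000 Mbps, not 1024)
link_bytes_per_s = link_bps / 8  # 125 MB/s
image_bytes = 5 * 1000 * 1000    # 5 MB image

per_hop_ms = image_bytes / link_bytes_per_s * 1000
print(per_hop_ms)      # 40.0 ms per transfer
print(5 * per_hop_ms)  # 200.0 ms for five transfers
```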
>>
>>
>>
>> Even with 4 network transfers, we need to take into account that the
>> latency calculation above is theoretical, and there are many overheads
>> besides messaging that are not easily seen, so the latency may not come
>> from messaging overhead at all.
>>
>>
>>
>> If it took 200 ms for each transfer, that would be something worth
>> discussing. Please let me know which is your case.
>>
>>
>>
>> Thanks,
>>
>> Jungtaek Lim (HeartSaVioR)
>>
>>
>>
>> 2018년 2월 1일 (목) 오전 8:11, Wuyang Zhang <[email protected]>님이 작성:
>>
>> I am experimenting with Apache Storm for a real-time image-processing
>> application that requires ultra-low latency. In the topology definition, a
>> single spout emits a raw image (5 MB) every 1 s and a few bolts process it.
>> The processing latency of each bolt is acceptable, and the overall
>> computing delay is around 150 ms.
>>
>> *However, I find that the message-passing delay between workers on
>> different nodes is really high. Across the 5 successive bolts it totals
>> around 200 ms.* To calculate this delay, I subtract all the task latencies
>> from the end-to-end latency. Moreover, I implemented a timer bolt with
>> which the other processing bolts register to record a timestamp before
>> starting the real processing. Comparing the timestamps of the bolts
>> confirms that the delay between bolts is as high as I previously noticed.
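
To make that bookkeeping concrete, here is a sketch of the subtraction with
illustrative numbers only (the 350 ms end-to-end figure and the per-bolt
latencies are assumptions chosen to match the ~150 ms compute / ~200 ms
messaging split described here):

```python
# Illustrative numbers only: the end-to-end latency and per-bolt execute
# latencies below are assumptions, not measurements from the thread.
end_to_end_ms = 350.0
task_latencies_ms = [40.0, 35.0, 30.0, 25.0, 20.0]  # five bolts, sum = 150 ms

messaging_ms = end_to_end_ms - sum(task_latencies_ms)
print(messaging_ms)  # 200.0 -> delay attributed to inter-worker messaging

per_hop_gap_ms = messaging_ms / 5
print(per_hop_gap_ms)  # 40.0 ms average gap between successive bolts
```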
>>
>> To find the source of this additional delay, I first set the emit interval
>> to 1 s, so there should be no queuing delay caused by the heavy computing
>> overhead. Also, the Storm UI shows that no bolt has high CPU utilization.
>>
>> Then I checked the network delay. I am using a 1 Gbps testbed and measured
>> its RTT and bandwidth; the network latency should not be this high for
>> sending a 5 MB image.
>>
>> Finally, I am considering buffer delay. I found that each thread maintains
>> its own send buffer and transfers the data to the worker's send buffer. I
>> am not sure how long it takes before the receiving bolt gets the message.
>> As suggested by the community, I increased the sender/receiver buffer sizes
>> to 16384 and set STORM_NETTY_MESSAGE_BATCH_SIZE to 32768. However, it did
>> not help.
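
For completeness, those settings correspond roughly to the following
storm.yaml entries. The key names are my best guess for Storm 1.x (I
believe the Config constant STORM_NETTY_MESSAGE_BATCH_SIZE maps to
storm.messaging.netty.transfer.batch.size), so please double-check them
against your version's defaults.yaml:

```yaml
# Assumed Storm 1.x key names -- verify before use.
topology.executor.receive.buffer.size: 16384
topology.executor.send.buffer.size: 16384
storm.messaging.netty.transfer.batch.size: 32768
```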
>>
>> *My question is: how can I remove or reduce the messaging overhead between
>> bolts (across workers)?* Is it possible to synchronize the communication
>> between bolts so that the receiver gets messages immediately, without any
>> delay?
>>
>>
>>
>>
>>
>>
>
