Re: Storm sending rate

Walid Aljoby Fri, 25 Nov 2016 07:16:16 -0800

Hi Navin,
Thank you for the detailed explanation. Interesting! I have some questions, and
I was wondering if you could answer them inline, please.
- What is the intuition behind keeping the average latency per tuple within
2.5 seconds?
- Is the number of slots per node limited to 4?
- Actually, I use the WordCount topology, in which the sentences are stored
inside the spout function, and it is possible to use multiple instances of the
spout. However, I do not understand why reading the tuples from a queue is
better for fast emitting.

Sorry for the long questions.
--
Best Regards,
WA

      From: Navin Ipe <[email protected]>
 To: Walid Aljoby <[email protected]> 
Cc: "[email protected]" <[email protected]>
 Sent: Thursday, November 24, 2016 3:09 PM
 Subject: Re: Storm sending rate

Ideally, each time nextTuple is called, you should emit only one tuple.
Of course, you can emit more than one, but then it is better to monitor the
latency and emit only as many tuples as can be acked within a latency of
2.5 seconds.
- Make sure you have enough workers.
- Increase TOPOLOGY_MESSAGE_TIMEOUT_SECS.
- Increase the values passed to stormConfig.setNumWorkers(someNumber); and
stormConfig.setNumAckers(someNumber);
- Each Storm node will have 4 slots, which can handle 4 workers, so create as
many workers as you have slots. Slots = number of nodes * 4. If you have more
workers than slots, then Storm will have to handle more than one worker on a
single slot, which will be a little slower.
- Having number of workers = number of tasks (number of spouts and bolts) also
helps to avoid lags.
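
The sizing rule above can be written out as a small calculation. This is only a
sketch in plain Java, not part of the Storm API: WorkerSizing, totalSlots and
suggestedWorkers are hypothetical names, and taking the smaller of "slots" and
"tasks" is one assumed way to combine the two suggestions.

```java
// Hypothetical helper for illustration only; not a Storm class.
final class WorkerSizing {
    // Slots per node as stated in this thread (4 per node).
    static final int SLOTS_PER_NODE = 4;

    // Slots = number of nodes * slots per node.
    static int totalSlots(int nodes, int slotsPerNode) {
        if (nodes < 0 || slotsPerNode < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return nodes * slotsPerNode;
    }

    // Assumption: never ask for more workers than slots, and never for
    // more workers than tasks (spouts plus bolts).
    static int suggestedWorkers(int nodes, int slotsPerNode, int tasks) {
        if (tasks < 0) {
            throw new IllegalArgumentException("tasks must be non-negative");
        }
        return Math.min(totalSlots(nodes, slotsPerNode), tasks);
    }
}
```

For example, 3 nodes with 4 slots each give 12 slots, so with 20 tasks the
sketch suggests 12 workers, and with 5 tasks it suggests 5. The result would be
the number passed to stormConfig.setNumWorkers(...).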

If you really want to increase the number of emits dramatically, then use a
separate program to put objects into a queue such as RabbitMQ or any of the
other message queues available. Then, create multiple spout instances which
read from this queue and emit. This way, you'll have multiple spouts emitting
tuples, and you can have multiple bolts which take tuples from these spouts and
process the data.
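
The queue-fed layout can be illustrated without Storm or RabbitMQ. In this
rough sketch an in-memory BlockingQueue stands in for the external queue and
plain threads stand in for spout instances; QueueFedEmitters and
drainWithEmitters are hypothetical names, and the counter increment is only a
placeholder for a real emit call.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not Storm or RabbitMQ code.
final class QueueFedEmitters {
    // Loads the sentences into a shared queue, lets `emitters` threads
    // drain it concurrently, and returns how many items were "emitted".
    static int drainWithEmitters(List<String> sentences, int emitters) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(sentences);
        AtomicInteger emitted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(emitters);
        for (int i = 0; i < emitters; i++) {
            pool.submit(() -> {
                // Each emitter takes the next available item, so no item
                // is handed to two emitters.
                while (queue.poll() != null) {
                    emitted.incrementAndGet(); // placeholder for an emit
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("emitters did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return emitted.get();
    }
}
```

Because every emitter pulls from the same queue, adding emitters spreads the
work without any sentence being emitted twice, which is the point of feeding
several spout instances from one external queue.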

On Thu, Nov 24, 2016 at 11:02 AM, Walid Aljoby <[email protected]> wrote:

Hi Navin,
Yes, by sending rate I meant the rate of outgoing tuples from the spout, which
represents the data source, to the computation bolts. The question is about
tuning the relevant parameters to increase the rate at which the spout emits
tuples. Actually, I tried different values for max spout pending, but there was
not much improvement in the application throughput. Hence, I asked whether
other parameters affect the speed of emitting tuples.
Thank you and Regards,
--WA

      From: Navin Ipe <navin.ipe@searchlighthealth.com>
 To: [email protected]; Walid Aljoby <[email protected]> 
 Sent: Thursday, November 24, 2016 12:54 PM
 Subject: Re: Storm sending rate

Please remember that we cannot read your mind. A little more elaboration on 
what problem you are facing and what you mean by "sending rate" would help.

On Wed, Nov 23, 2016 at 5:56 PM, Walid Aljoby <[email protected]> wrote:

Hi everyone,
Could anyone with experience explain the factors affecting the sending rate in
Storm?

Thank you
--
Regards,
WA

-- 
Regards, Navin