Hi Navin,
Thank you for the detailed explanation. Interesting! I have some questions, and
I was wondering if you could answer them inline, please.
- What is the intuition behind keeping the average latency per tuple within 2.5
seconds?
- Is the number of slots per node limited to 4?
- Actually, I use the WordCount topology, which has some sentences stored inside
the spout function, and it is possible to use multiple instances of the spout.
However, I could not understand why reading the tuples from a queue is better
for fast emitting.
Sorry for the long questions.
--
Best Regards,
WA
From: Navin Ipe <[email protected]>
To: Walid Aljoby <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Thursday, November 24, 2016 3:09 PM
Subject: Re: Storm sending rate
Ideally, each time nextTuple is called, you should emit only one tuple.
Of course, you can emit more than one, but then it would be better to monitor
the latency and emit only as many tuples as can be ack'ed within a latency
of 2.5 seconds.
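A rough sketch of the "emit only as many as can be ack'ed in time" idea, in plain Java with no Storm dependency. The class name EmitBudget, the halving/increment policy, and the maxBatch cap are assumptions made for illustration, not anything Storm provides. In a real spout, the latency would presumably be measured between the emit call and the matching ack callback.

```java
/** Hypothetical helper: decides how many tuples to emit per nextTuple() call. */
final class EmitBudget {
    private final long targetLatencyMs; // e.g. 2500 ms, the figure mentioned above
    private final int maxBatch;         // assumed upper bound on tuples per call
    private int batch = 1;              // start with one tuple per call

    EmitBudget(long targetLatencyMs, int maxBatch) {
        if (targetLatencyMs <= 0 || maxBatch < 1) {
            throw new IllegalArgumentException("target must be > 0 and maxBatch >= 1");
        }
        this.targetLatencyMs = targetLatencyMs;
        this.maxBatch = maxBatch;
    }

    /** Feed in the measured emit-to-ack latency of one completed tuple. */
    void onAck(long observedLatencyMs) {
        if (observedLatencyMs > targetLatencyMs) {
            // Acks are arriving too slowly: halve the batch, never below one.
            batch = Math.max(1, batch / 2);
        } else if (batch < maxBatch) {
            // Acks are within the target: cautiously emit one more per call.
            batch++;
        }
    }

    /** Number of tuples the spout may emit on its next nextTuple() call. */
    int tuplesToEmit() {
        return batch;
    }
}
```

The spout would call tuplesToEmit() at the start of nextTuple and onAck(...) from its ack handler; the slow-down is deliberately faster than the speed-up so that a backlog drains quickly.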
- Make sure you have enough workers.
- Increase TOPOLOGY_MESSAGE_TIMEOUT_SECS.
- Increase the values passed to stormConfig.setNumWorkers(someNumber); and
stormConfig.setNumAckers(someNumber);
Each Storm node has 4 slots by default, which can handle 4 workers, so create as
many workers as you have slots. Slots = number of nodes * 4. If you have more
workers than slots, then Storm will have to handle more than one worker on a
single slot, which will be a little slower.
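The slot arithmetic above can be written down as a small check. This is only a sketch: SlotPlanner and its method names are made up for illustration, and the 4 slots per node is the figure from this message, so it is passed as a parameter in case your supervisors are configured differently.

```java
/** Hypothetical helper for sizing the worker count against available slots. */
final class SlotPlanner {
    // Figure taken from the message above; treat it as an assumption about your cluster.
    static final int DEFAULT_SLOTS_PER_NODE = 4;

    /** Slots = number of nodes * slots per node. */
    static int totalSlots(int nodes, int slotsPerNode) {
        if (nodes < 0 || slotsPerNode < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return Math.multiplyExact(nodes, slotsPerNode);
    }

    /** Never ask for more workers than there are slots to run them in. */
    static int workersToRequest(int desiredWorkers, int nodes, int slotsPerNode) {
        return Math.min(desiredWorkers, totalSlots(nodes, slotsPerNode));
    }
}
```

For example, with 3 nodes, workersToRequest(20, 3, SlotPlanner.DEFAULT_SLOTS_PER_NODE) gives 12, and that value is what would be passed to stormConfig.setNumWorkers.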
Having the number of workers equal to the number of tasks (number of spouts and
bolts) also helps to avoid lags.
If you really want to increase the number of emits phenomenally, then use a
separate program to put objects into a queue like RabbitMQ or any of the other
queue programs available. Then, create multiple spout instances which will read
from this queue and emit. This way, you'll have multiple spouts emitting
tuples, and you can have multiple bolts which take tuples from these spouts and
process the data.
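To illustrate the shape of that setup without any external dependency, here is a minimal plain-Java simulation. An in-memory BlockingQueue stands in for RabbitMQ, and each reader thread stands in for one spout instance; the class name QueueFedEmitters, the stop marker, and the counter in place of a real emit are all assumptions for the sketch.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical simulation: one producer fills a queue, several readers drain it. */
final class QueueFedEmitters {
    // Unique stop marker, compared by identity so it can never match real data.
    private static final String STOP = new String("__stop__");

    /** Returns how many sentences were "emitted" by all readers together. */
    static int run(int readers, List<String> sentences) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger emitted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(readers);

        // Each reader plays the role of one spout instance.
        for (int i = 0; i < readers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String sentence = queue.take();
                        if (sentence == STOP) {
                            return;
                        }
                        emitted.incrementAndGet(); // stand-in for emitting a tuple
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            // The "separate program": it only puts objects into the queue.
            for (String s : sentences) {
                queue.put(s);
            }
            for (int i = 0; i < readers; i++) {
                queue.put(STOP); // one stop marker per reader
            }
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("readers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while feeding the queue", e);
        }
        return emitted.get();
    }
}
```

The point of the design is that the readers share one source and each item is taken by exactly one of them, so adding readers raises the total emit rate without duplicating data.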
On Thu, Nov 24, 2016 at 11:02 AM, Walid Aljoby <[email protected]> wrote:
Hi Navin,
Yes, by the sending rate I meant the tuples going out from the spout, as the
representative of the data source, to the computation bolts. The question is
about tuning the relevant parameters to increase the spout's emit rate.
Actually, I tried different values for max spout pending, but saw little
improvement in the application throughput. Hence, I asked whether other
parameters affect the speed of emitting tuples.
Thank you and regards,
--
WA
From: Navin Ipe <navin.ipe@searchlighthealth.com>
To: [email protected]; Walid Aljoby <[email protected]>
Sent: Thursday, November 24, 2016 12:54 PM
Subject: Re: Storm sending rate
Please remember that we cannot read your mind. A little more elaboration on
what problem you are facing and what you mean by "sending rate" would help.
On Wed, Nov 23, 2016 at 5:56 PM, Walid Aljoby <[email protected]> wrote:
Hi everyone,
Does anyone have experience with the factors affecting the sending rate in
Storm, and could you explain them?
Thank you
--
Regards,
WA
--
Regards,
Navin