You can work out the parallelism of each bolt from two numbers: the throughput required to keep up with the incoming message rate at the spout/Kafka, and the bolt's process latency.
For example, suppose messages arrive on the Kafka topic at X messages/sec, and assume that is the throughput the Storm topology must sustain. Assume for a moment that there is only one bolt in the pipeline, with a process latency of B ms. A single executor of that bolt can then handle about 1000/B tuples per second, so the minimum parallelism the bolt needs for the spout not to fall behind is X/(1000/B). At that parallelism the bolt's capacity will be ~1, meaning its executors are busy nearly all the time.

Since a real pipeline has more bolts, set the parallelism higher than X/(1000/B), and make it a multiple of the number of workers (adding workers if needed to handle the parallelism). The same logic applies to each of the remaining bolts.

On Mon, Aug 3, 2015 at 4:57 PM, Xunyun Liu <[email protected]> wrote:

> Thank you, Derek. I thought that the size of circle would indicate the
> proper parallelism hint for this component, but now it appears that is not
> the case.
>
> If so, how would I determine the number of executors and tasks for each
> component? I know looking at capacity is a good starting point, but with
> only capacity information it feels like that the decision process could be
> very time-consuming and cumbersome, which is the reason why I looked into
> the topology visualization hoping to get some hints from it. If the
> visualization part is a dead end, is there any other indications beside
> capacity or general rule of thumb that I can make use of?
>
> Thank you for your precious time.
>
> On 4 August 2015 at 00:33, Derek Dagit <[email protected]> wrote:
>
>> Actually, I believe the size of the circle is determined by the length of
>> the string that is rendered in it, and it is not due to an other property
>> or metric of the topology.
>>
>> There is room for improvement to the visualization.
>>
>> --
>> Derek
>>
>> ________________________________
>> From: Xunyun Liu <[email protected]>
>> To: [email protected]
>> Sent: Monday, August 3, 2015 12:05 AM
>> Subject: What does the size of circle mean in the Topology Visualization
>>
>> Hi there,
>>
>> I found that the circles in the topology visualization have different
>> size, what does that mean exactly? Besides there is a case from the
>> visualization showing that the sum of ratios of stream could even be larger
>> than 1, is that a normal or just a program bug?
>>
>> Thank you for your time.
>>
>> Best Regards
>> Xunyun Liu
>
> --
> Best Regards.
> ======================================================
> Xunyun Liu
> The Cloud Computing and Distributed Systems (CLOUDS) Laboratory,
> The University of Melbourne

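P.S. The sizing rule at the top of this message can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything Storm provides: the function name, its parameters, and the sample numbers are all made up for the example, and it treats B as the average time one executor spends per tuple.

```python
import math


def min_bolt_parallelism(msgs_per_sec, latency_ms, num_workers=1):
    """Return (minimum, rounded) executor counts for one bolt.

    minimum: smallest parallelism at which the bolt keeps up, X / (1000 / B).
    rounded: that minimum rounded up to a multiple of num_workers.
    All names here are hypothetical; this is not a Storm API.
    """
    if msgs_per_sec < 0 or latency_ms <= 0 or num_workers < 1:
        raise ValueError("rate must be >= 0, latency > 0, workers >= 1")

    # X / (1000 / B) rewritten as X * B / 1000 to avoid an extra division.
    minimum = math.ceil(msgs_per_sec * latency_ms / 1000)

    # Round up so the executors spread evenly across the workers.
    rounded = math.ceil(minimum / num_workers) * num_workers
    return minimum, rounded


# Hypothetical numbers: 2000 msgs/sec, 4 ms per tuple, 3 workers.
minimum, rounded = min_bolt_parallelism(2000, 4, num_workers=3)
print(minimum, rounded)
```

With these made-up inputs the minimum is 8 executors, rounded up to 9 so that it is a multiple of the 3 workers. The minimum corresponds to capacity ~1, so in practice you would pick something above it and then check the capacity figure in the Storm UI.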