What does the emit look like with the direct grouping? Are you changing the number of tasks you emit to?

On Jul 26, 2015 6:39 PM, "Dimitris Sarlis" <[email protected]> wrote:
> Kashyap,
>
> I put loggers before and after emit in each bolt. In spouts it's not so
> easy because I'm using the predefined class KafkaSpout. See the attached
> images from a test execution. I used 1 spout with parallelism 8 and 4 bolts
> with parallelism 2. I also include a screenshot from a bolt's log where you
> can see messages like: "Sending record mpla mpla" and "After emit". These
> messages are written before and after each emit in a bolt.
>
> Dimitris
>
> On 26/07/2015 06:24 PM, Kashyap Mhaisekar wrote:
>
> Can you put loggers before and after emit() in each bolt/spout?
>
> Can you share Storm UI screenshots?
>
> Thanks
> Kashyap
>
> On Sun, Jul 26, 2015, 10:08 Dimitris Sarlis <[email protected]> wrote:
>
>> Hi Harsha,
>>
>> 1. The number of topic partitions is always set to the total number
>> of spouts I'm using.
>> 2. I have checked that data from the Kafka producer are distributed into
>> all of these partitions.
>> 3. I've tried from 4 to 20.
>> 4. 1000
>> 5. This topology is just for some testing. Spouts get data from Kafka
>> and then dispatch them to bolts. There, if the record has not been
>> processed before, each bolt generates some random numbers and then
>> selects another bolt to send the record to, appended with a "!". If the
>> record has been processed before (it has a "!" at the end), then the bolt
>> just generates some random numbers.
>> 6. No
>> 7. No
>>
>> Dimitris
>>
>> On 26/07/2015 05:52 PM, Harsha wrote:
>> > Hi Dimitris,
>> >
>> > 1. how many topic partitions do you have
>> > 2. make sure you are distributing data from the Kafka producer side into all
>> > of these partitions
>> > 3. what's your KafkaSpout parallelism set to
>> > 4. what's your topology.max.spout.pending set to
>> > 5. if you can, briefly describe what the topology is doing.
>> > 6. are you seeing anything under the failed column in Storm UI.
>> > 7. any errors in storm topology logs.
>> >
>> > Thanks,
>> > Harsha
>> >
>> > On Sat, Jul 25, 2015, at 05:29 AM, Dimitris Sarlis wrote:
>> >> Hi all,
>> >>
>> >> I'm trying to run a topology in Storm and I am facing some scalability
>> >> issues. Specifically, I have a topology where KafkaSpouts read from a
>> >> Kafka queue and emit messages to bolts which are connected with each
>> >> other through directGrouping. (Each bolt is connected with itself as
>> >> well as with each one of the other bolts). Bolts subscribe to spouts
>> >> with shuffleGrouping. I observe that when I increase the number of
>> >> spouts and bolts proportionally, I don't get the speedup I'm
>> >> expecting. In fact, my topology seems to run slower and for the same amount of
>> >> data, it takes more time to complete. For example, when I increase
>> >> spouts from 4->8 and bolts from 4->8, it takes longer to process the
>> >> same amount of Kafka messages.
>> >>
>> >> Any ideas why this is happening? Thanks in advance.
>> >>
>> >> Best,
>> >> Dimitris Sarlis
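
For readers following the thread, the routing described in point 5 of Dimitris's reply can be sketched as a small standalone simulation. This is not Storm code and not the poster's implementation. The function name `simulate`, the uniform random pick used to stand in for shuffleGrouping, and the uniform random choice of the target bolt are all assumptions made for illustration; the thread does not say how a bolt selects the next bolt.

```python
import random
from collections import Counter, deque


def simulate(records, num_bolts, seed=0):
    """Count bolt executions for the routing described in the thread.

    Hypothetical model: bolts are numbered 0..num_bolts-1 and every
    routing decision is a uniform random pick (an assumption).
    """
    rng = random.Random(seed)
    executions = Counter()

    # Spout -> bolt hop: each record lands on some bolt.
    pending = deque((rng.randrange(num_bolts), r) for r in records)

    while pending:
        bolt, record = pending.popleft()
        executions[bolt] += 1  # the bolt "generates some random numbers"

        if not record.endswith("!"):
            # First visit: mark the record and send it directly to a
            # chosen bolt, which may be the same bolt (self-connection).
            target = rng.randrange(num_bolts)
            pending.append((target, record + "!"))
        # Records already ending in "!" are not forwarded again.

    return executions


if __name__ == "__main__":
    counts = simulate([f"msg-{i}" for i in range(1000)], num_bolts=4)
    print(sum(counts.values()), dict(counts))
```

Under this model every unmarked record causes exactly two bolt executions, so 1,000 fresh records produce 2,000 executions no matter how many bolts there are. Only the distribution across bolts changes with `num_bolts`.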
