Hi Harsha,
1. The number of topic partitions is always set equal to the total
number of spouts I'm using.
2. I have checked that data from the Kafka producer is distributed
across all of these partitions.
3. I've tried values from 4 to 20.
4. 1000
5. This topology is just for some testing. Spouts read data from Kafka
and dispatch it to bolts. If a record has not been processed before,
the receiving bolt generates some random numbers, appends a "!" to the
record, and selects another bolt to send it to. If the record has been
processed before (it ends with a "!"), the bolt just generates some
random numbers and does not forward it.
6. No
7. No
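The per-record logic in item 5 can be sketched as a plain Java method; the class and method names here are illustrative (not from the real topology), and the Storm plumbing (emitDirect to a randomly chosen bolt task) is omitted:

```java
import java.util.Random;

/** Sketch of the bolt logic from item 5; names are illustrative. */
public class RecordLogic {
    private static final Random RNG = new Random();

    /**
     * Returns the record to forward to another bolt, or null if the
     * record was already processed (ends with "!") and should not be
     * re-emitted.
     */
    public static String processRecord(String record) {
        // Simulate work: generate some random numbers in either case.
        for (int i = 0; i < 10; i++) {
            RNG.nextInt();
        }
        if (record.endsWith("!")) {
            // Already processed: random-number work only, no forwarding.
            return null;
        }
        // First pass: append "!" and hand back for forwarding.
        return record + "!";
    }
}
```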
Dimitris
On 26/07/2015 05:52 PM, Harsha wrote:
Hi Dimitris,
1. How many topic partitions do you have?
2. Make sure you are distributing data from the Kafka producer side
into all of these partitions.
3. What is your KafkaSpout parallelism set to?
4. What is your topology.max.spout.pending set to?
5. If you can, briefly describe what the topology is doing.
6. Are you seeing anything under the Failed column in Storm UI?
7. Any errors in the Storm topology logs?
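For reference, question 4's setting caps the number of unacked tuples in flight per spout task; a minimal configuration fragment, assuming Storm's Config class (equivalently, topology.max.spout.pending in the YAML config):

```java
import org.apache.storm.Config;

// Cap unacked tuples in flight per spout task; 1000 matches the
// value Dimitris reports in his answer.
Config conf = new Config();
conf.setMaxSpoutPending(1000);
```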
Thanks,
Harsha
On Sat, Jul 25, 2015, at 05:29 AM, Dimitris Sarlis wrote:
Hi all,
I'm trying to run a topology in Storm and I am facing some scalability
issues. Specifically, I have a topology where KafkaSpouts read from a
Kafka queue and emit messages to bolts, which are connected with each
other through directGrouping (each bolt is connected with itself as
well as with each of the other bolts). The bolts subscribe to the
spouts with shuffleGrouping. I observe that when I increase the number
of spouts and bolts proportionally, I don't get the speedup I expect.
In fact, my topology seems to run slower: for the same amount of data,
it takes more time to complete. For example, when I increase the
spouts from 4 to 8 and the bolts from 4 to 8, it takes longer to
process the same amount of Kafka messages.
Any ideas why this is happening? Thanks in advance.
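To make the wiring concrete, a sketch of the topology described above, assuming Storm's TopologyBuilder API; the component names, WorkerBolt, and buildKafkaSpout() are hypothetical, and this is a wiring fragment rather than a runnable program (it needs a Storm cluster and the Kafka spout dependency):

```java
// Hypothetical wiring sketch; "kafka-spout" and "worker-bolt" are
// illustrative component names.
TopologyBuilder builder = new TopologyBuilder();

int parallelism = 8; // spouts and bolts scaled together, e.g. 4 -> 8

builder.setSpout("kafka-spout", buildKafkaSpout(), parallelism);

// Each bolt receives from the spouts via shuffleGrouping, and from
// the bolt component itself (i.e. every bolt instance, including
// itself) via directGrouping.
builder.setBolt("worker-bolt", new WorkerBolt(), parallelism)
       .shuffleGrouping("kafka-spout")
       .directGrouping("worker-bolt");
```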
Best,
Dimitris Sarlis