No, I have a config parameter that changes how many random numbers are
generated by the bolt's execute method, to simulate a heavier task. The
total number of messages is controlled by another parameter, which I keep
fixed across my experiments.
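
To give a concrete picture, the knob works roughly like the sketch below
(the class name and the "work.per.tuple" parameter name are illustrative,
not my exact code; the emit logic is stripped out here):

    import java.util.Map;
    import java.util.Random;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class WorkloadBolt extends BaseRichBolt {
        private OutputCollector collector;
        private Random random;
        private int workPerTuple;   // how many random numbers to generate per tuple

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.random = new Random();
            // the config parameter that simulates a heavier task
            this.workPerTuple = ((Number) conf.get("work.per.tuple")).intValue();
        }

        @Override
        public void execute(Tuple input) {
            // more random numbers per tuple = more CPU work per message
            for (int i = 0; i < workPerTuple; i++) {
                random.nextDouble();
            }
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // nothing declared in this stripped-down sketch
        }
    }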
On 26/07/2015 07:09 PM, Enno Shioji wrote:
I mean, could it be that by mistake you are generating more messages as
you change the config, so the total test time just makes it look as if
there is no improvement?
On Sun, Jul 26, 2015 at 5:08 PM, Enno Shioji <[email protected]> wrote:
This may be a silly guess, but you are not simply generating
proportionally more messages as you change the config, right?
On Sun, Jul 26, 2015 at 4:53 PM, Dimitris Sarlis <[email protected]> wrote:
Kashyap,
I put a logger before and after emit() in each bolt. In the spouts it's
not so easy, because I'm using the predefined class KafkaSpout. See the
attached images from a test execution. I used 1 spout with parallelism 8
and 4 bolts with parallelism 2. I also include a screenshot from a bolt's
log where you can see messages like "Sending record mpla mpla" and
"After emit". These messages are written before and after each emit in a
bolt.
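
For reference, the instrumentation in each bolt is essentially the
following (a fragment, not complete code; LOG, collector and targetTask
stand for the bolt's existing fields, and the names are illustrative):

    @Override
    public void execute(Tuple input) {
        String record = input.getString(0);
        LOG.info("Sending record " + record);    // written just before the emit
        collector.emitDirect(targetTask, input, new Values(record + "!"));
        LOG.info("After emit");                  // written right after the emit
        collector.ack(input);
    }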
Dimitris
On 26/07/2015 06:24 PM, Kashyap Mhaisekar wrote:
Can you put loggers before and after emit() in each bolt/spout?
Can you share Storm UI screenshots?
Thanks
Kashyap
On Sun, Jul 26, 2015, 10:08 Dimitris Sarlis <[email protected]> wrote:
Hi Harsha,
1. The number of topic partitions is always set to the total number of
spouts I'm using.
2. I have checked that data from the Kafka producer is distributed across
all of these partitions.
3. I've tried values from 4 to 20.
4. 1000
5. This topology is just for some testing. Spouts get data from Kafka and
dispatch it to the bolts. There, if a record has not been processed
before, the bolt generates some random numbers and then selects another
bolt to send the record to, with a "!" appended. If the record has been
processed before (it ends with a "!"), the bolt just generates some
random numbers. (A rough sketch of this follows after the list.)
6. No
7. No
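
To make point 5 concrete, here is a rough sketch of what each bolt does
(class and field names, and the component ids "bolt-1".."bolt-4", are
illustrative simplifications of my actual code):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class TestBolt extends BaseRichBolt {
        private OutputCollector collector;
        private List<Integer> boltTasks;   // task ids of all bolt instances
        private Random random;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.random = new Random();
            // collect the task ids of every bolt component; the ids below are
            // illustrative and just need to match the topology wiring
            this.boltTasks = new ArrayList<Integer>();
            for (String component : new String[]{"bolt-1", "bolt-2", "bolt-3", "bolt-4"}) {
                boltTasks.addAll(context.getComponentTasks(component));
            }
        }

        @Override
        public void execute(Tuple input) {
            String record = input.getString(0);
            // generate some random numbers to simulate work
            for (int i = 0; i < 100; i++) {
                random.nextDouble();
            }
            if (!record.endsWith("!")) {
                // not processed before: append "!" and forward to a randomly
                // chosen bolt task via the direct stream
                int target = boltTasks.get(random.nextInt(boltTasks.size()));
                collector.emitDirect(target, input, new Values(record + "!"));
            }
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // the stream has to be declared direct in order to use emitDirect
            declarer.declare(true, new Fields("record"));
        }
    }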
Dimitris
On 26/07/2015 05:52 PM, Harsha wrote:
> Hi Dimitris,
>
> 1. how many topic partitions do you have?
> 2. make sure you are distributing data from the Kafka producer side into
> all of these partitions
> 3. what's your KafkaSpout parallelism set to?
> 4. what's your topology.max.spout.pending set to?
> 5. if you can, briefly describe what the topology is doing.
> 6. are you seeing anything under the failed column in Storm UI?
> 7. any errors in the Storm topology logs?
>
> Thanks,
> Harsha
>
> On Sat, Jul 25, 2015, at 05:29 AM, Dimitris Sarlis wrote:
>> Hi all,
>>
>> I'm trying to run a topology in Storm and I am facing some scalability
>> issues. Specifically, I have a topology where KafkaSpouts read from a
>> Kafka queue and emit messages to bolts, which are connected with each
>> other through directGrouping (each bolt is connected with itself as
>> well as with each of the other bolts). The bolts subscribe to the
>> spouts with shuffleGrouping. I observe that when I increase the number
>> of spouts and bolts proportionally, I don't get the speedup I'm
>> expecting. In fact, my topology seems to run slower: for the same
>> amount of data, it takes more time to complete. For example, when I
>> increase the spouts from 4 to 8 and the bolts from 4 to 8, it takes
>> longer to process the same amount of Kafka messages.
>>
>> Any ideas why this is happening? Thanks in advance.
>>
>> Best,
>> Dimitris Sarlis
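
For reference, the wiring described in my first mail looks roughly like
this (the component names, Kafka/ZooKeeper settings and the LocalCluster
submit are illustrative; TestBolt is the bolt sketched earlier in this
thread):

    import java.util.HashMap;
    import java.util.Map;

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.BoltDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;

    public class ScalingTestTopology {
        public static void main(String[] args) {
            // KafkaSpout reading the test topic (addresses and ids are illustrative)
            SpoutConfig spoutConf = new SpoutConfig(
                    new ZkHosts("localhost:2181"), "test-topic", "/kafka-spout", "scaling-test");
            spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("spout", new KafkaSpout(spoutConf), 8);   // spout parallelism 8

            // 4 bolt components with parallelism 2 each; every bolt subscribes
            // to the spout with shuffleGrouping ...
            String[] bolts = {"bolt-1", "bolt-2", "bolt-3", "bolt-4"};
            Map<String, BoltDeclarer> declarers = new HashMap<String, BoltDeclarer>();
            for (String b : bolts) {
                declarers.put(b, builder.setBolt(b, new TestBolt(), 2).shuffleGrouping("spout"));
            }
            // ... and to every bolt (itself included) with directGrouping
            for (String b : bolts) {
                for (String other : bolts) {
                    declarers.get(b).directGrouping(other);
                }
            }

            Config conf = new Config();
            conf.setMaxSpoutPending(1000);   // topology.max.spout.pending

            new LocalCluster().submitTopology("scaling-test", conf, builder.createTopology());
        }
    }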