No, I have a config parameter that changes how many random numbers are
generated by the bolt's execute method, to simulate a heavier task. The
total number of messages is controlled by another parameter, which I keep
fixed across my experiments.
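
To give a concrete picture, the knob works roughly like the sketch below
(the class name and the "work.per.tuple" parameter name are illustrative,
not my exact code; the emit logic is stripped out here):

    import java.util.Map;
    import java.util.Random;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class WorkloadBolt extends BaseRichBolt {
        private OutputCollector collector;
        private Random random;
        private int workPerTuple;   // how many random numbers to generate per tuple

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.random = new Random();
            // the config parameter that simulates a heavier task
            this.workPerTuple = ((Number) conf.get("work.per.tuple")).intValue();
        }

        @Override
        public void execute(Tuple input) {
            // more random numbers per tuple = more CPU work per message
            for (int i = 0; i < workPerTuple; i++) {
                random.nextDouble();
            }
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // nothing declared in this stripped-down sketch
        }
    }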
On 26/07/2015 07:09 PM, Enno Shioji wrote:
I mean, could it be that by mistake you are generating more messages as
you change the config, so the total test time just makes it look as if
there is no improvement?
On Sun, Jul 26, 2015 at 5:08 PM, Enno Shioji <[email protected]> wrote:
This may be a silly guess, but you are not simply generating
proportionally more messages as you change the config, right?
On Sun, Jul 26, 2015 at 4:53 PM, Dimitris Sarlis <[email protected]> wrote:
Kashyap,
I put a logger before and after emit() in each bolt. In the spouts it's
not so easy, because I'm using the predefined class KafkaSpout. See the
attached images from a test execution. I used 1 spout with parallelism 8
and 4 bolts with parallelism 2. I also include a screenshot from a bolt's
log where you can see messages like "Sending record mpla mpla" and
"After emit". These messages are written before and after each emit in a
bolt.
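
For reference, the instrumentation in each bolt is essentially the
following (a fragment, not complete code; LOG, collector and targetTask
stand for the bolt's existing fields, and the names are illustrative):

    @Override
    public void execute(Tuple input) {
        String record = input.getString(0);
        LOG.info("Sending record " + record);    // written just before the emit
        collector.emitDirect(targetTask, input, new Values(record + "!"));
        LOG.info("After emit");                  // written right after the emit
        collector.ack(input);
    }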
Dimitris
On 26/07/2015 06:24 PM, Kashyap Mhaisekar wrote:
Can you put loggers before and after emit() in each bolt/spout?
Can you share Storm UI screenshots?
Thanks
Kashyap
On Sun, Jul 26, 2015, 10:08 Dimitris Sarlis <[email protected]> wrote:
Hi Harsha,
1. The number of topic partitions is always set to the total number of
spouts I'm using.
2. I have checked that data from the Kafka producer is distributed across
all of these partitions.
3. I've tried values from 4 to 20.
4. 1000
5. This topology is just for some testing. Spouts get data from Kafka and
dispatch it to the bolts. There, if a record has not been processed
before, the bolt generates some random numbers and then selects another
bolt to send the record to, with a "!" appended. If the record has been
processed before (it ends with a "!"), the bolt just generates some
random numbers. (A rough sketch of this follows after the list.)
6. No
7. No
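
To make point 5 concrete, here is a rough sketch of what each bolt does
(class and field names, and the component ids "bolt-1".."bolt-4", are
illustrative simplifications of my actual code):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class TestBolt extends BaseRichBolt {
        private OutputCollector collector;
        private List<Integer> boltTasks;   // task ids of all bolt instances
        private Random random;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.random = new Random();
            // collect the task ids of every bolt component; the ids below are
            // illustrative and just need to match the topology wiring
            this.boltTasks = new ArrayList<Integer>();
            for (String component : new String[]{"bolt-1", "bolt-2", "bolt-3", "bolt-4"}) {
                boltTasks.addAll(context.getComponentTasks(component));
            }
        }

        @Override
        public void execute(Tuple input) {
            String record = input.getString(0);
            // generate some random numbers to simulate work
            for (int i = 0; i < 100; i++) {
                random.nextDouble();
            }
            if (!record.endsWith("!")) {
                // not processed before: append "!" and forward to a randomly
                // chosen bolt task via the direct stream
                int target = boltTasks.get(random.nextInt(boltTasks.size()));
                collector.emitDirect(target, input, new Values(record + "!"));
            }
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // the stream has to be declared direct in order to use emitDirect
            declarer.declare(true, new Fields("record"));
        }
    }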
Dimitris
On 26/07/2015 05:52 PM, Harsha wrote:
> Hi Dimitris,
>
> 1. how many topic partitions do you have?
> 2. make sure you are distributing data from the Kafka producer side into
> all of these partitions
> 3. what's your KafkaSpout parallelism set to?
> 4. what's your topology.max.spout.pending set to?
> 5. if you can, briefly describe what the topology is doing.
> 6. are you seeing anything under the failed column in Storm UI?
> 7. any errors in the Storm topology logs?
>
> Thanks,
> Harsha
>
> On Sat, Jul 25, 2015, at 05:29 AM, Dimitris Sarlis wrote:
>> Hi all,
>>
>> I'm trying to run a topology in Storm and I am facing some scalability
>> issues. Specifically, I have a topology where KafkaSpouts read from a
>> Kafka queue and emit messages to bolts, which are connected with each
>> other through directGrouping (each bolt is connected with itself as
>> well as with each of the other bolts). The bolts subscribe to the
>> spouts with shuffleGrouping. I observe that when I increase the number
>> of spouts and bolts proportionally, I don't get the speedup I'm
>> expecting. In fact, my topology seems to run slower: for the same
>> amount of data, it takes more time to complete. For example, when I
>> increase the spouts from 4 to 8 and the bolts from 4 to 8, it takes
>> longer to process the same amount of Kafka messages.
>>
>> Any ideas why this is happening? Thanks in advance.
>>
>> Best,
>> Dimitris Sarlis
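
For reference, the wiring described in my first mail looks roughly like
this (the component names, Kafka/ZooKeeper settings and the LocalCluster
submit are illustrative; TestBolt is the bolt sketched earlier in this
thread):

    import java.util.HashMap;
    import java.util.Map;

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.topology.BoltDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;

    public class ScalingTestTopology {
        public static void main(String[] args) {
            // KafkaSpout reading the test topic (addresses and ids are illustrative)
            SpoutConfig spoutConf = new SpoutConfig(
                    new ZkHosts("localhost:2181"), "test-topic", "/kafka-spout", "scaling-test");
            spoutConf.scheme = new SchemeAsMultiScheme(new StringScheme());

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("spout", new KafkaSpout(spoutConf), 8);   // spout parallelism 8

            // 4 bolt components with parallelism 2 each; every bolt subscribes
            // to the spout with shuffleGrouping ...
            String[] bolts = {"bolt-1", "bolt-2", "bolt-3", "bolt-4"};
            Map<String, BoltDeclarer> declarers = new HashMap<String, BoltDeclarer>();
            for (String b : bolts) {
                declarers.put(b, builder.setBolt(b, new TestBolt(), 2).shuffleGrouping("spout"));
            }
            // ... and to every bolt (itself included) with directGrouping
            for (String b : bolts) {
                for (String other : bolts) {
                    declarers.get(b).directGrouping(other);
                }
            }

            Config conf = new Config();
            conf.setMaxSpoutPending(1000);   // topology.max.spout.pending

            new LocalCluster().submitTopology("scaling-test", conf, builder.createTopology());
        }
    }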