Re: Storm scalability issue

Dimitris Sarlis Sat, 25 Jul 2015 08:49:12 -0700

Morgan,

I hardly think that this could be the problem. The topology is deployed over a 14 node cluster with 56 total cores and 96GB RAM. So when I jump from 8 to 16 workers, I think I am still far below my hardware limitations.


On 25/07/2015 06:45 PM, Morgan W09 wrote:

It could be possible that you're reaching a hardware limitation. The jump from 8 to 16 total bolts/workers could be more than your hardware can handle efficiently. So it's starting to have to switch out processes and their memory, which can have substantial overhead, causing your program to slow down.

On Sat, Jul 25, 2015 at 10:36 AM, Dimitris Sarlis <[email protected]> wrote:


    Yes, it listens to its own output. For example, if I have two
    bolts (bolt1 and bolt2), I perform the following:

    bolt1.directGrouping("bolt1");
    bolt1.directGrouping("bolt2");
    bolt2.directGrouping("bolt1");
    bolt2.directGrouping("bolt2");

    I know that this could possibly lead to a cycle, but right now the
    bolts I'm trying to run perform the following:
    if the inputRecord doesn't contain a "!" {
        append a "!"
        emit to a random node
    }
    else {
        do nothing with the record
    }
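
A minimal sketch of the per-record logic in the pseudocode above, written as plain Java with no Storm dependency so it can be run on its own. The class name `ExclaimSketch`, the `Routed` holder, and the `numTasks` parameter are hypothetical and not from the original message; in an actual bolt using direct grouping, the chosen target would presumably be a task ID passed to the collector's `emitDirect`.

```java
import java.util.Optional;
import java.util.Random;

public class ExclaimSketch {
    /** Hypothetical holder for an outgoing record and the task it is sent to. */
    static final class Routed {
        final String payload;
        final int targetTask;

        Routed(String payload, int targetTask) {
            this.payload = payload;
            this.targetTask = targetTask;
        }
    }

    /**
     * Mirrors the pseudocode: a record without "!" gets one appended and is
     * routed to a random task index in [0, numTasks); a record that already
     * contains "!" is dropped (empty result).
     */
    static Optional<Routed> process(String inputRecord, int numTasks, Random rng) {
        if (inputRecord.contains("!")) {
            return Optional.empty();
        }
        return Optional.of(new Routed(inputRecord + "!", rng.nextInt(numTasks)));
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int hops = 0;
        // Keep feeding the output back in, as the bolt-to-bolt wiring would.
        Optional<Routed> next = process("hello", 8, rng);
        while (next.isPresent()) {
            hops++;
            next = process(next.get().payload, 8, rng);
        }
        System.out.println("hops = " + hops);
    }
}
```

Under these assumptions each record is forwarded exactly once before it is dropped, so the bolt-to-bolt wiring does not loop indefinitely for this particular logic.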

    Dimitris


    On 25/07/2015 06:03 PM, Enno Shioji wrote:

    > Each bolt is connected with itself as well as with each one of
    > the other bolts

    You mean the bolt listens to its own output?





    On Sat, Jul 25, 2015 at 1:29 PM, Dimitris Sarlis <[email protected]> wrote:

        Hi all,

        I'm trying to run a topology in Storm and I am facing some
        scalability issues. Specifically, I have a topology where
        KafkaSpouts read from a Kafka queue and emit messages to
        bolts which are connected with each other through
        directGrouping. (Each bolt is connected with itself as well
        as with each one of the other bolts). The bolts subscribe to
        the spouts with shuffleGrouping. I observe that when I increase
        the number of spouts and bolts proportionally, I don't get
        the speedup I expect. In fact, my topology seems to run
        slower: for the same amount of data, it takes more time to
        complete. For example, when I increase spouts from 4->8 and
        bolts from 4->8, it takes longer to process the same amount
        of Kafka messages.

        Any ideas why this is happening? Thanks in advance.

        Best,
        Dimitris Sarlis
