Re: simple question about grouping

Arun Mahadevan Mon, 23 Jan 2017 03:58:14 -0800

There is no magic number; it depends on the specific problem you are trying to
solve. Start with a reasonable value for the parallelism and tune it based on
your requirements. You could also start with a higher number of “tasks” than
the parallelism, then rebalance your topology to adjust the parallelism on the
fly and scale up or down.
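
To make the "more tasks than parallelism" idea concrete, here is a minimal
sketch in plain Java. It is only an illustrative model, not Storm's scheduler:
the class name ParallelismSketch, the method assign, and the round-robin
placement are assumptions made up for this example. It assumes the task count
is fixed at submit time, and that the number of executors can change later but
never exceeds the number of tasks.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: this is NOT Storm's scheduler, just the arithmetic
// behind running more tasks than executors (threads).
class ParallelismSketch {

    // Spread task ids 0..numTasks-1 over the executors as evenly as possible.
    // Executors can never outnumber tasks, so the request is capped.
    static List<List<Integer>> assign(int numTasks, int requestedExecutors) {
        if (numTasks < 1 || requestedExecutors < 1) {
            throw new IllegalArgumentException("need at least one task and one executor");
        }
        int executors = Math.min(requestedExecutors, numTasks);
        List<List<Integer>> plan = new ArrayList<>();
        for (int e = 0; e < executors; e++) {
            plan.add(new ArrayList<>());
        }
        // Round-robin placement (an assumption of this sketch).
        for (int task = 0; task < numTasks; task++) {
            plan.get(task % executors).add(task);
        }
        return plan;
    }

    public static void main(String[] args) {
        int numTasks = 8; // fixed for the lifetime of the topology
        System.out.println("4 executors:  " + assign(numTasks, 4));
        System.out.println("8 executors:  " + assign(numTasks, 8));
        System.out.println("16 requested: " + assign(numTasks, 16));
    }
}
```

With 8 tasks, the model can go from 4 executors (2 tasks each) up to 8
executors (1 task each), but asking for 16 gains nothing. That is why
starting with extra tasks leaves room to scale up later.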

See the slides from Taylor’s “Scaling Storm” presentation; you might find them
useful:
http://www.slideshare.net/ptgoetz/scaling-apache-storm-strata-hadoopworld-2014

From: sam mohel <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, January 23, 2017 at 4:58 PM
To: "[email protected]" <[email protected]>
Subject: Re: simple question about grouping

Many thanks, but how and when can I decide whether this number is right for me
or not?

On Mon, Jan 23, 2017 at 1:27 PM, Arun Mahadevan <[email protected]> wrote:

> builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout");
> I found this example but don't understand why it uses the number 4.

This is the “parallelism hint” (the initial number of executors, i.e. threads)
for “MyBolt”. So in your example there will be 4 threads executing “MyBolt”
across the workers in your cluster, and the tuples from “MySpout” will be
randomly distributed across the 4 instances of your bolt.

Also see 
http://storm.apache.org/releases/1.0.1/Understanding-the-parallelism-of-a-Storm-topology.html

From: sam mohel <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, January 23, 2017 at 4:47 PM
To: "[email protected]" <[email protected]>
Subject: Re: simple question about grouping

Excuse me, if I have a single spout and a single bolt, and the bolt does 2
processing steps, can I do it like this?
builder.setSpout("MySpout", new mySpout(), 1);
builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout");
I found this example but don't understand why it uses the number 4.

On Mon, Jan 23, 2017 at 1:13 PM, sam mohel <[email protected]> wrote:

thanks for replying 

On Mon, Jan 23, 2017 at 1:14 PM, Arun Mahadevan <[email protected]> wrote:

Grouping makes sense only when you have more than one task for a bolt. If your
bolt has more than one task, then the grouping decides how the tuples from the
spout are distributed to the individual tasks of the bolt (shuffle = random,
fields = keyed on some field, and so on).
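
The difference between the two groupings can be sketched in plain Java. This
is an illustrative model only and does not use or reproduce Storm's own
classes: the names GroupingSketch, shuffleTask and fieldsTask are made up for
this example, and hashing the key modulo the task count is an assumed stand-in
for "keyed on some field".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative model only: mimics how a grouping picks a target task.
class GroupingSketch {

    // Shuffle-style: pick a task at random, ignoring the tuple's content.
    static int shuffleTask(Random rng, int numTasks) {
        return rng.nextInt(numTasks);
    }

    // Fields-style: derive the task from the key, so equal keys always
    // land on the same task.
    static int fieldsTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4;
        List<String> words = Arrays.asList("storm", "bolt", "storm", "spout", "storm");
        Random rng = new Random();
        for (String w : words) {
            System.out.printf("%-6s shuffle -> task %d, fields -> task %d%n",
                    w, shuffleTask(rng, numTasks), fieldsTask(w, numTasks));
        }
    }
}
```

With numTasks set to 1, both methods can only return task 0, which is the
point above: with a single task the choice of grouping makes no difference.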

See http://storm.apache.org/releases/current/Concepts.html 

Thanks,

Arun

From: sam mohel <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, January 23, 2017 at 3:09 PM
To: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: simple question about grouping

I have a text file containing data; the file is 3.5 MB. My topology consists
of one spout and one bolt. Is it possible to do all the processing in one
bolt, and in that case what is the role of grouping?

Thanks in advance
