You need to rebalance the topology with the desired number of executors for
the respective bolts.
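
For example (just a sketch reusing the bolt names from your topology below;
the initial executor counts here are made up), declare more tasks than
executors when you submit:

builder.setBolt("fetcher", new Fetch(), 1)   // 1 executor to start, up to 4 later
        .setNumTasks(4)
        .shuffleGrouping("spout");

builder.setBolt("extract", new Extract(), 1)
        .setNumTasks(4)
        .shuffleGrouping("fetcher");

builder.setBolt("indexer", new Indexer(), 1)
        .setNumTasks(4)
        .shuffleGrouping("extract");

Then, when another machine joins, grow the executor counts with the rebalance
command you already used, e.g.

storm rebalance kairos-who -n 2 -e fetcher=2 -e extract=2 -e indexer=2

The task count is fixed once the topology is submitted; rebalance can only
change workers and executors, and executors can never exceed the task count.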

Hope it helps,
Satish.

On Wed, Jun 15, 2016 at 1:04 PM, Adrien Carreira <[email protected]>
wrote:

> I think I understood that.
>
> But, in my example:
>
> 1 machine in the cluster, with this basic topology and 1 worker in the conf:
>
> builder.setBolt("fetcher", new 
> Fetch()).setNumTasks(2).shuffleGrouping("spout");
>
> builder.setBolt("extract", new 
> Extract()).setNumTasks(2).shuffleGrouping("fetcher");
>
> builder.setBolt("indexer", new Indexer()) 
> .setNumTasks(2).shuffleGrouping("extract");
>
> Storm will spawn, on 1 worker, 3 threads with 6 tasks. Am I right?
>
> Then, if I rebalance to 2 workers, I will have 6 threads for the tasks.
>
> Am I still right?
>
> My problem is: to scale up, I understood that I need to set numTasks
> to a bigger value, but that will spawn more tasks than I want... I only want
> one task per bolt when I have one machine, two when I have two machines, etc.
>
> Hope I'm clear
>
>
> 2016-06-09 16:27 GMT+02:00 Matthias J. Sax <[email protected]>:
>
>> See here:
>>
>>
>> https://stackoverflow.com/questions/31932573/rebalancing-executors-in-apache-storm/31941796#31941796
>>
>>
>> https://stackoverflow.com/questions/20371073/how-to-tune-the-parallelism-hint-in-storm
>>
>>
>> http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
>>
>>
>> -Matthias
>>
>>
>> On 06/09/2016 03:41 PM, Nathan Leung wrote:
>> > At that point you have to think about what makes sense for your system
>> > right now.  For example, maybe it makes sense to have # tasks = 4 times
>> > what you need right now, and then reload the topology when you outgrow
>> > that.
>> >
>> > Alternatively, you can consider bringing up a larger replacement
>> > topology, and then killing the older one.  In this case you will have to
>> > be more careful with names, and possibly things like resource (worker)
>> > allocation.
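>> >
>> > For instance (only a sketch; the jar, class and "-v2" names are made up),
>> > the replacement approach boils down to:
>> >
>> >     storm jar crawler.jar com.example.CrawlerTopology kairos-who-v2
>> >     storm kill kairos-who -w 30
>> >
>> > i.e. submit the bigger topology under a new name, let it warm up, then
>> > kill the old one with a short drain period (-w).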
>> >
>> > On Thu, Jun 9, 2016 at 9:30 AM, Adrien Carreira <[email protected]> wrote:
>> >
>> >     So let's say one day I would like to have 100 machines.
>> >
>> >     Should I set setNumTasks to 100?
>> >
>> >     2016-06-09 15:20 GMT+02:00 Nathan Leung <[email protected]>:
>> >
>> >         You can create your topology with more tasks than executors,
>> >         then when the rebalance happens you can add executors.  However
>> >         at the moment you cannot add more tasks to a running topology.
>> >
>> >         On Thu, Jun 9, 2016 at 8:58 AM, Adrien Carreira <[email protected]> wrote:
>> >
>> >             I've just created a topology like this:
>> >
>> >             builder.setBolt("fetcher", new Fetch())
>> >             .shuffleGrouping("spout");
>> >
>> >             builder.setBolt("extract", new Extract())
>> >             .shuffleGrouping("fetcher");
>> >
>> >             builder.setBolt("indexer", new Indexer())
>> >             .shuffleGrouping("extract");
>> >
>> >
>> >             That means I have three bolts, with one worker and a
>> >             parallelism_hint of 1.
>> >
>> >             Now, let's say that I have another machine available, or that
>> >             I have too many tuples to process and I need another machine.
>> >
>> >
>> >             I executed this command:
>> >
>> >             storm rebalance kairos-who -n 2 -e indexer=2 -e fetcher=2 -e extract=2
>> >
>> >
>> >             But what I get is two workers with:
>> >
>> >             worker 1 => Spout + extract
>> >
>> >             worker 2 => fetcher + indexer
>> >
>> >
>> >             What I would love:
>> >
>> >             Worker 1 => Spout + fetcher + extract + indexer
>> >
>> >             Worker 2 => Same...
>> >
>> >
>> >             I hope I'm clear...
>> >
>> >             2016-06-09 14:47 GMT+02:00 Andrew Xor <[email protected]>:
>> >
>> >                 Hello,
>> >
>> >                   I am sorry, but I don't see why you cannot achieve
>> >                 those scale-up factors by using rebalance; after all, it
>> >                 spawns the requested number of workers (in the topology) and
>> >                 executors (in the spouts/bolts), bounded only by
>> >                 topology_max_task_parallelism. Have you read the article
>> >                 to understand how parallelism works in Storm?
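>> >
>> >                 For reference, a quick sketch of where that cap is set
>> >                 (assuming the standard Config API; the numbers are
>> >                 arbitrary):
>> >
>> >                 Config conf = new Config();
>> >                 conf.setNumWorkers(2);
>> >                 conf.setMaxTaskParallelism(8); // topology.max.task.parallelism
>> >                 StormSubmitter.submitTopology("kairos-who", conf,
>> >                         builder.createTopology());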
>> >
>> >                 Regards.
>> >
>> >                 On Thu, Jun 9, 2016 at 3:34 PM, Adrien Carreira <[email protected]> wrote:
>> >
>> >                     Yes,
>> >
>> >                     But the rebalance command doesn't do what I would like.
>> >
>> >
>> >                     Let's suppose that I have:
>> >
>> >                     SPOUT A (1) => BOLT 1 (1) => BOLT2 (1) => BOLT3 (3)
>> >
>> >                     (the number is the parallelism hint)
>> >                     It means that if I scale to n workers I would like:
>> >
>> >                     SPOUT A (1*n) => BOLT 1 (1*n) => BOLT2 (1*n) => BOLT3 (3*n)
>> >
>> >
>> >                     But storm rebalance keeps the parallelism_hint :/
>> >
>> >
>> >
>> >                     2016-06-09 14:29 GMT+02:00 Andrew Xor <[email protected]>:
>> >
>> >                         Hello,
>> >
>> >                         Why not use the rebalance command? It's well
>> >                         documented here:
>> >                         http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html
>> >
>> >                         Regards.
>> >
>> >                         On Thu, Jun 9, 2016 at 3:22 PM, Adrien Carreira <[email protected]> wrote:
>> >
>> >                             Hi,
>> >
>> >                             After a month of building a topology on Storm,
>> >                             I have one question about parallelism that I
>> >                             can't answer.
>> >
>> >                             I've developed my topology and tested it on a
>> >                             cluster with two nodes.
>> >
>> >                             My parallelism_hint values are OK, and
>> >                             everything is fine.
>> >
>> >                             My question is: if I need to scale the
>> >                             number of workers in the topology, to have
>> >                             more workers doing the same thing, how can I
>> >                             achieve that without killing/restarting the
>> >                             topology?
>> >
>> >                             Thanks for your reply
>> >
