You need to rebalance the topology with the desired number of executors for the respective bolts.
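Something along these lines (just a sketch, not a drop-in: the class name KairosTopology and MySpout are placeholders, numTasks = 4 is only an illustrative amount of headroom, and the imports assume Storm 1.x package names, i.e. org.apache.storm rather than backtype.storm on older releases). Fetch, Extract and Indexer are the bolt classes from your earlier mail:

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class KairosTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // MySpout stands in for whatever spout you already use.
        builder.setSpout("spout", new MySpout(), 1);

        // One executor per bolt for now, but four tasks each: tasks are fixed
        // at submit time, so this reserves room to grow to four executors per
        // bolt later without resubmitting the topology.
        builder.setBolt("fetcher", new Fetch(), 1).setNumTasks(4).shuffleGrouping("spout");
        builder.setBolt("extract", new Extract(), 1).setNumTasks(4).shuffleGrouping("fetcher");
        builder.setBolt("indexer", new Indexer(), 1).setNumTasks(4).shuffleGrouping("extract");

        Config conf = new Config();
        conf.setNumWorkers(1);
        StormSubmitter.submitTopology("kairos-who", conf, builder.createTopology());
    }
}

When the second machine is available, grow the executors (not the tasks) and the worker count in one rebalance:

storm rebalance kairos-who -n 2 -e fetcher=2 -e extract=2 -e indexer=2

Nathan's suggestion in the thread below of numTasks = 4x what you need today is only a rule of thumb for how much headroom to reserve; once a bolt's executor count reaches its task count, the only way to grow further is to resubmit the topology.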
Hope it helps,
Satish

On Wed, Jun 15, 2016 at 1:04 PM, Adrien Carreira <[email protected]> wrote:

> I think I understood that.
>
> But, in my example:
>
> 1 machine in the cluster, with this basic topology and 1 worker in the conf:
>
> builder.setBolt("fetcher", new Fetch()).setNumTasks(2).shuffleGrouping("spout");
>
> builder.setBolt("extract", new Extract()).setNumTasks(2).shuffleGrouping("fetcher");
>
> builder.setBolt("indexer", new Indexer()).setNumTasks(2).shuffleGrouping("extract");
>
> Storm will spawn, on 1 worker, 3 threads with 6 tasks. Am I right?
>
> Then, if I rebalance to 2 workers, I will have 6 threads for the tasks.
>
> Am I still right?
>
> My problem is: to scale up, I understood that I need to set numTasks to a
> bigger value, but it will spawn more tasks than I want... I only want one
> task when I have one machine, two when I have two machines, and so on.
>
> Hope I'm clear.
>
> 2016-06-09 16:27 GMT+02:00 Matthias J. Sax <[email protected]>:
>
>> See here:
>>
>> https://stackoverflow.com/questions/31932573/rebalancing-executors-in-apache-storm/31941796#31941796
>>
>> https://stackoverflow.com/questions/20371073/how-to-tune-the-parallelism-hint-in-storm
>>
>> http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
>>
>> -Matthias
>>
>> On 06/09/2016 03:41 PM, Nathan Leung wrote:
>>
>>> At that point you have to think about what makes sense for your system
>>> right now. For example, maybe it makes sense to have # tasks = 4 times
>>> what you need right now, and then reload the topology when you outgrow
>>> that.
>>>
>>> Alternatively, you can consider bringing up a larger replacement
>>> topology and then killing the older one. In this case you will have to
>>> be more careful with names, and possibly with things like resource
>>> (worker) allocation.
>>>
>>> On Thu, Jun 9, 2016 at 9:30 AM, Adrien Carreira <[email protected]> wrote:
>>>
>>>> So let's say that one day I would like to have 100 machines,
>>>> I should set 100 in setNumTasks?
>>>>
>>>> 2016-06-09 15:20 GMT+02:00 Nathan Leung <[email protected]>:
>>>>
>>>>> You can create your topology with more tasks than executors; then,
>>>>> when the rebalance happens, you can add executors. However, at the
>>>>> moment you cannot add more tasks to a running topology.
>>>>>
>>>>> On Thu, Jun 9, 2016 at 8:58 AM, Adrien Carreira <[email protected]> wrote:
>>>>>
>>>>>> I've just created a topology like this:
>>>>>>
>>>>>> builder.setBolt("fetcher", new Fetch()).shuffleGrouping("spout");
>>>>>>
>>>>>> builder.setBolt("extract", new Extract()).shuffleGrouping("fetcher");
>>>>>>
>>>>>> builder.setBolt("indexer", new Indexer()).shuffleGrouping("extract");
>>>>>>
>>>>>> That means I have three bolts, with one worker and a parallelism_hint of 1.
>>>>>>
>>>>>> Now, let's say that I have another machine available, or that I have
>>>>>> too many tuples to process and I need another machine.
>>>>>>
>>>>>> I executed this command:
>>>>>>
>>>>>> storm rebalance kairos-who -n 2 -e indexer=2 -e fetcher=2 -e extract=2
>>>>>>
>>>>>> But what I get is two workers with:
>>>>>>
>>>>>> worker 1 => spout + extract
>>>>>> worker 2 => fetcher + indexer
>>>>>>
>>>>>> What I would love:
>>>>>>
>>>>>> worker 1 => spout + fetcher + extract + indexer
>>>>>> worker 2 => same...
>>>>>>
>>>>>> I hope I'm clear...
>>>>>>
>>>>>> 2016-06-09 14:47 GMT+02:00 Andrew Xor <[email protected]>:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am sorry, but I don't know why you cannot emulate those scale-up
>>>>>>> factors by using rebalance; after all, it spawns the requested number
>>>>>>> of workers (in the topology) and executors (in spouts/bolts), bounded
>>>>>>> only by topology_max_task_parallelism. Have you read the article in
>>>>>>> order to understand how parallelism works in Storm?
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> On Thu, Jun 9, 2016 at 3:34 PM, Adrien Carreira <[email protected]> wrote:
>>>>>>>
>>>>>>>> Yes,
>>>>>>>>
>>>>>>>> But the rebalance command doesn't do what I would like.
>>>>>>>>
>>>>>>>> Let's suppose that I have:
>>>>>>>>
>>>>>>>> SPOUT A (1) => BOLT 1 (1) => BOLT 2 (1) => BOLT 3 (3)
>>>>>>>>
>>>>>>>> (the number is the parallelism hint)
>>>>>>>>
>>>>>>>> It means that if I scale to n workers I would like:
>>>>>>>>
>>>>>>>> SPOUT A (1*n) => BOLT 1 (1*n) => BOLT 2 (1*n) => BOLT 3 (3*n)
>>>>>>>>
>>>>>>>> But storm rebalance keeps the parallelism_hint :/
>>>>>>>>
>>>>>>>> 2016-06-09 14:29 GMT+02:00 Andrew Xor <[email protected]>:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Why not use the rebalance command? It's well documented here:
>>>>>>>>> http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>>
>>>>>>>>> On Thu, Jun 9, 2016 at 3:22 PM, Adrien Carreira <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> After a month building a topology on Storm, I have one question
>>>>>>>>>> about parallelism that I can't answer.
>>>>>>>>>>
>>>>>>>>>> I've developed my topology and tested it on a cluster with two nodes.
>>>>>>>>>>
>>>>>>>>>> My parallelism_hint values are OK; everything is fine.
>>>>>>>>>>
>>>>>>>>>> My question is: if I need to scale the number of workers in the
>>>>>>>>>> topology, to have more workers doing the same thing, how can I
>>>>>>>>>> achieve that without killing/restarting the topology?
>>>>>>>>>>
>>>>>>>>>> Thanks for your reply
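
P.S. On the numbers in the Jun 15 mail, assuming the default parallelism hint of 1 per component and the stock even scheduler (worth confirming in the Storm UI):

- three bolts declared with setNumTasks(2) and no hint => 3 bolt executors running 6 bolt tasks, all in 1 worker;
- storm rebalance kairos-who -n 2 with no -e options => still 3 bolt executors, just spread across 2 workers;
- storm rebalance kairos-who -n 2 -e fetcher=2 -e extract=2 -e indexer=2 => 6 bolt executors, one task each, and in practice each worker ends up with one executor of each bolt.

So the "one executor of everything per machine" layout you want comes from raising -e at rebalance time, bounded above by the numTasks chosen at submit time.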
