Actually that makes a lot of sense. Let me look at that.
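To make the idea concrete, here is roughly the shape I'm picturing. This is only a sketch: TaskPlacementScheme, EvenPlacementScheme, TargetResourcePlacementScheme and computePlacement are placeholder names made up for this mail, not the existing ReplicaPlacementScheme/DefaultPlacementScheme API in AutoRebalanceStrategy.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /**
     * Hypothetical placement abstraction for the task rebalancer.
     * Given the task partitions and the live instances, a scheme decides
     * which instance each task partition should run on; the rebalancer
     * only distributes whatever assignment the scheme computes.
     */
    interface TaskPlacementScheme {
      Map<String, String> computePlacement(List<String> taskPartitions, List<String> liveInstances);
    }

    /** Spreads task partitions evenly over live instances; no target resource needed. */
    class EvenPlacementScheme implements TaskPlacementScheme {
      @Override
      public Map<String, String> computePlacement(List<String> taskPartitions, List<String> liveInstances) {
        Map<String, String> placement = new HashMap<String, String>();
        for (int i = 0; i < taskPartitions.size(); i++) {
          // Round-robin: task i goes to instance (i mod number of live instances).
          placement.put(taskPartitions.get(i), liveInstances.get(i % liveInstances.size()));
        }
        return placement;
      }
    }

    /**
     * Places each task partition on the instance that currently hosts the
     * corresponding target-resource partition in the required state
     * (e.g. run the backup for MyDB_0 wherever MyDB_0 is a SLAVE).
     * Assumes task partitions are keyed by the target partition they shadow.
     */
    class TargetResourcePlacementScheme implements TaskPlacementScheme {
      private final Map<String, Map<String, String>> targetStateMap; // partition -> (instance -> state)
      private final String targetState;

      TargetResourcePlacementScheme(Map<String, Map<String, String>> targetStateMap, String targetState) {
        this.targetStateMap = targetStateMap;
        this.targetState = targetState;
      }

      @Override
      public Map<String, String> computePlacement(List<String> taskPartitions, List<String> liveInstances) {
        Map<String, String> placement = new HashMap<String, String>();
        for (String taskPartition : taskPartitions) {
          Map<String, String> instanceStates = targetStateMap.get(taskPartition);
          if (instanceStates == null) {
            continue; // no such partition in the target resource
          }
          for (Map.Entry<String, String> entry : instanceStates.entrySet()) {
            // Pick a live instance whose current state matches the requested state.
            if (targetState.equals(entry.getValue()) && liveInstances.contains(entry.getKey())) {
              placement.put(taskPartition, entry.getKey());
              break;
            }
          }
        }
        return placement;
      }
    }

Fed the MyDB map fields from the example in the quoted mail below with targetState = "SLAVE", the target-resource scheme would place the backup tasks for partitions 0, 1, and 2 on N2, N3, and N1, while the even scheme just spreads them over N1, N2, N3 and never looks at the target resource. The task rebalancer itself would only see the TaskPlacementScheme interface plus a config switch for which implementation to use, as Kanak suggested.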
On Sun, Jan 19, 2014 at 8:49 PM, Kanak Biscuitwala <[email protected]> wrote:

> This sounds a lot like what we did in AutoRebalanceStrategy. There's an
> interface called ReplicaPlacementScheme that the algorithm calls into, and
> a DefaultPlacementScheme that just does evenly balanced assignment.
>
> The simplest thing we could do is have a task rebalancer config and set a
> switch for which placement scheme to use. The current task rebalancer
> already has to specify things like the DAG, so this could just be another
> field to add on.
>
> > Date: Sun, 19 Jan 2014 13:14:33 -0800
> > Subject: Re: TaskRebalancer
> > From: [email protected]
> > To: [email protected]
> > CC: [email protected]; [email protected]
> >
> > Thanks Jason, I was looking at the rebalancer. It looks like the target
> > resource is mandatory. What do you suggest is the right way to make the
> > target resource optional?
> >
> > This is my understanding of what the task rebalancer does today.
> >
> > It assumes that the system is already hosting a resource, something like
> > a database, an index, etc. One can then use the task framework to launch
> > arbitrary tasks on the nodes hosting these resources. For example, say
> > there is a database MyDB with 3 partitions and 2 replicas, using the
> > MasterSlave state model, on 3 nodes N1, N2, and N3. In a happy state the
> > cluster might look like this:
> >
> > {
> >   "id": "MyDB",
> >   "mapFields": {
> >     "MyDB_0": {
> >       "N1": "MASTER",
> >       "N2": "SLAVE"
> >     },
> >     "MyDB_1": {
> >       "N2": "MASTER",
> >       "N3": "SLAVE"
> >     },
> >     "MyDB_2": {
> >       "N1": "SLAVE",
> >       "N3": "MASTER"
> >     }
> >   }
> > }
> >
> > Let's say one wants to take a backup of these databases but run the
> > backup only on the SLAVEs. One can define the backup task and launch 3
> > backup tasks (one for each partition) only on the SLAVEs.
> >
> > What we have currently works perfectly for this scenario. One simply has
> > to define the target resource and state for the backup tasks, and they
> > will be launched in the appropriate places. So in this scenario, the
> > backup tasks for partitions 0, 1, and 2 will be launched on N2, N3, and
> > N1.
> >
> > But what if the tasks don't have any target resource and can run on any
> > node (N1, N2, or N3), and the only requirement is to distribute the
> > tasks evenly?
> >
> > We should decouple the logic of where a task is placed from the logic of
> > distributing the tasks. For example, we can abstract the placement
> > constraint out of the rebalancer logic. Then we can have a placement
> > provider that computes placement randomly, one that computes placement
> > based on another resource, and probably another one that computes
> > placement based on data locality.
> >
> > What is the right way to approach this?
> >
> > thanks,
> > Kishore G
> >
> >
> > On Sun, Jan 19, 2014 at 10:12 AM, Zhen Zhang <[email protected]> wrote:
> >
> > > TestTaskRebalancer and TestTaskRebalancerStopResume are examples.
> > >
> > > Thanks,
> > > Jason
> > >
> > >
> > > On Sun, Jan 19, 2014 at 9:20 AM, kishore g <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to use TaskRebalancer but am not able to understand how
> > > > it works. Is there any example I can try?
> > > >
> > > > thanks,
> > > > Kishore G
