We should have a recipe for the delayed rebalancer.

On Thu, Mar 1, 2018 at 9:39 AM, Lei Xia <[email protected]> wrote:

Hi Utsav,

Sorry to get back to you late. There is one more thing to configure:

idealstate.setMinActiveReplicas(0);

This tells Helix the minimum number of replicas it must keep alive. By default it is set to 1, which means Helix maintains at least one replica regardless of delayed rebalancing. For your case, you want to set it to 0.

Lei
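For concreteness, a minimal sketch of where that call could go in the recipe code, reusing the HelixAdmin handle and the resource name from the demo snippets quoted below (the variable names are just the recipe's; only the setMinActiveReplicas setter mentioned above is assumed):

IdealState idealState = admin.getResourceIdealState(clusterName, lockGroupName);
// With the recipe's single replica per lock, the default minActiveReplicas of 1
// makes Helix reassign the partition immediately instead of honoring the delay.
idealState.setMinActiveReplicas(0);
admin.setResourceIdealState(clusterName, lockGroupName, idealState);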
On Mon, Feb 26, 2018 at 11:38 AM, Utsav Kanani <[email protected]> wrote:

Hi Lei,

That did not work; I am seeing the same behavior. I added the following method to the ZKHelixAdmin class:

public void enableClusterDelayMode(String clusterName) {
  ConfigAccessor configAccessor = new ConfigAccessor(_zkClient);
  ClusterConfig clusterConfig = configAccessor.getClusterConfig(clusterName);
  clusterConfig.setDelayRebalaceEnabled(true);
  clusterConfig.setRebalanceDelayTime(100000);
  configAccessor.setClusterConfig(clusterName, clusterConfig);
}

and call it in the demo class:

HelixAdmin admin = new ZKHelixAdmin(zkAddress);
admin.addCluster(clusterName, true);
----> ((ZKHelixAdmin) admin).enableClusterDelayMode(clusterName);
StateModelConfigGenerator generator = new StateModelConfigGenerator();
admin.addStateModelDef(clusterName, "OnlineOffline",
    new StateModelDefinition(generator.generateConfigForOnlineOffline()));

admin.addResource(clusterName, lockGroupName, numPartitions, "OnlineOffline",
    RebalanceMode.FULL_AUTO.toString());
admin.rebalance(clusterName, lockGroupName, 1);

STARTING localhost_12000
STARTING localhost_12001
STARTING localhost_12002
STARTED localhost_12000
STARTED localhost_12002
STARTED localhost_12001
localhost_12000 acquired lock:lock-group_0
localhost_12002 acquired lock:lock-group_3
localhost_12002 acquired lock:lock-group_9
localhost_12001 acquired lock:lock-group_2
localhost_12001 acquired lock:lock-group_5
localhost_12000 acquired lock:lock-group_11
localhost_12002 acquired lock:lock-group_6
localhost_12000 acquired lock:lock-group_7
localhost_12002 acquired lock:lock-group_10
localhost_12001 acquired lock:lock-group_8
localhost_12001 acquired lock:lock-group_1
localhost_12000 acquired lock:lock-group_4
lockName        acquired By
======================================
lock-group_0    localhost_12000
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12000
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12000
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12000
lock-group_8    localhost_12001
lock-group_9    localhost_12002
Stopping localhost_12000
localhost_12000Interrupted
localhost_12001 acquired lock:lock-group_11
localhost_12001 acquired lock:lock-group_0
localhost_12002 acquired lock:lock-group_7
localhost_12002 acquired lock:lock-group_4
lockName        acquired By
======================================
lock-group_0    localhost_12001
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12001
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12002
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12002
lock-group_8    localhost_12001
lock-group_9    localhost_12002
===Starting localhost_12000
STARTING localhost_12000
localhost_12000 acquired lock:lock-group_11
localhost_12000 acquired lock:lock-group_0
STARTED localhost_12000
localhost_12000 acquired lock:lock-group_7
localhost_12000 acquired lock:lock-group_4
localhost_12001 releasing lock:lock-group_11
localhost_12001 releasing lock:lock-group_0
localhost_12002 releasing lock:lock-group_7
localhost_12002 releasing lock:lock-group_4
lockName        acquired By
======================================
lock-group_0    localhost_12000
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12000
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12000
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12000
lock-group_8    localhost_12001
lock-group_9    localhost_12002

On Sat, Feb 24, 2018 at 8:26 PM, Lei Xia <[email protected]> wrote:

Hi Utsav,

The delayed rebalancer is disabled at the cluster level by default (to stay backward compatible); you need to enable it in the ClusterConfig, e.g.:

ConfigAccessor configAccessor = new ConfigAccessor(zkClient);
ClusterConfig clusterConfig = configAccessor.getClusterConfig(clusterName);
clusterConfig.setDelayRebalaceEnabled(enabled);
configAccessor.setClusterConfig(clusterName, clusterConfig);

Could you please give it a try and let me know whether it works? Thanks.

Lei

On Fri, Feb 23, 2018 at 2:33 PM, Utsav Kanani <[email protected]> wrote:

I am trying to extend the LockManager example at http://helix.apache.org/0.6.2-incubating-docs/recipes/lock_manager.html to introduce a rebalancing delay. I tried doing something like this:

IdealState state = admin.getResourceIdealState(clusterName, lockGroupName);
state.setRebalanceDelay(100000);
state.setDelayRebalanceEnabled(true);
state.setRebalancerClassName(DelayedAutoRebalancer.class.getName());
admin.setResourceIdealState(clusterName, lockGroupName, state);
admin.rebalance(clusterName, lockGroupName, 1);

On killing a node, rebalancing takes place immediately. I was hoping for a delay of 100 seconds before rebalancing, but I am not seeing that behavior.

On stopping localhost_12000, the locks are acquired immediately by localhost_12001 and localhost_12002.

On starting localhost_12000, the rebalance is again immediate:
localhost_12000 acquired lock:lock-group_11
localhost_12000 acquired lock:lock-group_7
localhost_12000 acquired lock:lock-group_0
localhost_12000 acquired lock:lock-group_4
STARTED localhost_12000
localhost_12001 releasing lock:lock-group_0
localhost_12001 releasing lock:lock-group_11
localhost_12002 releasing lock:lock-group_4
localhost_12002 releasing lock:lock-group_7

Here is the output:
=========================================

STARTING localhost_12000
STARTING localhost_12001
STARTING localhost_12002
STARTED localhost_12001
STARTED localhost_12002
STARTED localhost_12000
localhost_12000 acquired lock:lock-group_11
localhost_12002 acquired lock:lock-group_10
localhost_12002 acquired lock:lock-group_9
localhost_12002 acquired lock:lock-group_3
localhost_12001 acquired lock:lock-group_2
localhost_12001 acquired lock:lock-group_1
localhost_12001 acquired lock:lock-group_8
localhost_12002 acquired lock:lock-group_6
localhost_12000 acquired lock:lock-group_4
localhost_12000 acquired lock:lock-group_0
localhost_12000 acquired lock:lock-group_7
localhost_12001 acquired lock:lock-group_5
lockName        acquired By
======================================
lock-group_0    localhost_12000
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12000
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12000
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12000
lock-group_8    localhost_12001
lock-group_9    localhost_12002
Stopping localhost_12000
localhost_12000Interrupted
localhost_12002 acquired lock:lock-group_4
localhost_12001 acquired lock:lock-group_11
localhost_12002 acquired lock:lock-group_7
localhost_12001 acquired lock:lock-group_0
lockName        acquired By
======================================
lock-group_0    localhost_12001
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12001
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12002
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12002
lock-group_8    localhost_12001
lock-group_9    localhost_12002
===Starting localhost_12000
STARTING localhost_12000
localhost_12000 acquired lock:lock-group_11
localhost_12000 acquired lock:lock-group_7
localhost_12000 acquired lock:lock-group_0
localhost_12000 acquired lock:lock-group_4
STARTED localhost_12000
localhost_12001 releasing lock:lock-group_0
localhost_12001 releasing lock:lock-group_11
localhost_12002 releasing lock:lock-group_4
localhost_12002 releasing lock:lock-group_7
lockName        acquired By
======================================
lock-group_0    localhost_12000
lock-group_1    localhost_12001
lock-group_10   localhost_12002
lock-group_11   localhost_12000
lock-group_2    localhost_12001
lock-group_3    localhost_12002
lock-group_4    localhost_12000
lock-group_5    localhost_12001
lock-group_6    localhost_12002
lock-group_7    localhost_12000
lock-group_8    localhost_12001
lock-group_9    localhost_12002

--
Lei Xia
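Pulling the pieces of this thread together, a rough end-to-end sketch of the delayed-rebalancing setup for the lock-manager recipe could look like the following. The zkClient/admin/clusterName/lockGroupName variables and the 100000 ms delay are the example values from the snippets above (nothing here is mandated by Helix), and the imports follow the usual Helix package layout:

import org.apache.helix.ConfigAccessor;
import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
import org.apache.helix.model.ClusterConfig;
import org.apache.helix.model.IdealState;

// 1. Enable delayed rebalancing at the cluster level (it is off by default).
ConfigAccessor configAccessor = new ConfigAccessor(zkClient);
ClusterConfig clusterConfig = configAccessor.getClusterConfig(clusterName);
clusterConfig.setDelayRebalaceEnabled(true);
clusterConfig.setRebalanceDelayTime(100000);
configAccessor.setClusterConfig(clusterName, clusterConfig);

// 2. Point the resource at the delayed rebalancer, set its delay, and allow
//    a partition to have zero live replicas during the delay window.
IdealState idealState = admin.getResourceIdealState(clusterName, lockGroupName);
idealState.setRebalancerClassName(DelayedAutoRebalancer.class.getName());
idealState.setDelayRebalanceEnabled(true);
idealState.setRebalanceDelay(100000);
idealState.setMinActiveReplicas(0);
admin.setResourceIdealState(clusterName, lockGroupName, idealState);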
