That works! The cluster now rebalances automatically when nodes start/stop. This has raised other questions about rebalancing:
Example output below, and I updated the gist: https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8

- When NODE_0 restarts, why is the resource moved back? This seems like unhelpful churn in the cluster.
- Why does the resource stay in the OFFLINE state on NODE_0?

2-node cluster with a single resource with 1 partition/replica, using OnlineOffline:

Starting ZooKeeper at localhost:2199
Setting up cluster THE_CLUSTER
Starting CONTROLLER
Starting NODE_0
Starting NODE_1
Adding resource THE_RESOURCE
Rebalancing resource THE_RESOURCE
Transition: NODE_0 OFFLINE to ONLINE for THE_RESOURCE
Cluster state after setup:
  NODE_0: ONLINE
  NODE_1: null
------------------------------------------------------------
Stopping NODE_0
Transition: NODE_1 OFFLINE to ONLINE for THE_RESOURCE
Cluster state after stopping first node:
  NODE_0: null
  NODE_1: ONLINE
------------------------------------------------------------
Starting NODE_0
Transition: NODE_1 ONLINE to OFFLINE for THE_RESOURCE
Transition: NODE_1 OFFLINE to DROPPED for THE_RESOURCE
Cluster state after restarting first node:
  NODE_0: OFFLINE
  NODE_1: null
------------------------------------------------------------

On Thu, Oct 20, 2016 at 9:18 AM, Lei Xia <[email protected]> wrote:

> Hi, Michael
>
> To answer your questions:
>
>   - Should you have to `rebalance` a resource when adding a new node to the cluster?
>     --- No if you are using full-auto rebalance mode; yes if you are in semi-auto rebalance mode.
>   - Should you have to `rebalance` when a node is dropped?
>     --- Again, same answer: no, you do not need to in full-auto mode. In full-auto mode, Helix is supposed to detect node add/delete/online/offline events and rebalance the resource automatically.
>
> The problem you saw was because your resource was created in SEMI-AUTO mode instead of FULL-AUTO mode. HelixAdmin.addResource() creates a resource in semi-auto mode by default if you do not specify a rebalance mode explicitly. Please see my comments below on how to fix it.
>
> static void addResource() throws Exception {
>   echo("Adding resource " + RESOURCE_NAME);
>   ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME);
>     ==> ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, RebalanceMode.FULL_AUTO);
>   echo("Rebalancing resource " + RESOURCE_NAME);
>   ADMIN.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS); // This just needs to be called once after the resource is created; no need to call it when there is a node change.
> }
>
> Please give it a try and let me know whether it works. Thanks!
>
> Lei
>
> On Wed, Oct 19, 2016 at 11:52 PM, Michael Craig <[email protected]> wrote:
>
>> Here is some repro code for the "drop a node, resource is not redistributed" case I described:
>> https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8
>>
>> Can we answer these 2 questions? That would help clarify things:
>>
>>   - Should you have to `rebalance` a resource when adding a new node to the cluster?
>>     - If no, this is an easy bug to reproduce. The example code
>>       <https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/examples/Quickstart.java#L198>
>>       calls rebalance after adding a node, and it breaks if you comment out that line.
>>     - If yes, what is the correct way to manage many resources on a cluster? Iterate through all resources and rebalance them for every new node?
>>   - Should you have to `rebalance` when a node is dropped?
>>     - If no, there is a bug.
>>       See the repro code posted above.
>>     - If yes, we are in the same rebalance-every-resource situation as above.
>>
>> My use case is to manage a set of ad-hoc tasks across a cluster of machines. Each task would be a separate resource with a unique name, with 1 partition and 1 replica. Each resource would reside on exactly 1 node, and there is no limit on the number of resources per node.
>>
>> On Wed, Oct 19, 2016 at 9:23 PM, Lei Xia <[email protected]> wrote:
>>
>>> Hi, Michael
>>>
>>> Could you be more specific on the issue you see? Specifically:
>>> 1) For 1 resource and 2 replicas, you mean the resource has only 1 partition, with a replica number equal to 2, right?
>>> 2) You see REBALANCE_MODE="FULL_AUTO", not IDEALSTATE_MODE="AUTO", in your idealState, right?
>>> 3) By dropping N1, you mean disconnecting N1 from helix/zookeeper, so N1 is not in liveInstances, right?
>>>
>>> If your answers to all of the above questions are yes, then there may be some bug here. If possible, please paste your idealState and your test code (if there is any) here, and I will try to reproduce and debug it. Thanks
>>>
>>> Lei
>>>
>>> On Wed, Oct 19, 2016 at 9:02 PM, kishore g <[email protected]> wrote:
>>>
>>>> Can you describe your scenario in detail and the expected behavior? I agree that calling rebalance on every live instance change is ugly and definitely not as per the design. It was an oversight (we focused a lot on large numbers of partitions and failed to handle this simple case).
>>>>
>>>> Please file a jira and we will work on that. Lei, do you think the recent bug we fixed with AutoRebalancer will handle this case?
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>> On Wed, Oct 19, 2016 at 8:55 PM, Michael Craig <[email protected]> wrote:
>>>>
>>>>> Thanks for the quick response Kishore. This issue is definitely tied to the condition that partitions * replicas < NODE_COUNT.
>>>>> If all running nodes have a "piece" of the resource, then they behave well when the LEADER node goes away.
>>>>>
>>>>> Is it possible to use Helix to manage a set of resources where that condition is true? I.e. where the total number of partitions/replicas in the cluster is greater than the node count, but each individual resource has a small number of partitions/replicas.
>>>>>
>>>>> (Calling rebalance on every liveInstance change does not seem like a good solution, because you would have to iterate through all resources in the cluster and rebalance each individually.)
>>>>>
>>>>> On Wed, Oct 19, 2016 at 12:52 PM, kishore g <[email protected]> wrote:
>>>>>
>>>>>> I think this might be a corner case when partitions * replicas < TOTAL_NUMBER_OF_NODES. Can you try with many partitions and replicas and check whether the issue still exists?
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 11:53 AM, Michael Craig <[email protected]> wrote:
>>>>>>
>>>>>>> I've noticed that partitions/replicas assigned to disconnected instances are not automatically redistributed to live instances. What's the correct way to do this?
>>>>>>>
>>>>>>> For example, given this setup with Helix 0.6.5:
>>>>>>> - 1 resource
>>>>>>> - 2 replicas
>>>>>>> - LeaderStandby state model
>>>>>>> - FULL_AUTO rebalance mode
>>>>>>> - 3 nodes (N1 is Leader, N2 is Standby, N3 is just sitting)
>>>>>>>
>>>>>>> Then drop N1:
>>>>>>> - N2 becomes LEADER
>>>>>>> - Nothing happens to N3
>>>>>>>
>>>>>>> Naively, I would have expected N3 to transition from Offline to Standby, but that doesn't happen.
>>>>>>>
>>>>>>> I can force redistribution from GenericHelixController#onLiveInstanceChange by
>>>>>>> - dropping non-live instances from the cluster
>>>>>>> - calling rebalance
>>>>>>>
>>>>>>> The instance dropping seems pretty unsafe! Is there a better way?
>>>
>>> --
>>> Lei Xia
>
> --
> Lei Xia
> Senior Software Engineer
> Data Infra/Nuage & Helix
> LinkedIn
>
> [email protected]
> www.linkedin.com/in/lxia1
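[Editor's note] For anyone landing on this thread later, below is a minimal, self-contained sketch of the setup Lei describes, not the exact gist code: the class and constant names (FullAutoResourceSetup, ZK_ADDRESS, etc.) are placeholders mirroring the output above. It creates the resource in FULL_AUTO mode via the String-based addResource() overload, seeds the ideal state with a one-time rebalance, and reads the ideal state back so you can confirm REBALANCE_MODE is FULL_AUTO.

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState;
import org.apache.helix.model.IdealState.RebalanceMode;

public class FullAutoResourceSetup {
  // Placeholder constants mirroring the run above; adjust to your cluster.
  static final String ZK_ADDRESS = "localhost:2199";
  static final String CLUSTER_NAME = "THE_CLUSTER";
  static final String RESOURCE_NAME = "THE_RESOURCE";
  static final String STATE_MODEL_NAME = "OnlineOffline";
  static final int NUM_PARTITIONS = 1;
  static final int NUM_REPLICAS = 1;

  public static void main(String[] args) {
    HelixAdmin admin = new ZKHelixAdmin(ZK_ADDRESS);

    // Create the resource in FULL_AUTO mode. The five-argument overload
    // takes the rebalance mode as a string, hence toString().
    admin.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS,
        STATE_MODEL_NAME, RebalanceMode.FULL_AUTO.toString());

    // One-time rebalance to seed the ideal state; per the thread, this
    // does not need to be repeated when nodes join or leave.
    admin.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);

    // Sanity check: the ideal state should now report FULL_AUTO
    // (REBALANCE_MODE="FULL_AUTO" when inspected in ZooKeeper).
    IdealState idealState = admin.getResourceIdealState(CLUSTER_NAME, RESOURCE_NAME);
    System.out.println("Rebalance mode: " + idealState.getRebalanceMode());
  }
}

If the printed mode is SEMI_AUTO, the resource was created under the default mode and will likely need to be re-created (or have its ideal state updated) before full-auto placement takes effect.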
