Moving this 2nd rebalancing question to another thread to clarify. Thanks Kishore and Lei for your help!
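For reference, a minimal sketch of the fix Lei suggests in the quoted thread below: create the resource in FULL_AUTO rebalance mode so Helix redistributes it on node changes without extra rebalance calls. This assumes the ADMIN handle, the echo() helper, and the constants from the gist; in the 0.6.x HelixAdmin API the rebalance mode is passed as a string, hence the toString().

    import org.apache.helix.model.IdealState.RebalanceMode;

    static void addResource() throws Exception {
        echo("Adding resource " + RESOURCE_NAME);
        // The 4-argument overload defaults to semi-auto, so pass the mode explicitly.
        ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS,
            STATE_MODEL_NAME, RebalanceMode.FULL_AUTO.toString());
        echo("Rebalancing resource " + RESOURCE_NAME);
        // Called once, right after the resource is created; no further calls on node changes.
        ADMIN.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);
    }
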
On Thu, Oct 20, 2016 at 10:28 AM, Michael Craig <[email protected]> wrote:
> That works! The cluster is automatically rebalancing when nodes start/stop. This has raised other questions about rebalancing:
>
> Example output below, and I updated the gist: https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8
>
> - When NODE_0 restarts, why is the resource moved back? This seems like unhelpful churn in the cluster.
> - Why does the resource stay in the OFFLINE state on NODE_0?
>
> A 2-node cluster with a single resource with 1 partition/replica, using OnlineOffline:
>
> Starting ZooKeeper at localhost:2199
> Setting up cluster THE_CLUSTER
> Starting CONTROLLER
> Starting NODE_0
> Starting NODE_1
> Adding resource THE_RESOURCE
> Rebalancing resource THE_RESOURCE
> Transition: NODE_0 OFFLINE to ONLINE for THE_RESOURCE
> Cluster state after setup:
>   NODE_0: ONLINE
>   NODE_1: null
> ------------------------------------------------------------
> Stopping NODE_0
> Transition: NODE_1 OFFLINE to ONLINE for THE_RESOURCE
> Cluster state after stopping first node:
>   NODE_0: null
>   NODE_1: ONLINE
> ------------------------------------------------------------
> Starting NODE_0
> Transition: NODE_1 ONLINE to OFFLINE for THE_RESOURCE
> Transition: NODE_1 OFFLINE to DROPPED for THE_RESOURCE
> Cluster state after restarting first node:
>   NODE_0: OFFLINE
>   NODE_1: null
> ------------------------------------------------------------
>
> On Thu, Oct 20, 2016 at 9:18 AM, Lei Xia <[email protected]> wrote:
>
>> Hi, Michael
>>
>> To answer your questions:
>>
>> - Should you have to `rebalance` a resource when adding a new node to the cluster? --- No if you are using full-auto rebalance mode; yes if you are in semi-auto rebalance mode.
>> - Should you have to `rebalance` when a node is dropped? --- Again, same answer: no, you do not need to in full-auto mode. In full-auto mode, Helix is supposed to detect node add/delete/online/offline events and rebalance the resource automatically.
>>
>> The problem you saw was because your resource was created in SEMI_AUTO mode instead of FULL_AUTO mode. HelixAdmin.addResource() creates a resource in semi-auto mode by default if you do not specify a rebalance mode explicitly. Please see my comments below on how to fix it.
>>
>> static void addResource() throws Exception {
>>   echo("Adding resource " + RESOURCE_NAME);
>>   ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME);
>>     ==> ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, RebalanceMode.FULL_AUTO);
>>   echo("Rebalancing resource " + RESOURCE_NAME);
>>   ADMIN.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);
>>     // This just needs to be called once after the resource is created; there is no need to call it when nodes change.
>> }
>>
>> Please give it a try and let me know whether it works. Thanks!
>>
>> Lei
>>
>> On Wed, Oct 19, 2016 at 11:52 PM, Michael Craig <[email protected]> wrote:
>>
>>> Here is some repro code for the "drop a node, resource is not redistributed" case I described: https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8
>>>
>>> Can we answer these 2 questions? That would help clarify things:
>>>
>>> - Should you have to `rebalance` a resource when adding a new node to the cluster?
>>>   - If no, this is an easy bug to reproduce. The example code <https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/examples/Quickstart.java#L198> calls rebalance after adding a node, and it breaks if you comment out that line.
>>>   - If yes, what is the correct way to manage many resources on a cluster? Iterate through all resources and rebalance them for every new node?
>>> - Should you have to `rebalance` when a node is dropped?
>>>   - If no, there is a bug. See the repro code posted above.
>>>   - If yes, we are in the same rebalance-every-resource situation as above.
>>>
>>> My use case is to manage a set of ad-hoc tasks across a cluster of machines. Each task would be a separate resource with a unique name, with 1 partition and 1 replica. Each resource would reside on exactly 1 node, and there is no limit on the number of resources per node.
>>>
>>> On Wed, Oct 19, 2016 at 9:23 PM, Lei Xia <[email protected]> wrote:
>>>
>>>> Hi, Michael
>>>>
>>>> Could you be more specific about the issue you see? Specifically:
>>>> 1) For 1 resource and 2 replicas, you mean the resource has only 1 partition, with the replica count equal to 2, right?
>>>> 2) You see REBALANCE_MODE="FULL_AUTO", not IDEALSTATE_MODE="AUTO", in your idealState, right?
>>>> 3) By dropping N1, you mean disconnecting N1 from Helix/ZooKeeper, so N1 is not in liveInstances, right?
>>>>
>>>> If your answers to all of the above questions are yes, then there may be some bug here. If possible, please paste your idealState and your test code (if there is any) here, and I will try to reproduce and debug it. Thanks
>>>>
>>>> Lei
>>>>
>>>> On Wed, Oct 19, 2016 at 9:02 PM, kishore g <[email protected]> wrote:
>>>>
>>>>> Can you describe your scenario in detail and the expected behavior? I agree that calling rebalance on every live instance change is ugly and definitely not as per the design. It was an oversight (we focused a lot on large numbers of partitions and failed to handle this simple case).
>>>>>
>>>>> Please file a jira and we will work on it. Lei, do you think the recent bug we fixed with AutoRebalancer will handle this case?
>>>>>
>>>>> thanks,
>>>>> Kishore G
>>>>>
>>>>> On Wed, Oct 19, 2016 at 8:55 PM, Michael Craig <[email protected]> wrote:
>>>>>
>>>>>> Thanks for the quick response, Kishore. This issue is definitely tied to the condition that partitions * replicas < NODE_COUNT. If all running nodes have a "piece" of the resource, then they behave well when the LEADER node goes away.
>>>>>>
>>>>>> Is it possible to use Helix to manage a set of resources where that condition is true? I.e., where the total number of partitions/replicas in the cluster is greater than the node count, but each individual resource has a small number of partitions/replicas.
>>>>>>
>>>>>> (Calling rebalance on every liveInstance change does not seem like a good solution, because you would have to iterate through all resources in the cluster and rebalance each individually.)
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 12:52 PM, kishore g <[email protected]> wrote:
>>>>>>
>>>>>>> I think this might be a corner case when partitions * replicas < TOTAL_NUMBER_OF_NODES. Can you try with many partitions and replicas and check whether the issue still exists.
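To make the per-task use case quoted above concrete, here is a rough sketch (not from the thread) of registering one ad-hoc task as its own single-partition, single-replica resource in full-auto mode. The helper name addTaskResource and the parameters admin, clusterName, and taskName are hypothetical; OnlineOffline matches the state model used in the gist.

    import org.apache.helix.HelixAdmin;
    import org.apache.helix.model.IdealState.RebalanceMode;

    // Hypothetical helper: register one ad-hoc task as its own resource.
    static void addTaskResource(HelixAdmin admin, String clusterName, String taskName) {
        // One partition in full-auto mode: Helix picks a live node and moves
        // the partition if that node goes away.
        admin.addResource(clusterName, taskName, 1, "OnlineOffline",
            RebalanceMode.FULL_AUTO.toString());
        // One replica; rebalance is only needed once, at creation time.
        admin.rebalance(clusterName, taskName, 1);
    }
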
>>>>>>> On Wed, Oct 19, 2016 at 11:53 AM, Michael Craig <[email protected]> wrote:
>>>>>>>
>>>>>>>> I've noticed that partitions/replicas assigned to disconnected instances are not automatically redistributed to live instances. What's the correct way to do this?
>>>>>>>>
>>>>>>>> For example, given this setup with Helix 0.6.5:
>>>>>>>> - 1 resource
>>>>>>>> - 2 replicas
>>>>>>>> - LeaderStandby state model
>>>>>>>> - FULL_AUTO rebalance mode
>>>>>>>> - 3 nodes (N1 is Leader, N2 is Standby, N3 is just sitting)
>>>>>>>>
>>>>>>>> Then drop N1:
>>>>>>>> - N2 becomes LEADER
>>>>>>>> - Nothing happens to N3
>>>>>>>>
>>>>>>>> Naively, I would have expected N3 to transition from Offline to Standby, but that doesn't happen.
>>>>>>>>
>>>>>>>> I can force redistribution from GenericHelixController#onLiveInstanceChange by
>>>>>>>> - dropping non-live instances from the cluster
>>>>>>>> - calling rebalance
>>>>>>>>
>>>>>>>> The instance dropping seems pretty unsafe! Is there a better way?
>>>>
>>>> --
>>>> Lei Xia
>>
>> --
>> Lei Xia
>> Senior Software Engineer
>> Data Infra/Nuage & Helix
>> LinkedIn
>> [email protected]
>> www.linkedin.com/in/lxia1
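For completeness, a rough sketch of the workaround Michael describes in the first message quoted above: on each live-instance change, drop instances that are no longer live and re-run rebalance for every resource. The class name and wiring are hypothetical (the listener would be registered via HelixManager#addLiveInstanceChangeListener), and, as noted in the thread, dropping instances this way is risky and should be unnecessary once the resource is created in FULL_AUTO mode.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import org.apache.helix.HelixAdmin;
    import org.apache.helix.LiveInstanceChangeListener;
    import org.apache.helix.NotificationContext;
    import org.apache.helix.model.LiveInstance;

    // Hypothetical listener illustrating the drop-and-rebalance workaround.
    class DropAndRebalanceListener implements LiveInstanceChangeListener {
        private final HelixAdmin admin;
        private final String clusterName;
        private final int numReplicas;

        DropAndRebalanceListener(HelixAdmin admin, String clusterName, int numReplicas) {
            this.admin = admin;
            this.clusterName = clusterName;
            this.numReplicas = numReplicas;
        }

        @Override
        public void onLiveInstanceChange(List<LiveInstance> liveInstances,
                                         NotificationContext context) {
            Set<String> live = new HashSet<String>();
            for (LiveInstance instance : liveInstances) {
                live.add(instance.getInstanceName());
            }
            // The "unsafe" step: drop configured instances that are no longer live.
            for (String instanceName : admin.getInstancesInCluster(clusterName)) {
                if (!live.contains(instanceName)) {
                    admin.dropInstance(clusterName,
                        admin.getInstanceConfig(clusterName, instanceName));
                }
            }
            // Then re-run rebalance for every resource in the cluster.
            // Replica counts are assumed uniform here; real code would look them up per resource.
            for (String resource : admin.getResourcesInCluster(clusterName)) {
                admin.rebalance(clusterName, resource, numReplicas);
            }
        }
    }
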
