That works! The cluster now rebalances automatically when nodes start/stop. This has raised other questions about rebalancing:
Example output below, and I updated the gist: https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8

- When NODE_0 restarts, why is the resource moved back? This seems like unhelpful churn in the cluster.
- Why does the resource stay in the OFFLINE state on NODE_0?

2-node cluster with a single resource with 1 partition/replica, using OnlineOffline:

Starting ZooKeeper at localhost:2199
Setting up cluster THE_CLUSTER
Starting CONTROLLER
Starting NODE_0
Starting NODE_1
Adding resource THE_RESOURCE
Rebalancing resource THE_RESOURCE
Transition: NODE_0 OFFLINE to ONLINE for THE_RESOURCE
Cluster state after setup:
  NODE_0: ONLINE
  NODE_1: null
------------------------------------------------------------
Stopping NODE_0
Transition: NODE_1 OFFLINE to ONLINE for THE_RESOURCE
Cluster state after stopping first node:
  NODE_0: null
  NODE_1: ONLINE
------------------------------------------------------------
Starting NODE_0
Transition: NODE_1 ONLINE to OFFLINE for THE_RESOURCE
Transition: NODE_1 OFFLINE to DROPPED for THE_RESOURCE
Cluster state after restarting first node:
  NODE_0: OFFLINE
  NODE_1: null
------------------------------------------------------------

On Thu, Oct 20, 2016 at 9:18 AM, Lei Xia <[email protected]> wrote:

> Hi, Michael
>
> To answer your questions:
>
>   - Should you have to `rebalance` a resource when adding a new node to the cluster?
>     --- No if you are using full-auto rebalance mode; yes if you are in semi-auto rebalance mode.
>   - Should you have to `rebalance` when a node is dropped?
>     --- Again, same answer: no, you do not need to in full-auto mode. In full-auto mode, Helix is supposed to detect node add/delete/online/offline events and rebalance the resource automatically.
>
> The problem you saw was because your resource was created in SEMI-AUTO mode instead of FULL-AUTO mode. HelixAdmin.addResource() creates a resource in semi-auto mode by default if you do not specify a rebalance mode explicitly. Please see my comments below on how to fix it.
>
> static void addResource() throws Exception {
>   echo("Adding resource " + RESOURCE_NAME);
>   ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME);
>     ==> ADMIN.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, RebalanceMode.FULL_AUTO);
>   echo("Rebalancing resource " + RESOURCE_NAME);
>   ADMIN.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS); // This just needs to be called once after the resource is created; no need to call it when there is a node change.
> }
>
> Please give it a try and let me know whether it works. Thanks!
>
> Lei
>
> On Wed, Oct 19, 2016 at 11:52 PM, Michael Craig <[email protected]> wrote:
>
>> Here is some repro code for the "drop a node, resource is not redistributed" case I described:
>> https://gist.github.com/mkscrg/bcb2ab1dd1b3e84ac93e7ca16e2824f8
>>
>> Can we answer these 2 questions? That would help clarify things:
>>
>>   - Should you have to `rebalance` a resource when adding a new node to the cluster?
>>     - If no, this is an easy bug to reproduce. The example code
>>       <https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/examples/Quickstart.java#L198>
>>       calls rebalance after adding a node, and it breaks if you comment out that line.
>>     - If yes, what is the correct way to manage many resources on a cluster? Iterate through all resources and rebalance them for every new node?
>>   - Should you have to `rebalance` when a node is dropped?
>>     - If no, there is a bug.
>>       See the repro code posted above.
>>     - If yes, we are in the same rebalance-every-resource situation as above.
>>
>> My use case is to manage a set of ad-hoc tasks across a cluster of machines. Each task would be a separate resource with a unique name, with 1 partition and 1 replica. Each resource would reside on exactly 1 node, and there is no limit on the number of resources per node.
>>
>> On Wed, Oct 19, 2016 at 9:23 PM, Lei Xia <[email protected]> wrote:
>>
>>> Hi, Michael
>>>
>>> Could you be more specific on the issue you see? Specifically:
>>> 1) For 1 resource and 2 replicas, you mean the resource has only 1 partition, with a replica number equal to 2, right?
>>> 2) You see REBALANCE_MODE="FULL_AUTO", not IDEALSTATE_MODE="AUTO", in your idealState, right?
>>> 3) By dropping N1, you mean disconnecting N1 from helix/zookeeper, so N1 is not in liveInstances, right?
>>>
>>> If your answers to all of the above questions are yes, then there may be some bug here. If possible, please paste your idealState and your test code (if there is any) here, and I will try to reproduce and debug it. Thanks
>>>
>>> Lei
>>>
>>> On Wed, Oct 19, 2016 at 9:02 PM, kishore g <[email protected]> wrote:
>>>
>>>> Can you describe your scenario in detail and the expected behavior? I agree that calling rebalance on every live instance change is ugly and definitely not as per the design. It was an oversight (we focused a lot on large numbers of partitions and failed to handle this simple case).
>>>>
>>>> Please file a jira and we will work on that. Lei, do you think the recent bug we fixed with AutoRebalancer will handle this case?
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>> On Wed, Oct 19, 2016 at 8:55 PM, Michael Craig <[email protected]> wrote:
>>>>
>>>>> Thanks for the quick response Kishore. This issue is definitely tied to the condition that partitions * replicas < NODE_COUNT.
>>>>> If all running nodes have a "piece" of the resource, then they behave well when the LEADER node goes away.
>>>>>
>>>>> Is it possible to use Helix to manage a set of resources where that condition is true? I.e. where the total number of partitions/replicas in the cluster is greater than the node count, but each individual resource has a small number of partitions/replicas.
>>>>>
>>>>> (Calling rebalance on every liveInstance change does not seem like a good solution, because you would have to iterate through all resources in the cluster and rebalance each individually.)
>>>>>
>>>>> On Wed, Oct 19, 2016 at 12:52 PM, kishore g <[email protected]> wrote:
>>>>>
>>>>>> I think this might be a corner case when partitions * replicas < TOTAL_NUMBER_OF_NODES. Can you try with many partitions and replicas and check whether the issue still exists?
>>>>>>
>>>>>> On Wed, Oct 19, 2016 at 11:53 AM, Michael Craig <[email protected]> wrote:
>>>>>>
>>>>>>> I've noticed that partitions/replicas assigned to disconnected instances are not automatically redistributed to live instances. What's the correct way to do this?
>>>>>>>
>>>>>>> For example, given this setup with Helix 0.6.5:
>>>>>>> - 1 resource
>>>>>>> - 2 replicas
>>>>>>> - LeaderStandby state model
>>>>>>> - FULL_AUTO rebalance mode
>>>>>>> - 3 nodes (N1 is Leader, N2 is Standby, N3 is just sitting)
>>>>>>>
>>>>>>> Then drop N1:
>>>>>>> - N2 becomes LEADER
>>>>>>> - Nothing happens to N3
>>>>>>>
>>>>>>> Naively, I would have expected N3 to transition from Offline to Standby, but that doesn't happen.
>>>>>>>
>>>>>>> I can force redistribution from GenericHelixController#onLiveInstanceChange by
>>>>>>> - dropping non-live instances from the cluster
>>>>>>> - calling rebalance
>>>>>>>
>>>>>>> The instance dropping seems pretty unsafe! Is there a better way?
>>>
>>> --
>>> Lei Xia
>
> --
> Lei Xia
> Senior Software Engineer
> Data Infra/Nuage & Helix
> LinkedIn
>
> [email protected]
> www.linkedin.com/in/lxia1
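[Editor's note] For anyone landing on this thread later, below is a minimal, self-contained sketch of the setup Lei describes, not the exact gist code: the class and constant names (FullAutoResourceSetup, ZK_ADDRESS, etc.) are placeholders mirroring the output above. It creates the resource in FULL_AUTO mode via the String-based addResource() overload, seeds the ideal state with a one-time rebalance, and reads the ideal state back so you can confirm REBALANCE_MODE is FULL_AUTO.

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState;
import org.apache.helix.model.IdealState.RebalanceMode;

public class FullAutoResourceSetup {
  // Placeholder constants mirroring the run above; adjust to your cluster.
  static final String ZK_ADDRESS = "localhost:2199";
  static final String CLUSTER_NAME = "THE_CLUSTER";
  static final String RESOURCE_NAME = "THE_RESOURCE";
  static final String STATE_MODEL_NAME = "OnlineOffline";
  static final int NUM_PARTITIONS = 1;
  static final int NUM_REPLICAS = 1;

  public static void main(String[] args) {
    HelixAdmin admin = new ZKHelixAdmin(ZK_ADDRESS);

    // Create the resource in FULL_AUTO mode. The five-argument overload
    // takes the rebalance mode as a string, hence toString().
    admin.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS,
        STATE_MODEL_NAME, RebalanceMode.FULL_AUTO.toString());

    // One-time rebalance to seed the ideal state; per the thread, this
    // does not need to be repeated when nodes join or leave.
    admin.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);

    // Sanity check: the ideal state should now report FULL_AUTO
    // (REBALANCE_MODE="FULL_AUTO" when inspected in ZooKeeper).
    IdealState idealState = admin.getResourceIdealState(CLUSTER_NAME, RESOURCE_NAME);
    System.out.println("Rebalance mode: " + idealState.getRebalanceMode());
  }
}

If the printed mode is SEMI_AUTO, the resource was created under the default mode and will likely need to be re-created (or have its ideal state updated) before full-auto placement takes effect.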
