Hi Junkai,

Thank you so much for the explanation!

Thanks & Regards,
Grainier Perera.


On Wed, 22 Jun 2022 at 00:54, Junkai Xue <[email protected]> wrote:

> "STANDALONE" means the controller you started is dedicated to managing
> just that one cluster.
>
> Usually, in real production to guarantee controllers' high availability,
> we will create a cluster called "super cluster". Controllers join that
> cluster as CONTROLLER_PARTICIPANT. It will manage controllers to decide
> which controller is the leader of which real application cluster.
>
> We will have a tutorial for that later. It should be in open source doc
> but I cannot find it right now.
>
> Best,
>
> Junkai
>
>
> On Sun, Jun 19, 2022 at 10:14 PM Grainier Perera <[email protected]>
> wrote:
>
>> Hi Junkai,
>>
>> Thank you so much. It worked. I've set the controller mode to
>> `STANDALONE` and now everything seems to be working as expected.
>>
>> One small question: does `STANDALONE` mean it's using an embedded
>> controller? And is having a `STANDALONE` controller per instance a
>> good idea?
>>
>> Thank you,
>> Grainier Perera.
>>
>>
>> On Mon, 20 Jun 2022 at 00:08, Junkai Xue <[email protected]> wrote:
>>
>>> Ah, I found the problem. I would suggest enabling this entry in the
>>> cluster config: "PERSIST_INTERMEDIATE_ASSIGNMENT":"true"
>>>
>>> It will show you how Helix computes the FULL_AUTO assignment in the
>>> IdealState. Once it is enabled, you can see which instance each resource
>>> should be assigned to. Now it is very clear that you added your controller
>>> instance as a participant in your code.
>>>
>>> As I see it, resource4 is assigned to the controller, which does not accept
>>> partition bootstrap:
>>>
>>> {
>>>   "id" : "resource4",
>>>   "simpleFields" : {
>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>     "NUM_PARTITIONS" : "1",
>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>     "REBALANCE_DELAY" : "1000",
>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>     "REPLICAS" : "1",
>>>     "STATE_MODEL_DEF_REF" : "OnlineOffline"
>>>   },
>>>   "mapFields" : {
>>>     "resource4_0" : {
>>>       "CEPControllerName-16e8ca90-df6f-4252-9ce8-3efdcce24f4a" : "ONLINE"
>>>     }
>>>   },
>>>   "listFields" : {
>>>     "resource4_0" : [ "CEPControllerName-16e8ca90-df6f-4252-9ce8-3efdcce24f4a" ]
>>>   }
>>> }
>>>
>>> Give it a try on your side, and do not add the controller as a participant
>>> to that cluster.
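
One way to spot this kind of misassignment is to compare the instances named in an IdealState's `mapFields` with the instances you expect to act as participants. The following is a minimal sketch in plain Java with no Helix dependency; the class `AssignmentCheck` and the method `findMisassigned` are hypothetical names, and the data is copied by hand from the IdealState above and the participant names used in this thread rather than read from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AssignmentCheck {
    // Returns one "partition -> instance" entry for every replica that is
    // placed on an instance outside the expected participant set.
    static List<String> findMisassigned(Map<String, Map<String, String>> mapFields,
                                        Set<String> participants) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> partition : mapFields.entrySet()) {
            for (String instance : partition.getValue().keySet()) {
                if (!participants.contains(instance)) {
                    result.add(partition.getKey() + " -> " + instance);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // mapFields of resource4, copied from the IdealState quoted above.
        Map<String, Map<String, String>> mapFields = Map.of(
            "resource4_0",
            Map.of("CEPControllerName-16e8ca90-df6f-4252-9ce8-3efdcce24f4a", "ONLINE"));

        // The three participant instances from the reproduction sample.
        Set<String> participants = Set.of(
            "c8cep_on_localhost_12000",
            "c8cep_on_localhost_12001",
            "c8cep_on_localhost_12002");

        System.out.println(findMisassigned(mapFields, participants));
    }
}
```

An empty list would mean every replica sits on an expected participant; here the single entry points at the controller instance.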
>>>
>>> best,
>>>
>>> Junkai
>>>
>>> On Sat, Jun 18, 2022 at 9:49 PM Grainier Perera <[email protected]>
>>> wrote:
>>>
>>>> Hi Junkai,
>>>>
>>>> This is reproducible. Please find the sample code [1]. With this sample:
>>>>
>>>>    - Initially, I'm creating a cluster with 3 instances (using the OOTB
>>>>    `OnlineOfflineStateModelFactory` and WAGED rebalancer...)
>>>>    - Step 1: Adds 6 different resources to the cluster with 1
>>>>    partition and 1 replica each.
>>>>    - Step 2: Adds an additional instance to the cluster.
>>>>    - Step 3: Removes an existing instance from the cluster.
>>>>    - Step 4: Removes all resources.
>>>>
>>>> However, after Step 1, you can see resource1 and resource2 are not
>>>> getting assigned to any instance.
>>>>
>>>>            c8cep_on_localhost_12000  c8cep_on_localhost_12001  c8cep_on_localhost_12002
>>>> resource1  -                         -                         -
>>>> resource2  ONLINE                    -                         -
>>>> resource3  -                         ONLINE
>>>> resource4  -                         -                         ONLINE
>>>> resource5  -                         -                         -
>>>> resource6  ONLINE                    -                         -
>>>>
>>>> After the other steps as well, not every resource gets rebalanced
>>>> properly.
>>>>
>>>> [1] https://gist.github.com/grainier/055511179d8b4a4f0c678f17889ed853
>>>>
>>>> Thanks,
>>>> Grainier Perera.
>>>>
>>>>
>>>> On Sun, 19 Jun 2022 at 08:32, Junkai Xue <[email protected]> wrote:
>>>>
>>>>> BTW, have you set up the proper capacity in the InstanceConfig of the
>>>>> only instance?
>>>>>
>>>>> Best,
>>>>>
>>>>> Junkai
>>>>>
>>>>> On Sat, Jun 18, 2022 at 7:10 PM Junkai Xue <[email protected]> wrote:
>>>>>
>>>>>> Interesting. Is this reproducible? We can have a try on your data.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Junkai
>>>>>>
>>>>>> On Sat, Jun 18, 2022 at 4:31 AM Grainier Perera <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Junkai,
>>>>>>>
>>>>>>> I tried removing `MAX_PARTITIONS_PER_INSTANCE`, but it's still the
>>>>>>> same. What's weird is that when I add a few resources, some of them
>>>>>>> still don't get into the `ONLINE` state. In the sample below, you can
>>>>>>> see that only the 2nd and 4th resources have proper `mapFields`, whereas
>>>>>>> the 1st and 3rd resources don't seem to have any mapping (all of them
>>>>>>> have the same IdealState). However, after a restart, this can change so
>>>>>>> that 1 & 3 become `ONLINE` and 2 & 4 may lose their mapping. The pattern
>>>>>>> remains, and I cannot understand why.
>>>>>>>
>>>>>>>
>>>>>>> *ExternalView for _mm:root:_system:cron1:*
>>>>>>> {
>>>>>>>   "id" : "_mm:root:_system:cron1",
>>>>>>>   "simpleFields" : {
>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>     "REPLICAS" : "1",
>>>>>>>     "STATE_MODEL_DEF_REF" : "NewC8CEPStateModel"
>>>>>>>   },
>>>>>>>   *"mapFields" : { },*
>>>>>>>   "listFields" : { }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> *ExternalView for _mm:root:_system:cron2:*
>>>>>>> {
>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>   "simpleFields" : {
>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>     "REPLICAS" : "1",
>>>>>>>     "STATE_MODEL_DEF_REF" : "NewC8CEPStateModel"
>>>>>>>   },
>>>>>>> *  "mapFields" : {*
>>>>>>> *    "_mm:root:_system:cron2_0" : {*
>>>>>>> *      "c8cep-0.c8cep.c8.svc.cluster.local_12000" : "ONLINE"*
>>>>>>> *    }*
>>>>>>> *  },*
>>>>>>>   "listFields" : { }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> *ExternalView for _mm:root:_system:cron3:*
>>>>>>> {
>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>   "simpleFields" : {
>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>     "REPLICAS" : "1",
>>>>>>>     "STATE_MODEL_DEF_REF" : "NewC8CEPStateModel"
>>>>>>>   },
>>>>>>>   *"mapFields" : { },*
>>>>>>>   "listFields" : { }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> *ExternalView for _mm:root:_system:cron4:*
>>>>>>> {
>>>>>>>   "id" : "_mm:root:_system:cron4",
>>>>>>>   "simpleFields" : {
>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>     "REPLICAS" : "1",
>>>>>>>     "STATE_MODEL_DEF_REF" : "NewC8CEPStateModel"
>>>>>>>   },
>>>>>>> *  "mapFields" : {*
>>>>>>> *    "_mm:root:_system:cron4_0" : {*
>>>>>>> *      "c8cep-0.c8cep.c8.svc.cluster.local_12000" : "ONLINE"*
>>>>>>> *    }*
>>>>>>> *  },*
>>>>>>>   "listFields" : { }
>>>>>>> }
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Grainier Perera.
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 18 Jun 2022 at 13:21, Junkai Xue <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Then most likely it is caused by this config entry:
>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>> Usually, we never set this config. It restricts the assignment per
>>>>>>>> instance. You already have one partition, 3_0, assigned, so no other
>>>>>>>> partition can be assigned.
>>>>>>>>
>>>>>>>> So either removing this config entry or adding more instances may
>>>>>>>> help.
>>>>>>>>
>>>>>>>> Please let us know if you have further questions.
>>>>>>>>
>>>>>>>> best,
>>>>>>>>
>>>>>>>> Junkai
>>>>>>>>
>>>>>>>> On Fri, Jun 17, 2022 at 11:38 PM Grainier Perera <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Junkai,
>>>>>>>>>
>>>>>>>>> - Correct. I haven't added any rack-aware information.
>>>>>>>>> - I'm connecting 1 instance at the startup and then expanding
>>>>>>>>> on-demand (I've set ALLOW_PARTICIPANT_AUTO_JOIN to true).
>>>>>>>>> - I've checked the live instances and other znodes in ZooKeeper.
>>>>>>>>> Everything looks OK, except
>>>>>>>>> /C8CEPCluster/EXTERNALVIEW/_mm:root:_system:cron2 has empty
>>>>>>>>> `mapFields` while
>>>>>>>>> /C8CEPCluster/EXTERNALVIEW/_mm:root:_system:cron3 has `mapFields`
>>>>>>>>> with an ONLINE record. I still cannot understand why, or what I'm
>>>>>>>>> doing wrong :(
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 18] get /C8CEPCluster/CONFIGS/CLUSTER/C8CEPCluster*
>>>>>>>>> {
>>>>>>>>>   "id" : "C8CEPCluster",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "allowParticipantAutoJoin" : "true"
>>>>>>>>>   },
>>>>>>>>>   "mapFields" : {
>>>>>>>>>     "DEFAULT_INSTANCE_CAPACITY_MAP" : {
>>>>>>>>>       "MEMORY" : "100",
>>>>>>>>>       "CPU" : "100"
>>>>>>>>>     },
>>>>>>>>>     "DEFAULT_PARTITION_WEIGHT_MAP" : {
>>>>>>>>>       "MEMORY" : "5",
>>>>>>>>>       "CPU" : "5"
>>>>>>>>>     }
>>>>>>>>>   },
>>>>>>>>>   "listFields" : {
>>>>>>>>>     "INSTANCE_CAPACITY_KEYS" : [ "CPU", "MEMORY" ]
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 8] get /C8CEPCluster/LIVEINSTANCES/c8cep-0.c8cep.c8.svc.cluster.local_12000*
>>>>>>>>> {
>>>>>>>>>   "id" : "c8cep-0.c8cep.c8.svc.cluster.local_12000",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "CURRENT_TASK_THREAD_POOL_SIZE" : "40",
>>>>>>>>>     "HELIX_VERSION" : "1.0.4",
>>>>>>>>>     "LIVE_INSTANCE" : "[email protected]",
>>>>>>>>>     "SESSION_ID" : "106a30539a8003e"
>>>>>>>>>   },
>>>>>>>>>   "mapFields" : { },
>>>>>>>>>   "listFields" : { }
>>>>>>>>> }
>>>>>>>>> [zk: localhost:2181(CONNECTED) 26] get /C8CEPCluster/CONFIGS/RESOURCE/_mm:root:_system:cron2
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>>>   "simpleFields" : { },
>>>>>>>>>   "mapFields" : {
>>>>>>>>>     "PARTITION_CAPACITY_MAP" : {
>>>>>>>>>       "DEFAULT" : "{\"CPU\":\"10\",\"MEMORY\":\"10\"}"
>>>>>>>>>     }
>>>>>>>>>   },
>>>>>>>>>   "listFields" : { }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 27] get /C8CEPCluster/CONFIGS/RESOURCE/_mm:root:_system:cron3*
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>>>   "simpleFields" : { },
>>>>>>>>>   "mapFields" : {
>>>>>>>>>     "PARTITION_CAPACITY_MAP" : {
>>>>>>>>>       "DEFAULT" : "{\"CPU\":\"10\",\"MEMORY\":\"10\"}"
>>>>>>>>>     }
>>>>>>>>>   },
>>>>>>>>>   "listFields" : { }
>>>>>>>>> }
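
As a rough sanity check on the capacity figures above (instance capacity 100 for both CPU and MEMORY, per-partition weight 10 for each), the arithmetic below estimates how many such partitions one instance could hold. It is a simplified sketch that assumes a plain per-key capacity check and ignores everything else the WAGED rebalancer takes into account; `CapacityCheck` and `maxPartitions` are hypothetical names, not Helix APIs.

```java
import java.util.Map;

class CapacityCheck {
    // Largest number of equally weighted partitions that fit on one instance,
    // limited by the tightest capacity key. Assumption: a capacity key missing
    // from the instance counts as zero capacity. If no key has a positive
    // weight, nothing limits placement and Integer.MAX_VALUE is returned.
    static int maxPartitions(Map<String, Integer> capacity, Map<String, Integer> weight) {
        int max = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : weight.entrySet()) {
            int w = entry.getValue();
            if (w <= 0) {
                continue; // a zero-weight key never limits placement
            }
            int cap = capacity.getOrDefault(entry.getKey(), 0);
            max = Math.min(max, cap / w);
        }
        return max;
    }

    public static void main(String[] args) {
        // DEFAULT_INSTANCE_CAPACITY_MAP from the cluster config above.
        Map<String, Integer> capacity = Map.of("CPU", 100, "MEMORY", 100);
        // PARTITION_CAPACITY_MAP "DEFAULT" from the resource configs above.
        Map<String, Integer> weight = Map.of("CPU", 10, "MEMORY", 10);

        System.out.println(maxPartitions(capacity, weight)); // prints 10
    }
}
```

Under that simplified assumption a single instance has room for 10 such partitions, so the capacity settings alone would not explain why cron2 stays unassigned while cron3 is ONLINE.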
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 38] get /C8CEPCluster/IDEALSTATES/_mm:root:_system:cron2*
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>   },
>>>>>>>>>   "mapFields" : {
>>>>>>>>>     "_mm:root:_system:cron2_0" : { }
>>>>>>>>>   },
>>>>>>>>>   "listFields" : {
>>>>>>>>>     "_mm:root:_system:cron2_0" : [ ]
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 39] get /C8CEPCluster/IDEALSTATES/_mm:root:_system:cron3*
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>   },
>>>>>>>>>   "mapFields" : {
>>>>>>>>>     "_mm:root:_system:cron3_0" : { }
>>>>>>>>>   },
>>>>>>>>>   "listFields" : {
>>>>>>>>>     "_mm:root:_system:cron3_0" : [ ]
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 42] get /C8CEPCluster/EXTERNALVIEW/_mm:root:_system:cron2*
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>   },
>>>>>>>>>   *"mapFields" : { },*
>>>>>>>>>   "listFields" : { }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> *[zk: localhost:2181(CONNECTED) 43] get /C8CEPCluster/EXTERNALVIEW/_mm:root:_system:cron3*
>>>>>>>>> {
>>>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>>>   "simpleFields" : {
>>>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>   },
>>>>>>>>> *  "mapFields" : {*
>>>>>>>>> *    "_mm:root:_system:cron3_0" : {*
>>>>>>>>> *      "c8cep-0.c8cep.c8.svc.cluster.local_12000" : "ONLINE"*
>>>>>>>>> *    }*
>>>>>>>>> *  },*
>>>>>>>>>   "listFields" : { }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>> Grainier Perera.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, 18 Jun 2022 at 10:45, Junkai Xue <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> OK. So you don't set any rack-aware information. Then how many
>>>>>>>>>> instances do you have connected to that cluster? Please double-check
>>>>>>>>>> the live instances in ZooKeeper as well.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Junkai
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 17, 2022 at 10:01 PM Grainier Perera <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Junkai,
>>>>>>>>>>>
>>>>>>>>>>> I've added the cluster init code to the gist [1]. Apart from that,
>>>>>>>>>>> ClusterConfig is configured like this:
>>>>>>>>>>>
>>>>>>>>>>>             ClusterConfig clusterConfig = configAccessor.getClusterConfig(CLUSTER_NAME);
>>>>>>>>>>>             // Configuring the capacity keys in the Cluster Config. For example, MEMORY.
>>>>>>>>>>>             clusterConfig.setInstanceCapacityKeys(INSTANCE_CAPACITY_KEYS);
>>>>>>>>>>>             // Configuring the default instance capacity in the Cluster Config. For example, MEMORY = 100.
>>>>>>>>>>>             clusterConfig.setDefaultInstanceCapacityMap(INSTANCE_CAPACITY);
>>>>>>>>>>>             // Configuring the default partition weight in the Cluster Config. For example, MEMORY = 5.
>>>>>>>>>>>             clusterConfig.setDefaultPartitionWeightMap(DEFAULT_RESOURCE_USAGE);
>>>>>>>>>>>             configAccessor.setClusterConfig(CLUSTER_NAME, clusterConfig);
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://gist.github.com/grainier/aa1c0b279ea99f88d74c1e94d79f5cdb#file-clustersetup-java
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Grainier Perera.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, 18 Jun 2022 at 10:00, Junkai Xue <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Could you please share your cluster config as well?
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>>
>>>>>>>>>>>> Junkai
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 17, 2022 at 8:24 PM Grainier Perera <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Devs,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm trying to add several resources to the cluster using the
>>>>>>>>>>>>> following configurations [1]. However, only some become `ONLINE`.
>>>>>>>>>>>>> What could be the reason? Is there a way to guarantee every resource
>>>>>>>>>>>>> will become `ONLINE` if the WAGED capacity constraints are met?
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can see that, with the same IdealState, "_mm:root:_system:cron3"
>>>>>>>>>>>>> has mapFields and is ONLINE, while "_mm:root:_system:cron2" does
>>>>>>>>>>>>> not. Furthermore, I see this behavior more often when the replica
>>>>>>>>>>>>> count is set to 1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> ResourceInfo:
>>>>>>>>>>>>> 1. "_mm:root:_system:cron2"
>>>>>>>>>>>>>
>>>>>>>>>>>>> IdealState for _mm:root:_system:cron2:
>>>>>>>>>>>>> {
>>>>>>>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>>>>>>>   "simpleFields" : {
>>>>>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   "mapFields" : {
>>>>>>>>>>>>>     "_mm:root:_system:cron2_0" : { }
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   "listFields" : {
>>>>>>>>>>>>>     "_mm:root:_system:cron2_0" : [ ]
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ExternalView for _mm:root:_system:cron2:
>>>>>>>>>>>>> {
>>>>>>>>>>>>>   "id" : "_mm:root:_system:cron2",
>>>>>>>>>>>>>   "simpleFields" : {
>>>>>>>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   *"mapFields" : { },*
>>>>>>>>>>>>>   "listFields" : { }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. "_mm:root:_system:cron3"
>>>>>>>>>>>>>
>>>>>>>>>>>>> IdealState for _mm:root:_system:cron3:
>>>>>>>>>>>>> {
>>>>>>>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>>>>>>>   "simpleFields" : {
>>>>>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   "mapFields" : {
>>>>>>>>>>>>>     "_mm:root:_system:cron3_0" : { }
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   "listFields" : {
>>>>>>>>>>>>>     "_mm:root:_system:cron3_0" : [ ]
>>>>>>>>>>>>>   }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ExternalView for _mm:root:_system:cron3:
>>>>>>>>>>>>> {
>>>>>>>>>>>>>   "id" : "_mm:root:_system:cron3",
>>>>>>>>>>>>>   "simpleFields" : {
>>>>>>>>>>>>>     "BUCKET_SIZE" : "0",
>>>>>>>>>>>>>     "DELAY_REBALANCE_ENABLED" : "true",
>>>>>>>>>>>>>     "IDEAL_STATE_MODE" : "AUTO_REBALANCE",
>>>>>>>>>>>>>     "MAX_PARTITIONS_PER_INSTANCE" : "1",
>>>>>>>>>>>>>     "NUM_PARTITIONS" : "1",
>>>>>>>>>>>>>     "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>>>>>>>>>>>>>     "REBALANCE_DELAY" : "10000",
>>>>>>>>>>>>>     "REBALANCE_MODE" : "FULL_AUTO",
>>>>>>>>>>>>>     "REPLICAS" : "1",
>>>>>>>>>>>>>     "STATE_MODEL_DEF_REF" : "C8CEPStateModel"
>>>>>>>>>>>>>   },
>>>>>>>>>>>>>   *"mapFields" : {*
>>>>>>>>>>>>> *    "_mm:root:_system:cron3_0" : {*
>>>>>>>>>>>>> *      "c8cep-0.c8cep.c8.svc.cluster.local_12000" : "ONLINE"*
>>>>>>>>>>>>> *    }*
>>>>>>>>>>>>> *  },*
>>>>>>>>>>>>>   "listFields" : { }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]:
>>>>>>>>>>>>> https://gist.github.com/grainier/aa1c0b279ea99f88d74c1e94d79f5cdb
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>> Grainier Perera.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Junkai Xue
>>>>>>>>
>>>>>>>
>>>
>>> --
>>> Junkai Xue
>>>
>>
