Hi Phong,

Thanks for trying out Helix 1.0.3. One question about your testing: does it involve any enable/disable operations? If so, this could be a bug introduced in 1.0.3 in which an instance ends up disabled through the batch enable/disable path. One thing you can verify: check the ClusterConfig and see whether the map field that records disabled instances still contains the instance that came back.
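In case it helps, here is a rough sketch of that check via the Java API (the ZK address, cluster name, and instance name are placeholders, and the exact accessor names may vary slightly between Helix versions):

import java.util.Map;

import org.apache.helix.ConfigAccessor;
import org.apache.helix.model.ClusterConfig;

public class CheckDisabledInstances {
  public static void main(String[] args) {
    // Placeholder ZK address and cluster name; replace with your own.
    ConfigAccessor configAccessor = new ConfigAccessor("localhost:2181");
    ClusterConfig clusterConfig = configAccessor.getClusterConfig("MY_CLUSTER");

    // Instances disabled through the batch enable/disable API are recorded in a
    // map field of the ClusterConfig; a stuck entry there would match this symptom.
    Map<String, String> disabled = clusterConfig.getDisabledInstances();
    if (disabled != null && disabled.containsKey("server01")) {
      System.out.println("server01 is still recorded as disabled: " + disabled.get("server01"));
    } else {
      System.out.println("server01 is not in the disabled-instances map field.");
    }
  }
}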
We are working on the 1.0.4 release to fix that.

Best,
Junkai

On Tue, Jun 7, 2022 at 6:50 PM Phong X. Nguyen <[email protected]> wrote:

> Helix Team,
>
> We're testing an upgrade to Helix 1.0.3 from Helix 1.0.1, primarily for
> the log4j2 fixes. As we test it, we're discovering that WAGED seems to be
> rebalancing in a slightly different way than before:
>
> Our configuration has 32 instances and 32 partitions. The simpleFields
> configuration is as follows:
>
> "simpleFields" : {
>   "HELIX_ENABLED" : "true",
>   "NUM_PARTITIONS" : "32",
>   "MAX_PARTITIONS_PER_INSTANCE" : "4",
>   "DELAY_REBALANCE_ENABLE" : "true",
>   "DELAY_REBALANCE_TIME" : "30000",
>   "REBALANCE_MODE" : "FULL_AUTO",
>   "REBALANCER_CLASS_NAME" : "org.apache.helix.controller.rebalancer.waged.WagedRebalancer",
>   "REPLICAS" : "1",
>   "STATE_MODEL_DEF_REF" : "OnlineOffline",
>   "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
> }
>
> Out of the 32 instances, we have 2 production test servers, e.g.
> 'server01' and 'server02'.
>
> Previously, if we restarted the application on 'server01' in order to
> deploy some test code, Helix would move one of the partitions over to
> another host, and when 'server01' came back online the partition would be
> rebalanced back. Currently we are not seeing this behavior; the partition
> stays with the other host and does not go back. While this is within the
> constraints of the max partitions, we're confused as to why this might
> happen now.
>
> Have there been any changes to WAGED that might account for this? The
> release notes mentioned that both 1.0.2 and 1.0.3 made some changes to
> Helix.
>
> Thanks,
> - Phong X. Nguyen

--
Junkai Xue
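For reference, a minimal sketch of how a resource with the quoted simpleFields might be created through the Helix Java API; the cluster and resource names are placeholders, and setter names may differ slightly between Helix versions:

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState;

public class SetupWagedResource {
  public static void main(String[] args) {
    // Placeholder ZK address, cluster, and resource names.
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
    String cluster = "MY_CLUSTER";
    String resource = "MY_RESOURCE";

    // 32 partitions, OnlineOffline state model, FULL_AUTO rebalance mode.
    admin.addResource(cluster, resource, 32, "OnlineOffline",
        IdealState.RebalanceMode.FULL_AUTO.name());

    IdealState idealState = admin.getResourceIdealState(cluster, resource);
    idealState.setRebalancerClassName(
        "org.apache.helix.controller.rebalancer.waged.WagedRebalancer");
    idealState.setReplicas("1");
    idealState.setMaxPartitionsPerInstance(4);
    // Delayed rebalance fields, written as they appear in the quoted configuration.
    idealState.getRecord().setSimpleField("DELAY_REBALANCE_ENABLE", "true");
    idealState.getRecord().setSimpleField("DELAY_REBALANCE_TIME", "30000");
    admin.setResourceIdealState(cluster, resource, idealState);

    // Compute the initial assignment with 1 replica per partition.
    admin.rebalance(cluster, resource, 1);
  }
}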
