Hi Bo,

The extra replica could be an intermediate state.
For example, when a replica is being moved to another instance, Helix needs to
bring up the new replica before dropping the old one. During this process,
there can be more replicas than configured.
Moreover, an error in a state transition might block the normal load balancing
that cleans up the extra replicas.
Could you please confirm whether this is a transitional or a stable state? Then
please also verify whether there are any error partitions in the resource.
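
If it helps, both checks can be scripted against the resource's external view.
The following is only a minimal Python sketch. It assumes the external view has
already been fetched and converted into a plain dict of partition name to
{instance: state}. The function name, the dict layout, and the choice to ignore
OFFLINE and DROPPED entries when counting replicas are my assumptions for
illustration, not part of the Helix API.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: partition name -> {instance name -> current state}.
ExternalView = Dict[str, Dict[str, str]]

# Assumption: replicas in these states are not counted as hosted replicas.
NOT_HOSTED = ("OFFLINE", "DROPPED")


def find_suspect_partitions(
    view: ExternalView, replicas: int
) -> Tuple[List[str], List[str]]:
    """Return (partitions with more replicas than configured,
    partitions with at least one replica in ERROR state)."""
    over_replicated: List[str] = []
    errored: List[str] = []
    for partition, state_map in sorted(view.items()):
        hosted = [s for s in state_map.values() if s not in NOT_HOSTED]
        if len(hosted) > replicas:
            over_replicated.append(partition)
        if "ERROR" in state_map.values():
            errored.append(partition)
    return over_replicated, errored
```

Running this a few times, some minutes apart, should show whether the same
partitions stay over-replicated (stable) or the list changes (transitional).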

Regarding the load balancing issue, please try our new CRUSH-based strategy,
"CRUSH-ed", which is designed for even distribution. It is available after
version 0.8.0.
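
As a rough sketch of the change, written in the same format as your resource
config listing: this assumes the strategy class is named
CrushEdRebalanceStrategy and sits in the same package as the strategy you use
now. Please verify the class name against the Helix version you deploy before
applying it.

```
REBALANCE_STRATEGY      org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy
```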

Hope this helps. Thanks.

Best Regards,
Jiajun

On Fri, May 18, 2018 at 3:53 PM, Bo Liu <[email protected]> wrote:

> The cluster level configuration is as below.
>
> allowParticipantAutoJoin        true
>
> DELAY_REBALANCE_DISABLE         false
>
> DELAY_REBALANCE_ENABLE          true
>
> DELAY_REBALANCE_TIME            600000
>
> FAULT_ZONE_TYPE                 pg
>
> MAX_OFFLINE_INSTANCES_ALLOWED   10
>
> TOPOLOGY                        /az/pg/instance
>
> TOPOLOGY_AWARE_ENABLED          true
>
>
> On Fri, May 18, 2018 at 3:51 PM, Bo Liu <[email protected]> wrote:
>
>> Hi folks,
>>
>> We are running Helix in FULL_AUTO mode with the following configurations.
>> The resource is configured to have 4 replicas. However, we noticed that a
>> few partitions actually get 5 replicas (shown in helix UI, and we checked
>> them on the hosts).
>> And we have a few hosts which don't host any partitions.
>>
>> We tried to rebalance the resource through Helix restful API and got no
>> luck.
>> Could you please provide some inputs?
>>
>> Thanks,
>>
>>
>>
>> IDEAL_STATE_MODE        AUTO_REBALANCE
>>
>> MIN_ACTIVE_REPLICAS     2
>>
>> NUM_PARTITIONS          1500
>>
>> REBALANCE_MODE          FULL_AUTO
>>
>> REBALANCE_STRATEGY      org.apache.helix.controller.rebalancer.strategy.MultiRoundCrushRebalanceStrategy
>>
>> REBALANCER_CLASS_NAME   org.apache.helix.controller.rebalancer.DelayedAutoRebalancer
>>
>> REPLICAS                4
>>
>> STATE_MODEL_DEF_REF     MasterSlave
>>
>> STATE_MODEL_FACTORY_NAME DEFAULT
>>
>>
>>
>>
>> --
>> Best regards,
>> Bo
>>
>>
>
>
> --
> Best regards,
> Bo
>
>
