[ 
https://issues.apache.org/jira/browse/HELIX-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14210192#comment-14210192
 ] 

Zhen Zhang commented on HELIX-547:
----------------------------------

This is related to HELIX-540 and HELIX-541.

> AutoRebalancer may not converge in some rare situation
> ------------------------------------------------------
>
>                 Key: HELIX-547
>                 URL: https://issues.apache.org/jira/browse/HELIX-547
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Zhen Zhang
>
> We discovered that AutoRebalancer may fail to converge to a stable mapping 
> in some rare situations. Assume we have a DB with 1024 partitions that uses 
> the LeaderStandby state model with a replica count of 1, on 6 nodes that are 
> all alive. The current mapping is:
> {noformat}
> ...
> MyDB_873={localhost_5=LEADER}
> ...
> {noformat}
> Given:
> {noformat}
> allNodes=allLiveNodes={localhost_0, ..., localhost_5}
> stateCountMap: {LEADER=1, STANDBY=0}
> capacity: 2147483647
> {noformat}
> AutoRebalanceStrategy#computePartitionAssignment outputs the new mapping:
> {noformat}
> ...
> MyDB_873={localhost_1=LEADER}
> ...
> {noformat}
> The Helix controller then sends LEADER->STANDBY to localhost_5 and 
> OFFLINE->STANDBY to localhost_1, so the next time the auto rebalancer is 
> triggered, the current mapping has become:
> {noformat}
> ...
> MyDB_873={localhost_5=STANDBY, localhost_1=STANDBY}
> ...
> {noformat}
> In this case, AutoRebalanceStrategy#computePartitionAssignment outputs the 
> new mapping:
> {noformat}
> ...
> MyDB_873={localhost_5=LEADER}
> ...
> {noformat}
> Thus AutoRebalanceStrategy#computePartitionAssignment keeps alternating 
> MyDB_873 between localhost_1 and localhost_5 without ever converging to a 
> stable mapping.
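
The flip-flop described above can be reproduced with a small toy model. This is only a sketch and not Helix code: `toy_strategy` is a hypothetical stand-in that replays just the two input/output pairs quoted in the report, and `toy_controller_step` assumes that each node moves one state per round along OFFLINE <-> STANDBY <-> LEADER toward its target. The report only states the first of the two transitions, so the second one (localhost_5 promoted, localhost_1 dropped) is an assumption of this model.

```python
from typing import Callable, Dict, List, Optional

Mapping = Dict[str, str]  # node name -> state, for a single partition

# Assumed ordering of states; a node absent from a mapping counts as OFFLINE.
STATES = ["OFFLINE", "STANDBY", "LEADER"]


def toy_strategy(current: Mapping) -> Mapping:
    """Hypothetical stand-in for the rebalance strategy (NOT Helix's algorithm).

    It replays only the two input/output pairs quoted in the report and
    leaves every other mapping unchanged.
    """
    if current == {"localhost_5": "LEADER"}:
        return {"localhost_1": "LEADER"}
    if current == {"localhost_5": "STANDBY", "localhost_1": "STANDBY"}:
        return {"localhost_5": "LEADER"}
    return dict(current)


def toy_controller_step(current: Mapping, target: Mapping) -> Mapping:
    """Assumed transition model: each node moves one state toward its target."""
    result: Mapping = {}
    for node in set(current) | set(target):
        cur = STATES.index(current.get(node, "OFFLINE"))
        tgt = STATES.index(target.get(node, "OFFLINE"))
        if cur < tgt:
            cur += 1
        elif cur > tgt:
            cur -= 1
        if STATES[cur] != "OFFLINE":  # offline replicas are dropped
            result[node] = STATES[cur]
    return result


def find_cycle(
    start: Mapping,
    strategy: Callable[[Mapping], Mapping],
    max_rounds: int = 10,
) -> Optional[List[Mapping]]:
    """Run rebalance rounds until a current mapping repeats.

    Returns the repeating run of mappings. A run of length 1 is a fixed
    point (converged); a longer run is an oscillation. Returns None if no
    mapping repeats within max_rounds.
    """
    seen: List[Mapping] = []
    current = dict(start)
    for _ in range(max_rounds):
        if current in seen:
            return seen[seen.index(current):]
        seen.append(current)
        current = toy_controller_step(current, strategy(current))
    return None


if __name__ == "__main__":
    cycle = find_cycle({"localhost_5": "LEADER"}, toy_strategy)
    print("oscillates" if cycle and len(cycle) > 1 else "converged", cycle)
```

Under these assumptions the model never settles for MyDB_873: it keeps returning to the `{localhost_5=LEADER}` mapping it started from. A check of this kind (stop when a previously seen mapping recurs) is one possible way to flag non-convergence in a test harness.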



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
