[ https://issues.apache.org/jira/browse/HELIX-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207834#comment-14207834 ]
Hudson commented on HELIX-543:
------------------------------
UNSTABLE: Integrated in helix #1303 (See [https://builds.apache.org/job/helix/1303/])
HELIX-543 RB-27808 Avoid moving partitions unnecessarily when auto-rebalancing
(g.kishore: rev dc9f129b67f8cacdf0cd22288f166b56fc5654a0)
* helix-core/src/main/java/org/apache/helix/controller/strategy/AutoRebalanceStrategy.java
* helix-agent/helix-agent-0.7.2-SNAPSHOT.ivy
* helix-core/src/test/java/org/apache/helix/integration/SinglePartitionLeaderStandByTest.java
* helix-core/src/test/java/org/apache/helix/controller/strategy/TestAutoRebalanceStrategy.java
> Single partition unnecessarily moved
> ------------------------------------
>
> Key: HELIX-543
> URL: https://issues.apache.org/jira/browse/HELIX-543
> Project: Apache Helix
> Issue Type: Bug
> Components: helix-core
> Affects Versions: 0.7.1, 0.6.4
> Reporter: Tom Widmer
> Assignee: kishore gopalakrishna
> Priority: Minor
>
> (Copied from mailing list)
> I have some resources that I use with the OnlineOffline state model but which only
> have a single partition at the moment (essentially, Helix is just giving me a
> simple leader election to decide who controls the resource - I don’t care
> which participant has it, as long as only one does). However, with full auto
> rebalance, I find that the ‘first’ instance (alphabetically I think) always
> gets the resource when it’s up. So if I take down the first node so the
> partition transfers to the 2nd node, then bring back up the 1st node, the
> resource transfers back unnecessarily.
> Note that this issue also affects multi-partition resources; it’s just a bit
> less noticeable (it means that with 3 nodes and 4 partitions, say, the
> partitions are always allocated 2, 1, 1, so if you have node 1 down, and hence
> 0, 2, 2, and then bring node 1 back up, it unnecessarily moves 2 partitions to
> make 2, 1, 1 rather than making the minimum move to achieve ‘balance’, which
> would be to move 1 partition from instance 2 or 3 back to instance 1).
> I can see the code in question in
> AutoRebalanceStrategy.typedComputePartitionAssignment, where the
> distRemainder is allocated to the first nodes alphabetically, so that the
> nodes' capacities end up unequal.
> The proposed solution is to sort the nodes by the number of partitions they
> already have assigned, so that the more heavily loaded nodes receive the
> higher capacities and the problem goes away (see the sketch below).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)