> On Nov. 21, 2014, 7:10 p.m., Kishore Gopalakrishna wrote:
> > I am not convinced this is the right thing to do yet. Can we hold on to
> > this. See my comment on https://issues.apache.org/jira/browse/HELIX-541
>
> Kishore Gopalakrishna wrote:
>     I agree with the workaround but I am not clear about the behavior
>     described in 541. The root cause might be something else.

Assume the current state is:

Node_0: LEADER
Node_1: STANDBY

Since we are using full auto, assume Node_0 holds more partitions than some other nodes, so the rebalancer tries to "migrate" some partitions from Node_0 to other nodes that haven't reached their capacity yet. In this case, the rebalancer comes up with a new ideal-state:

Node_2: LEADER
Node_1: STANDBY

Now it's the Helix controller's responsibility to move from the current-state to the new ideal-state. Thinking of all possible intermediate mappings as a graph, there are multiple paths from the current state to the ideal state:

Option1) Send LEADER->STANDBY to Node_0
Option2) Send OFFLINE->STANDBY to Node_2

If the controller chooses Option1, it runs into a dead end. The root cause of the problem is that the Helix controller uses a greedy algorithm that only looks one step ahead. Given a graph and constraints on the graph, a greedy algorithm can't guarantee that it finds a feasible path.


- Zhen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62601
-----------------------------------------------------------


On Nov. 21, 2014, 7:05 p.m., Zhen Zhang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28342/
> -----------------------------------------------------------
>
> (Updated Nov. 21, 2014, 7:05 p.m.)
>
>
> Review request for helix and Shi Lu.
>
>
> Repository: helix-git
>
>
> Description
> -------
>
> [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state
> model
> This is a workaround to avoid livelock in Helix controller @see HELIX-541
>
>
> Diffs
> -----
>
>   helix-core/src/main/java/org/apache/helix/tools/StateModelConfigGenerator.java b8b3aeb
>
> Diff: https://reviews.apache.org/r/28342/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Zhen Zhang
>
>
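
To make the greedy-versus-search point in the reply above concrete, here is a minimal toy sketch. It is not Helix code: the class name GreedyVsSearch, the state labels (CURRENT, AFTER_OPTION_1, AFTER_OPTION_2, IDEAL) and the edges are all hypothetical, and the steps that follow Option2 are collapsed into a single assumed edge. Each adjacency list is ordered by transition priority, highest first, so the greedy walk stands in for a controller that looks only one step ahead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only, not Helix code. State names and edges are hypothetical. */
class GreedyVsSearch {

  /** Builds the toy graph; each neighbor list is ordered by priority, highest first. */
  static Map<String, List<String>> toyGraph(boolean option1First) {
    Map<String, List<String>> g = new HashMap<>();
    g.put("CURRENT", option1First
        ? Arrays.asList("AFTER_OPTION_1", "AFTER_OPTION_2")
        : Arrays.asList("AFTER_OPTION_2", "AFTER_OPTION_1"));
    // Assumed dead end: no constraint-satisfying transition leads onward.
    g.put("AFTER_OPTION_1", Collections.<String>emptyList());
    // Assumption: the remaining transitions are collapsed into one edge.
    g.put("AFTER_OPTION_2", Arrays.asList("IDEAL"));
    g.put("IDEAL", Collections.<String>emptyList());
    return g;
  }

  /** Greedy walk: always follows the highest-priority edge, with no lookahead. */
  static List<String> greedyWalk(Map<String, List<String>> g, String start, String goal) {
    List<String> path = new ArrayList<>();
    String cur = start;
    path.add(cur);
    // Bound the number of steps so a cyclic graph cannot loop forever.
    for (int step = 0; step < g.size() && !cur.equals(goal); step++) {
      List<String> next = g.getOrDefault(cur, Collections.<String>emptyList());
      if (next.isEmpty()) {
        return path; // stuck: the walk ends short of the goal
      }
      cur = next.get(0);
      path.add(cur);
    }
    return path;
  }

  /** Breadth-first search: returns a feasible path, or an empty list if none exists. */
  static List<String> searchPath(Map<String, List<String>> g, String start, String goal) {
    Map<String, String> parent = new HashMap<>();
    Deque<String> queue = new ArrayDeque<>();
    parent.put(start, null); // the start state has no predecessor
    queue.add(start);
    while (!queue.isEmpty()) {
      String cur = queue.poll();
      if (cur.equals(goal)) {
        List<String> path = new ArrayList<>();
        for (String s = cur; s != null; s = parent.get(s)) {
          path.add(s);
        }
        Collections.reverse(path);
        return path;
      }
      for (String n : g.getOrDefault(cur, Collections.<String>emptyList())) {
        if (!parent.containsKey(n)) {
          parent.put(n, cur);
          queue.add(n);
        }
      }
    }
    return Collections.emptyList();
  }
}
```

With toyGraph(true), greedyWalk stops at AFTER_OPTION_1 while searchPath still reaches IDEAL through AFTER_OPTION_2. With toyGraph(false), the priorities are reordered and the greedy walk happens to reach IDEAL as well. In this toy, reordering only steers the greedy choice on this particular graph; it does not give the walk any lookahead.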
