[jira] [Commented] (HELIX-541) Possible livelock in Helix controller

2014-11-21 Thread kishore gopalakrishna (JIRA)

[ 
https://issues.apache.org/jira/browse/HELIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221280#comment-14221280
 ] 

kishore gopalakrishna commented on HELIX-541:
-

Great find Jason. One thing I dint understand is if the num replica is 2, why 
is node_1 not going to offline. If node_1 reached OFFLINE then the controller 
could have moved node_0 to STANDBY and then node_2 could reach LEADER

 Possible livelock in Helix controller
 ---

 Key: HELIX-541
 URL: https://issues.apache.org/jira/browse/HELIX-541
 Project: Apache Helix
  Issue Type: Bug
Reporter: Zhen Zhang

 We discover a livelock bug in Helix controller. This leads to HELIX-540. 
 Assuming we have 3 partitions and 2 nodes, using LeaderStandby state model, 
 FULL_AUTO mode, and replica is 2. When reaching stable mapping, we might have 
 the following:
 {noformat}
 partition_0:
   node_0: LEADER
   node_1: STANDBY
 partition_1:
   node_1: LEADER
   node_0: STANDBY
 partition_2:
   node_0: LEADER
   node_1: STANDBY
 {noformat}
 Later we add a new node (node_2) to the cluster and rebalancer decides that 
 node_2 should become LEADER for partition_2. So controller first sends 
 OFFLINE-STANDBY transition to node_2, and the mapping becomes:
 {noformat}
 partition_0:
   node_0: LEADER
   node_1: STANDBY
 partition_1:
   node_1: LEADER
   node_0: STANDBY
 partition_2:
   node_0: LEADER
   node_1: STANDBY
   node_2: STANDBY
 {noformat}
 Note that given LEADER state count is 1 and STANDBY state count is R, where 
 R=2, it is implying the following state constraints:
 {noformat}
 LEADER: upper_bound=1
 STANDBY: upper_bound=2
 {noformat}
 Helix controller now enters the livelock: it can't send STANDBY-LEADER to 
 node_2, since this will violate LEADER upper bound; it can't send 
 LEADER-STANDBY to node_0 either, since will violate STANDBY upper bound.
 We can solve the problem in several ways:
 1) State count definition is ambiguous. For some state, like LEADER, when we 
 say state_count=1, that means we can't violate this constraint at any time. 
 However, for some other state, like STANDBY, when we say state_count=R, that 
 means in stable mapping, there should be R-1 STANDBY replicas, but we don't 
 care the count in any transient state. In this case, we can set STANDBY 
 upper_bound to be larger than 2. Note that this doesn't solve the problem in 
 general. We may have some state model that has a restrict requirement on 
 STANDBY state.
 2) Using state transition priority. If we define LEADER-STANDBY transition 
 should have a higher priority than OFFLINE-STANDBY transition, it will solve 
 the livelock. But this doesn't solve the problem in general either, because 
 when the state model gets complicated, it's hard to define and prove proper 
 transition priorities that avoid livelock in any situation.
 3) The root cause of the problem is that Helix controller uses a greedy 
 algorithm that only looks one step ahead. In the example, if Helix controller 
 can look two steps further, then it will find out that sending 
 OFFLINE-STANDBY transition to node_2 will lead to a dead end, therefore it 
 should choose to send LEADER-STANDBY to node_0 instead. In general we might 
 need to do a DFS/BFS and it's hard if state model is complicated and system 
 is large.
 In practice, most systems use simple state models with less than 5 states and 
 have strict state constraint on a single state (e.g MASTER, LEADER) only.  We 
 can avoid livelock by carefully choosing state constraints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HELIX-541) Possible livelock in Helix controller

2014-11-21 Thread Zhen Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HELIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221312#comment-14221312
 ] 

Zhen Zhang commented on HELIX-541:
--

because the new ideal-state is:
{noformat}
node_2: LEADER
node_1: STANDBY
{noformat}

and the current-state is:
{noformat}
node_0: LEADER
node_1: STANDBY
{noformat}

controller figures out it has to send two transitions:
{noformat}
To node_0: LEADER-STANDBY
To node_2: OFFLINE-STANDBY
{noformat}

node_1 remains in STANDBY so controller will not send any transition to it.



 Possible livelock in Helix controller
 ---

 Key: HELIX-541
 URL: https://issues.apache.org/jira/browse/HELIX-541
 Project: Apache Helix
  Issue Type: Bug
Reporter: Zhen Zhang

 We discover a livelock bug in Helix controller. This leads to HELIX-540. 
 Assuming we have 3 partitions and 2 nodes, using LeaderStandby state model, 
 FULL_AUTO mode, and replica is 2. When reaching stable mapping, we might have 
 the following:
 {noformat}
 partition_0:
   node_0: LEADER
   node_1: STANDBY
 partition_1:
   node_1: LEADER
   node_0: STANDBY
 partition_2:
   node_0: LEADER
   node_1: STANDBY
 {noformat}
 Later we add a new node (node_2) to the cluster and rebalancer decides that 
 node_2 should become LEADER for partition_2. So controller first sends 
 OFFLINE-STANDBY transition to node_2, and the mapping becomes:
 {noformat}
 partition_0:
   node_0: LEADER
   node_1: STANDBY
 partition_1:
   node_1: LEADER
   node_0: STANDBY
 partition_2:
   node_0: LEADER
   node_1: STANDBY
   node_2: STANDBY
 {noformat}
 Note that given LEADER state count is 1 and STANDBY state count is R, where 
 R=2, it is implying the following state constraints:
 {noformat}
 LEADER: upper_bound=1
 STANDBY: upper_bound=2
 {noformat}
 Helix controller now enters the livelock: it can't send STANDBY-LEADER to 
 node_2, since this will violate LEADER upper bound; it can't send 
 LEADER-STANDBY to node_0 either, since will violate STANDBY upper bound.
 We can solve the problem in several ways:
 1) State count definition is ambiguous. For some state, like LEADER, when we 
 say state_count=1, that means we can't violate this constraint at any time. 
 However, for some other state, like STANDBY, when we say state_count=R, that 
 means in stable mapping, there should be R-1 STANDBY replicas, but we don't 
 care the count in any transient state. In this case, we can set STANDBY 
 upper_bound to be larger than 2. Note that this doesn't solve the problem in 
 general. We may have some state model that has a restrict requirement on 
 STANDBY state.
 2) Using state transition priority. If we define LEADER-STANDBY transition 
 should have a higher priority than OFFLINE-STANDBY transition, it will solve 
 the livelock. But this doesn't solve the problem in general either, because 
 when the state model gets complicated, it's hard to define and prove proper 
 transition priorities that avoid livelock in any situation.
 3) The root cause of the problem is that Helix controller uses a greedy 
 algorithm that only looks one step ahead. In the example, if Helix controller 
 can look two steps further, then it will find out that sending 
 OFFLINE-STANDBY transition to node_2 will lead to a dead end, therefore it 
 should choose to send LEADER-STANDBY to node_0 instead. In general we might 
 need to do a DFS/BFS and it's hard if state model is complicated and system 
 is large.
 In practice, most systems use simple state models with less than 5 states and 
 have strict state constraint on a single state (e.g MASTER, LEADER) only.  We 
 can avoid livelock by carefully choosing state constraints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)