[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918333#comment-13918333
 ] 

stack commented on HBASE-10632:
---

Skimmed patch.  lgtm.  +1 for 0.96.  Thanks.

 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
 ---

 Key: HBASE-10632
 URL: https://issues.apache.org/jira/browse/HBASE-10632
 Project: HBase
 Issue Type: Bug
 Components: Region Assignment
 Affects Versions: hbase-10070
 Reporter: Nick Dimiduk
 Assignee: Enis Soztutar
 Fix For: 0.96.2, 0.98.1, 0.99.0, hbase-10070

 Attachments: hbase-10632_v1.patch


 Discovered while running IntegrationTestBigLinkedList. Region 
 24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
 hor13n13. During the process an exception is thrown.
 {noformat}
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH for hor13n19.gq1.ygridcore.net,60020,1393341563552
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] handler.ServerShutdownHandler: Reassigning 7 region(s) that hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) that were opening on this server)
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] handler.ServerShutdownHandler: Reassigning region with rs = {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node if exists
 2014-02-25 15:30:42,623 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] master.RegionStates: Transitioned {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, server=hor13n19.gq1.ygridcore.net,60020,1393341563552} to {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 2014-02-25 15:30:42,623 DEBUG [AM.ZK.Worker-pool2-t46] master.AssignmentManager: Znode IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf. deleted, state: {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 ...
 2014-02-25 15:30:43,993 ERROR [MASTER_SERVER_OPERATIONS-hor13n12:6-4] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
 java.lang.ArrayIndexOutOfBoundsException: 0
   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.init(BaseLoadBalancer.java:250)
   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:921)
   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.roundRobinAssignment(BaseLoadBalancer.java:860)
   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2482)
   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:282)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}
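
The `ArrayIndexOutOfBoundsException: 0` in the trace above means index 0 was read from a zero-length array inside `BaseLoadBalancer$Cluster.init`, reached from `roundRobinAssignment`. This thread does not say which input was empty, and the sketch below is not the attached patch. It only illustrates the general defensive pattern of checking for empty inputs before indexing. `RoundRobinSketch` and `roundRobin` are hypothetical names, not HBase APIs, and returning an empty plan for empty inputs is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names are HBase APIs.
class RoundRobinSketch {

    /**
     * Spreads regions across servers in round-robin order.
     * Returns an empty plan when either input is null or empty, so the
     * caller can decide to retry later instead of indexing into a
     * zero-length collection.
     */
    static Map<String, List<String>> roundRobin(List<String> regions, List<String> servers) {
        if (regions == null || regions.isEmpty() || servers == null || servers.isEmpty()) {
            return Collections.emptyMap();
        }
        // LinkedHashMap keeps servers in the order they were given.
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            String server = servers.get(i % servers.size());
            plan.get(server).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("r1", "r2", "r3");
        // Two servers: regions alternate between them.
        System.out.println(roundRobin(regions, Arrays.asList("s1", "s2")));
        // No servers: an empty plan rather than an exception.
        System.out.println(roundRobin(regions, Collections.<String>emptyList()));
    }
}
```

With two servers the three regions alternate between them. With no servers the call returns an empty map, which a caller could treat as "nothing assigned yet" rather than letting an exception abort the handler partway through.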
 After that, the region is left in limbo and is never reassigned.
 {noformat}
 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] master.HMaster: Client=hrt_qa//68.142.246.29 move hri=IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf., src=hor13n19.gq1.ygridcore.net,60020,1393341563552, dest=hor13n13.gq1.ygridcore.net,60020,139334275, running balancer
 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] master.AssignmentManager: Ignored moving region not assigned: {ENCODED = 24d68aa7239824e42390a77b7212fcbf, NAME = 'IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.', STARTKEY = '\x80\x06\x1A', ENDKEY = ''}, {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 ...
 2014-02-25 15:35:26,586 DEBUG [hor13n12.gq1.ygridcore.net,6,1393341917402-BalancerChore] master.HMaster: Not running balancer because 1 region(s) in transition: {24d68aa7239824e42390a77b7212fcbf={24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}}
 ...
 2014-02-25 15:35:51,945 DEBUG [FifoRpcScheduler.handler1-thread-16] master.HMaster: 

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918713#comment-13918713
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-TRUNK #4973 (See [https://builds.apache.org/job/HBase-TRUNK/4973/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573723)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918868#comment-13918868
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-0.98 #196 (See [https://builds.apache.org/job/HBase-0.98/196/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573725)
* /hbase/branches/0.98
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918918#comment-13918918
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #184 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/184/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573725)
* /hbase/branches/0.98
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918926#comment-13918926
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #106 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/106/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573723)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918946#comment-13918946
 ] 

Hudson commented on HBASE-10632:


SUCCESS: Integrated in hbase-0.96 #324 (See [https://builds.apache.org/job/hbase-0.96/324/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573727)
* /hbase/branches/0.96
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918945#comment-13918945
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in hbase-0.96-hadoop2 #223 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/223/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment (enis: rev 1573727)
* /hbase/branches/0.96
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-02 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917718#comment-13917718
 ] 

Enis Soztutar commented on HBASE-10632:
---

bq. Will there be patches here or should we have backport issues?
Attached patch applies to trunk, 0.98 and 0.96. 
Will commit it tomorrow. 

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-02 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917721#comment-13917721
 ] 

Andrew Purtell commented on HBASE-10632:
---

Thanks Enis, very much appreciated

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916174#comment-13916174
 ] 

Enis Soztutar commented on HBASE-10632:
---

This should affect 0.98 and 0.96 code lines as well. 

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13916759#comment-13916759
 ] 

Andrew Purtell commented on HBASE-10632:
---

bq. This should affect 0.98 and 0.96 code lines as well.

Will there be patches here or should we have backport issues?

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915439#comment-13915439
 ] 

Ted Yu commented on HBASE-10632:
---

lgtm
nit:
{code}
+(serversToIndex.get(loc.get(i).getHostAndPort()) == null ? -1 : serversToIndex.get(loc.get(i).getHostAndPort()));
{code}
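
The nit is not spelled out in the comment. One possible reading is that the quoted expression looks up the same key twice. A minimal sketch of a single-lookup form follows, assuming serversToIndex is a Map<String, Integer> keyed by host and port. The class ServerIndexLookup, the method indexOf, and the sample keys are hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: resolves a host-and-port key to a
// server index with a single map lookup, returning -1 when the key is unknown.
final class ServerIndexLookup {
    private ServerIndexLookup() {}

    static int indexOf(Map<String, Integer> serversToIndex, String hostAndPort) {
        // One lookup instead of two; the boxed result may be null.
        Integer index = serversToIndex.get(hostAndPort);
        // -1 marks "no such server", matching the fallback in the quoted line.
        return index == null ? -1 : index;
    }

    public static void main(String[] args) {
        // Sample keys are illustrative only.
        Map<String, Integer> serversToIndex = new HashMap<>();
        serversToIndex.put("hor13n13.gq1.ygridcore.net:60020", 0);
        System.out.println(indexOf(serversToIndex, "hor13n13.gq1.ygridcore.net:60020"));
        System.out.println(indexOf(serversToIndex, "hor13n19.gq1.ygridcore.net:60020"));
    }
}
```

Holding the result in a local variable also shortens the expression, which may help with the 100-column limit.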

 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
 ---

 Key: HBASE-10632
 URL: https://issues.apache.org/jira/browse/HBASE-10632
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: hbase-10070
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
 Fix For: 0.99.0, hbase-10070

 Attachments: hbase-10632_v1.patch


 Discovered while running IntegrationTestBigLinkedList. Region 
 24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
 hor13n13. During the process an exception is thrown.
 {noformat}
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf 
 state=OPENING, ts=1393342207107, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH 
 for hor13n19.gq1.ygridcore.net,60020,1393341563552
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 handler.ServerShutdownHandler: Reassigning 7 region(s) that 
 hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) 
 that were opening on this server)
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 handler.ServerShutdownHandler: Reassigning region with rs = 
 {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node 
 if exists
 2014-02-25 15:30:42,623 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 master.RegionStates: Transitioned {24d68aa7239824e42390a77b7212fcbf 
 state=OPENING, ts=1393342207107, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552} to 
 {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 2014-02-25 15:30:42,623 DEBUG [AM.ZK.Worker-pool2-t46] 
 master.AssignmentManager: Znode 
 IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.
  deleted, state: {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, 
 ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 ...
 2014-02-25 15:30:43,993 ERROR [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 executor.EventHandler: Caught throwable while processing event 
 M_SERVER_SHUTDOWN
 java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.init(BaseLoadBalancer.java:250)
   at 
 org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:921)
   at 
 org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.roundRobinAssignment(BaseLoadBalancer.java:860)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2482)
   at 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:282)
   at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}
 After that, region is left in limbo and is never reassigned.
 {noformat}
 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
 master.HMaster: Client=hrt_qa//68.142.246.29 move 
 hri=IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.,
  src=hor13n19.gq1.ygridcore.net,60020,1393341563552, 
 dest=hor13n13.gq1.ygridcore.net,60020,139334275, running balancer
 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
 master.AssignmentManager: Ignored moving region not assigned: {ENCODED = 
 24d68aa7239824e42390a77b7212fcbf, NAME = 
 'IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.',
  STARTKEY = '\x80\x06\x1A', ENDKEY = ''}, {24d68aa7239824e42390a77b7212fcbf 
 state=OFFLINE, ts=1393342242623, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
 ...
 2014-02-25 15:35:26,586 DEBUG 
 [hor13n12.gq1.ygridcore.net,6,1393341917402-BalancerChore] 
 master.HMaster: Not running balancer because 1 region(s) in transition: 
 {24d68aa7239824e42390a77b7212fcbf={24d68aa7239824e42390a77b7212fcbf 
 state=OFFLINE, ts=1393342242623, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552}}
 

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915501#comment-13915501
 ] 

Hadoop QA commented on HBASE-10632:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12631658/hbase-10632_v1.patch
  against trunk revision .
  ATTACHMENT ID: 12631658

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  regionsPerServer[serverIndex] = new int[entry.getValue().size() + regionsPerServer[serverIndex].length];
+(serversToIndex.get(loc.get(i).getHostAndPort()) == null ? -1 : serversToIndex.get(loc.get(i).getHostAndPort()));

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//console

This message is automatically generated.
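
The first line flagged by the lineLengths check above allocates an array sized to the existing length plus the number of new entries. Why the patch does this is not stated in this message. The sketch below only illustrates that sizing arithmetic, under the assumption that existing entries should be kept and new ones appended. RegionArrayGrowth and append are hypothetical names, not HBase code.

```java
import java.util.Arrays;

// Hypothetical helper, not HBase code: appends new region indices to a
// server's existing array instead of overwriting it.
final class RegionArrayGrowth {
    private RegionArrayGrowth() {}

    static int[] append(int[] existing, int[] added) {
        // New length = what is already there + what is being added.
        int[] merged = Arrays.copyOf(existing, existing.length + added.length);
        // Copy the added entries into the tail of the enlarged array.
        System.arraycopy(added, 0, merged, existing.length, added.length);
        return merged;
    }

    public static void main(String[] args) {
        int[] merged = append(new int[] {4, 7}, new int[] {9});
        System.out.println(Arrays.toString(merged));
    }
}
```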

 Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
 ---

 Key: HBASE-10632
 URL: https://issues.apache.org/jira/browse/HBASE-10632
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: hbase-10070
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
 Fix For: 0.99.0, hbase-10070

 Attachments: hbase-10632_v1.patch


 Discovered while running IntegrationTestBigLinkedList. Region 
 24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
 hor13n13. During the process an exception is thrown.
 {noformat}
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf 
 state=OPENING, ts=1393342207107, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH 
 for hor13n19.gq1.ygridcore.net,60020,1393341563552
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 handler.ServerShutdownHandler: Reassigning 7 region(s) that 
 hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) 
 that were opening on this server)
 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
 handler.ServerShutdownHandler: Reassigning region with rs = 
 {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, 
 server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node 
 if exists
 2014-02-25 15:30:42,623