[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-11-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492605#comment-13492605
 ] 

stack commented on HBASE-6088:
--

The delete of the zk node should provide the sequence id so we don't delete a 
znode we were not responsible for making.
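
A minimal sketch of the idea, assuming "sequence id" here means the znode version that ZooKeeper checks on conditional operations. The class below is a toy in-memory model, not the ZooKeeper client or HBase code; every name in it (VersionedDeleteSketch, create, setData, delete, exists) is hypothetical and only illustrates why a delete that carries the expected version cannot remove a node that somebody else has since written.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of version-checked znode deletes.
// All names are hypothetical; this is not the ZooKeeper or HBase API.
class VersionedDeleteSketch {

    // Raised when the caller's expected version differs from the stored one.
    static class BadVersionException extends RuntimeException {
        BadVersionException(String message) { super(message); }
    }

    // Raised when the path is absent.
    static class NoNodeException extends RuntimeException {
        NoNodeException(String message) { super(message); }
    }

    // path -> current version; node data is omitted because only versions matter here.
    private final Map<String, Integer> versions = new HashMap<>();

    // Creates the node and returns its initial version, 0.
    int create(String path) {
        if (versions.containsKey(path)) {
            throw new IllegalStateException("Node already exists: " + path);
        }
        versions.put(path, 0);
        return 0;
    }

    // Models a write to the node: checks the version, then bumps it.
    int setData(String path, int expectedVersion) {
        check(path, expectedVersion);
        int next = versions.get(path) + 1;
        versions.put(path, next);
        return next;
    }

    // Deletes the node only if expectedVersion matches the stored version.
    void delete(String path, int expectedVersion) {
        check(path, expectedVersion);
        versions.remove(path);
    }

    boolean exists(String path) {
        return versions.containsKey(path);
    }

    // An expected version of -1 matches any version, mirroring ZooKeeper's convention.
    private void check(String path, int expectedVersion) {
        Integer current = versions.get(path);
        if (current == null) {
            throw new NoNodeException("No node: " + path);
        }
        if (expectedVersion != -1 && expectedVersion != current) {
            throw new BadVersionException(
                "Expected version " + expectedVersion + " but found " + current);
        }
    }

    public static void main(String[] args) {
        VersionedDeleteSketch store = new VersionedDeleteSketch();
        String path = "/hbase/unassigned/bd1079bf948c672e493432020dc0e144";

        int mine = store.create(path);      // version we are responsible for
        store.setData(path, mine);          // another writer touches the node

        try {
            store.delete(path, mine);       // stale version: must be rejected
        } catch (BadVersionException e) {
            System.out.println("Delete refused: " + e.getMessage());
        }
        System.out.println("Node still present: " + store.exists(path));
    }
}
```

With an unconditional delete (version -1 in this model) the rollback would remove whatever node sits at the path, including one it did not create; passing the remembered version turns that case into an explicit failure the caller can handle.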

This seems radical:

{code}
+} catch (KeeperException.NoNodeException nn) {
+  if (abort) {
+    server.abort("Failed cleanup of " + hri.getRegionNameAsString(), nn);
+  }
{code}

If we don't find our splitting node we abort?

Good test.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.92.2, 0.94.1, 0.96.0

 Attachments: addendum_6088_94.patch, HBASE-6088_92.patch, 
 HBASE-6088_94_2.patch, HBASE-6088_94_3.patch, HBASE-6088_94.patch, 
 HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch, HBASE-6088_trunk_4.patch, 
 HBASE-6088_trunk.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
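 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)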

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287242#comment-13287242
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-0.92-security #109 (See 
[https://builds.apache.org/job/HBase-0.92-security/109/])
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343819)

 Result = SUCCESS
ramkrishna : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287168#comment-13287168
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-0.94-security #33 (See 
[https://builds.apache.org/job/HBase-0.94-security/33/])
HBASE-6088 Addendum fixes testSplitBeforeSettingSplittingInZK (Ram) 
(Revision 1344113)
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343818)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java

ramkrishna : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-30 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285423#comment-13285423
 ] 

Zhihong Yu commented on HBASE-6088:
---

Addendum integrated to 0.94 branch.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-30 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285427#comment-13285427
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

@Ted
Thanks a lot.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285472#comment-13285472
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-0.94 #230 (See 
[https://builds.apache.org/job/HBase-0.94/230/])
HBASE-6088 Addendum fixes testSplitBeforeSettingSplittingInZK (Ram) 
(Revision 1344113)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284664#comment-13284664
 ] 

rajeshbabu commented on HBASE-6088:
---

Updated the patches as per Ted's comments.
Attached a patch for 0.92.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284817#comment-13284817
 ] 

Zhihong Yu commented on HBASE-6088:
---

Minor comment:
{code}
+   * This test case to test the znode is deleted(if created) or not in roll back.
{code}
'case to test' should be 'case is to test'
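
For readers without the patch at hand, the behaviour that javadoc describes (a znode created for the split is deleted again on rollback, so a later split attempt is not blocked by "already exists") can be sketched roughly as below. This is only an illustration under loud assumptions: the `HashMap` is an in-memory stand-in for the `/hbase/unassigned` znodes, and `SplitRollbackSketch`, `createSplittingNode`, `trySplit` and both boolean flags are hypothetical names, not HBase or ZooKeeper API and not the code in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitRollbackSketch {
    /** In-memory stand-in for the /hbase/unassigned znodes (assumption, not the ZooKeeper API). */
    static final Map<String, String> znodes = new HashMap<>();

    /** Creates the splitting znode; fails if a node for the region already exists. */
    static void createSplittingNode(String region) {
        if (znodes.containsKey(region)) {
            throw new IllegalStateException("Node " + region + " already exists");
        }
        znodes.put(region, "RS_ZK_SPLITTING");
    }

    /**
     * Runs one split attempt and returns true if it went through.
     * failAfterCreate simulates an error raised after the znode was created.
     * cleanupOnRollback toggles whether rollback deletes the znode this attempt created.
     */
    static boolean trySplit(String region, boolean failAfterCreate, boolean cleanupOnRollback) {
        boolean created = false;
        try {
            createSplittingNode(region);
            created = true;
            if (failAfterCreate) {
                throw new IllegalStateException("Failed setting SPLITTING znode on " + region);
            }
            znodes.remove(region); // split finished; the sketch simply drops the znode
            return true;
        } catch (IllegalStateException e) {
            // Rollback: delete the znode only if this attempt created it.
            if (created && cleanupOnRollback) {
                znodes.remove(region);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        String region = "bd1079bf948c672e493432020dc0e144";

        // Rollback leaves the znode behind: the retry fails on "already exists".
        trySplit(region, true, false);
        System.out.println("retry without cleanup: " + trySplit(region, false, false));

        // Rollback deletes the znode it created: the retry can proceed.
        znodes.clear();
        trySplit(region, true, true);
        System.out.println("retry with cleanup: " + trySplit(region, false, true));
    }
}
```

The `created` flag is the point of the sketch: rollback touches the znode only when this attempt made it, which also avoids deleting a node that belongs to someone else.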

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_92.patch, HBASE-6088_94.patch, 
 HBASE-6088_94_2.patch, HBASE-6088_94_3.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch, HBASE-6088_trunk_4.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284907#comment-13284907
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

@Ted
The committed patch addresses your last comment.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6088_92.patch, HBASE-6088_94.patch, 
 HBASE-6088_94_2.patch, HBASE-6088_94_3.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch, HBASE-6088_trunk_4.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284960#comment-13284960
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-TRUNK #2943 (See 
[https://builds.apache.org/job/HBase-TRUNK/2943/])
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343817)

 Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284971#comment-13284971
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-0.94 #223 (See 
[https://builds.apache.org/job/HBase-0.94/223/])
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343818)

 Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285034#comment-13285034
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-0.92 #425 (See 
[https://builds.apache.org/job/HBase-0.92/425/])
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343819)

 Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285278#comment-13285278
 ] 

Hudson commented on HBASE-6088:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #31 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/31/])
HBASE-6088 Region splitting not happened for long time due to ZK exception 
while creating RS_ZK_SPLITTING node (Rajesh) (Revision 1343817)

 Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java





[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-29 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285378#comment-13285378
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

The failure of 
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK
in 0.94 is due to the order in which the newly added test case runs.
In trunk it's not a problem, as the master-restart-related test cases are at the 
beginning.  Will provide an addendum for 0.94 soon.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6088_92.patch, HBASE-6088_94.patch, 
 HBASE-6088_94_2.patch, HBASE-6088_94_3.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch, HBASE-6088_trunk_4.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284461#comment-13284461
 ] 

rajeshbabu commented on HBASE-6088:
---

Updated patch for 94.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284470#comment-13284470
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

I am planning to commit this. Please provide your comments.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284595#comment-13284595
 ] 

rajeshbabu commented on HBASE-6088:
---

@Ted

bq.The second state should be RS_ZK_REGION_SPLIT.
As part of createNodeSplitting we transition from RS_ZK_REGION_SPLITTING to 
RS_ZK_REGION_SPLITTING.

{code}
  int transitionNodeSplitting(final ZooKeeperWatcher zkw, final HRegionInfo parent,
      final ServerName serverName, final int version)
      throws KeeperException, IOException {
    return ZKAssign.transitionNode(zkw, parent, serverName,
        EventType.RS_ZK_REGION_SPLITTING, EventType.RS_ZK_REGION_SPLITTING, version);
  }
{code}

That's why I mentioned RS_ZK_REGION_SPLITTING as the second state.
Please correct me if I am wrong.
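
To illustrate the version-checked transition discussed above, here is a minimal sketch using a toy in-memory node. It is not the ZooKeeper or ZKAssign API: ToyZNode, Snapshot, and the -1 failure value are assumptions made for illustration. It shows that a transition from RS_ZK_REGION_SPLITTING to the same state still advances the node version, and that a caller holding a stale version is rejected, which is roughly the situation behind the BadVersion error in the log.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy in-memory model of a versioned node. Hypothetical; not the ZooKeeper or ZKAssign API.
class ToyZNode {
  // Immutable snapshot of the node's state together with its version.
  static final class Snapshot {
    final String state;
    final int version;

    Snapshot(String state, int version) {
      this.state = state;
      this.version = version;
    }
  }

  private final AtomicReference<Snapshot> current;

  ToyZNode(String initialState) {
    current = new AtomicReference<>(new Snapshot(initialState, 0));
  }

  // Moves the node to newState only if both the state and the version match.
  // Returns the new version, or -1 (an assumption of this sketch) on mismatch.
  int transition(String expectedState, String newState, int expectedVersion) {
    Snapshot seen = current.get();
    if (!seen.state.equals(expectedState) || seen.version != expectedVersion) {
      return -1;
    }
    Snapshot next = new Snapshot(newState, seen.version + 1);
    return current.compareAndSet(seen, next) ? next.version : -1;
  }
}

class ToyZNodeDemo {
  public static void main(String[] args) {
    String splitting = "RS_ZK_REGION_SPLITTING";
    ToyZNode node = new ToyZNode(splitting);

    // Same state on both sides: the data is unchanged but the version advances.
    int fresh = node.transition(splitting, splitting, 0);
    // A second attempt with the old version 0 is rejected.
    int stale = node.transition(splitting, splitting, 0);

    System.out.println("fresh=" + fresh + " stale=" + stale);
  }
}
```

In this toy model the first call returns version 1 and the second returns -1, because the caller is still presenting version 0.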


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-28 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284601#comment-13284601
 ] 

Zhihong Yu commented on HBASE-6088:
---

Makes sense.
Please make other suggested changes to comment.


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-27 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284154#comment-13284154
 ] 

rajeshbabu commented on HBASE-6088:
---

@Zhihong Yu
Thanks for the review.

bq.The above two methods can remain static, right ?

I removed static because static methods cannot be overridden; a subclass method 
with the same signature only hides the superclass method. Which version of a hidden 
method gets invoked depends on whether it is invoked from the superclass or the subclass.

Correct me if I am wrong.
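
The difference between hiding and overriding can be seen in a small sketch. The classes below are hypothetical and are not HBase code; they only show that a static call made inside the superclass stays bound to the superclass version, while an instance call dispatches to the subclass.

```java
// Hypothetical classes for illustration only; these are not HBase classes.
class BaseTxn {
  static String staticStep() {
    return "base-static";
  }

  String instanceStep() {
    return "base-instance";
  }

  // The static call is bound to BaseTxn at compile time;
  // the instance call dispatches on the runtime type of this object.
  String run() {
    return staticStep() + "," + instanceStep();
  }
}

class FailingTxn extends BaseTxn {
  // Hides BaseTxn.staticStep(); it does not override it.
  static String staticStep() {
    return "sub-static";
  }

  @Override
  String instanceStep() {
    return "sub-instance";
  }
}

class HidingDemo {
  public static void main(String[] args) {
    System.out.println(new BaseTxn().run());    // base-static,base-instance
    System.out.println(new FailingTxn().run()); // base-static,sub-instance
  }
}
```

Because run() lives in the superclass, the subclass's static method is never reached through it, so a test subclass could not change that step's behavior while the method stayed static.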


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-27 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284181#comment-13284181
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

@Ted
Basically we changed the static method to an instance method so that we can test the 
split transaction.  In fact, although it was static, its scope was private 
previously.
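
As a rough sketch of why an overridable instance method helps testing, the example below uses a hypothetical ToySplit class (not the HBase SplitTransaction API). A test subclass overrides the node-creation step to throw, and the test can then check that the failure path ran. The method names and the rolledBack flag are assumptions made for illustration.

```java
import java.io.IOException;

// Hypothetical stand-in for a split transaction; not the HBase SplitTransaction API.
class ToySplit {
  boolean rolledBack = false;

  // Package-private instance method so that a test subclass can override it.
  void createSplittingNode() throws IOException {
    // A real implementation would talk to ZooKeeper here.
  }

  // Returns true on success; on failure records that cleanup ran and returns false.
  boolean execute() {
    try {
      createSplittingNode();
      return true;
    } catch (IOException e) {
      rolledBack = true; // stand-in for rollback/cleanup of the failed split
      return false;
    }
  }
}

class ToySplitFaultDemo {
  // Builds a split whose node-creation step always fails.
  static ToySplit failingSplit() {
    return new ToySplit() {
      @Override
      void createSplittingNode() throws IOException {
        throw new IOException("simulated failure creating SPLITTING node");
      }
    };
  }

  public static void main(String[] args) {
    ToySplit split = failingSplit();
    boolean ok = split.execute();
    System.out.println("ok=" + ok + " rolledBack=" + split.rolledBack);
  }
}
```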


[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-27 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284185#comment-13284185
 ] 

Zhihong Yu commented on HBASE-6088:
---

Thanks for the explanation.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_94.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch, HBASE-6088_trunk_3.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-26 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284009#comment-13284009
 ] 

rajeshbabu commented on HBASE-6088:
---

Latest patch for trunk and 94

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_94.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284012#comment-13284012
 ] 

Hadoop QA commented on HBASE-6088:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12529874/HBASE-6088_94.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2010//console

This message is automatically generated.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_94.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13284014#comment-13284014
 ] 

Zhihong Yu commented on HBASE-6088:
---

{code}
-  private static int createNodeSplitting(final ZooKeeperWatcher zkw,
-  final HRegionInfo region, final ServerName serverName)
-  throws KeeperException, IOException {
+  int createNodeSplitting(final ZooKeeperWatcher zkw, final HRegionInfo region,
...
+  int transitionNodeSplitting(final ZooKeeperWatcher zkw, final HRegionInfo parent,
{code}
The above two methods can remain static, right?
For transitionNodeSplitting(), please complete the following javadoc tag:
{code}
+   * @return
{code}
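
To illustrate both review points, here is a minimal sketch, not the patch itself. It assumes the method follows the convention of returning the new node version, or -1 when the transition is not applied. The class StaticTransitionSketch, the in-memory Node store, and the "SPLITTING"/"SPLIT" strings are hypothetical stand-ins for ZooKeeperWatcher, HRegionInfo and the real event types. Because the method reads only its parameters, nothing forces it to be an instance method.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not HBase code. */
class StaticTransitionSketch {

  /** Toy versioned node, standing in for a znode and its stat version. */
  static final class Node {
    String state;
    int version;

    Node(String state, int version) {
      this.state = state;
      this.version = version;
    }
  }

  /**
   * Moves the node at {@code path} from state {@code from} to state {@code to},
   * but only if its version still equals {@code expectedVersion}.
   *
   * @param store in-memory map used here in place of ZooKeeper
   * @param path path of the node to transition
   * @param from state the node is expected to be in
   * @param to state to move the node to
   * @param expectedVersion version the caller last saw
   * @return the node version after the transition, or -1 if it was not applied
   */
  static int transitionNode(Map<String, Node> store, String path,
      String from, String to, int expectedVersion) {
    Node node = store.get(path);
    if (node == null || !node.state.equals(from) || node.version != expectedVersion) {
      return -1; // missing node, unexpected state, or stale version
    }
    node.state = to;
    node.version++;
    return node.version;
  }

  /** Runs one successful transition, then repeats it with the stale version. */
  static int[] demo() {
    Map<String, Node> store = new HashMap<>();
    store.put("/unassigned/region", new Node("SPLITTING", 0));
    int first = transitionNode(store, "/unassigned/region", "SPLITTING", "SPLIT", 0);
    int second = transitionNode(store, "/unassigned/region", "SPLITTING", "SPLIT", 0);
    return new int[] {first, second};
  }

  public static void main(String[] args) {
    int[] result = demo();
    System.out.println(result[0] + " " + result[1]);
  }
}
```

The second call fails because the caller still passes the old version, which is the same kind of check that surfaces as BadVersion in the log above.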

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_94.patch, HBASE-6088_trunk.patch, 
 HBASE-6088_trunk_2.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-25 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283490#comment-13283490
 ] 

rajeshbabu commented on HBASE-6088:
---

Attached patch for trunk. Please review and provide suggestions/comments.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
 Fix For: 0.94.1

 Attachments: HBASE-6088_trunk.patch


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
 {noformat}
 Due to the above exception, region splitting was 

[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283544#comment-13283544
 ] 

Hadoop QA commented on HBASE-6088:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12529722/HBASE-6088_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 33 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1995//console

This message is automatically generated.

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6088_trunk.patch



[jira] [Commented] (HBASE-6088) Region splitting not happened for long time due to ZK exception while creating RS_ZK_SPLITTING node

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282518#comment-13282518
 ] 

ramkrishna.s.vasudevan commented on HBASE-6088:
---

When we start the split, creating the zk node takes two steps:
- Create the node.
- Write the data RS_ZK_SPLITTING into it.
Only after both steps have completed do we make a journal entry.
So if writing the data fails, rollback cannot clean up the node, 
because no journal entry records that the node was created.
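
The sequence above can be simulated with a small, self-contained sketch. This is only an illustration of the failure mode and of one possible remedy (recording a journal entry before the node is created); it is not the attached patch. FakeZk, the JournalEntry names, and trySplit are hypothetical, and an in-memory map stands in for ZooKeeper.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical simulation only; this is not HBase code. */
class SplitJournalSketch {

  enum JournalEntry { STARTED_SPLITTING, SET_SPLITTING_IN_ZK }

  /** Toy replacement for ZooKeeper: path -> data. */
  static final class FakeZk {
    final Map<String, String> nodes = new HashMap<>();
    boolean failNextSetData = false;

    void create(String path) {
      if (nodes.containsKey(path)) {
        throw new IllegalStateException("Node " + path + " already exists");
      }
      nodes.put(path, "");
    }

    void setData(String path, String data) {
      if (failNextSetData) {
        failNextSetData = false;
        throw new IllegalStateException("BadVersion for " + path);
      }
      nodes.put(path, data);
    }

    void delete(String path) {
      nodes.remove(path);
    }
  }

  /** Returns true if the node was created and the splitting state written. */
  static boolean trySplit(FakeZk zk, String path, boolean journalBeforeCreate) {
    List<JournalEntry> journal = new ArrayList<>();
    if (journalBeforeCreate) {
      journal.add(JournalEntry.STARTED_SPLITTING);
    }
    try {
      zk.create(path); // step 1: create the node
    } catch (IllegalStateException e) {
      return false; // a leftover node blocks the split; nothing of ours to undo
    }
    try {
      zk.setData(path, "RS_ZK_SPLITTING"); // step 2: write the state
      journal.add(JournalEntry.SET_SPLITTING_IN_ZK);
      return true;
    } catch (IllegalStateException e) {
      // Rollback can only undo what the journal says was started.
      if (journal.contains(JournalEntry.STARTED_SPLITTING)) {
        zk.delete(path);
      }
      return false;
    }
  }

  /** Fails the first write, then reports whether a second split attempt succeeds. */
  static boolean retryAfterFailedWrite(boolean journalBeforeCreate) {
    FakeZk zk = new FakeZk();
    String path = "/unassigned/region";
    zk.failNextSetData = true;
    trySplit(zk, path, journalBeforeCreate); // first attempt fails at step 2
    return trySplit(zk, path, journalBeforeCreate);
  }

  public static void main(String[] args) {
    System.out.println("journal after write:   " + retryAfterFailedWrite(false));
    System.out.println("journal before create: " + retryAfterFailedWrite(true));
  }
}
```

With the journal entry made only after the write, the retry hits "already exists", matching the log in the description. With an entry made before the create, rollback removes the node and the retry goes through. A real fix would also need to make sure rollback deletes only a node this server created.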

  Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 

 Key: HBASE-6088
 URL: https://issues.apache.org/jira/browse/HBASE-6088
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
 Fix For: 0.94.1


 Region splitting not happened for long time due to ZK exception while 
 creating RS_ZK_SPLITTING node
 {noformat}
 2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session 
 timed out, have not heard from server in 26668ms for sessionid 
 0x1377a75f41d0012, closing socket connection and attempting reconnect
 2012-05-24 01:45:41,464 WARN 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper exception: 
 org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
 = ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 {noformat}
 {noformat}
 2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: 
 cleanupCurrentWriter  waiting for transactions to get synced  total 189377 
 synced till here 189365
 2012-05-24 01:45:48,474 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 java.io.IOException: Failed setting SPLITTING znode on 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
   at 
 org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.zookeeper.KeeperException$BadVersionException: 
 KeeperErrorCode = BadVersion for 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
   at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   ... 5 more
 2012-05-24 01:45:48,476 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Successful rollback of 
 failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
 {noformat}
 {noformat}
 2012-05-24 01:47:28,141 ERROR 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is 
 not a retry
 2012-05-24 01:47:28,142 INFO 
 org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup 
 of failed split of 
 ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed 
 create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
 java.io.IOException: Failed create of ephemeral 
 /hbase/unassigned/bd1079bf948c672e493432020dc0e144
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
   at 
 org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
   at