Gopinathan A created HBASE-6088:
-----------------------------------
Summary: Region splitting not happened for long time due to ZK
exception while creating RS_ZK_SPLITTING node
Key: HBASE-6088
URL: https://issues.apache.org/jira/browse/HBASE-6088
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Fix For: 0.94.1
Region splitting did not happen for a long time because of a ZooKeeper exception
while creating the RS_ZK_SPLITTING node.
{noformat}
2012-05-24 01:45:41,363 INFO org.apache.zookeeper.ClientCnxn: Client session
timed out, have not heard from server in 26668ms for sessionid
0x1377a75f41d0012, closing socket connection and attempting reconnect
2012-05-24 01:45:41,464 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/unassigned/bd1079bf948c672e493432020dc0e144
{noformat}
{noformat}
2012-05-24 01:45:43,300 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog:
cleanupCurrentWriter waiting for transactions to get synced total 189377
synced till here 189365
2012-05-24 01:45:48,474 INFO org.apache.hadoop.hbase.regionserver.SplitRequest:
Running rollback/cleanup of failed split of
ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed
setting SPLITTING znode on
ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
java.io.IOException: Failed setting SPLITTING znode on
ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:242)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.zookeeper.KeeperException$BadVersionException:
KeeperErrorCode = BadVersion for
/hbase/unassigned/bd1079bf948c672e493432020dc0e144
at org.apache.zookeeper.KeeperException.create(KeeperException.java:115)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:321)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:659)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:811)
at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:747)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.transitionNodeSplitting(SplitTransaction.java:919)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:869)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
... 5 more
2012-05-24 01:45:48,476 INFO org.apache.hadoop.hbase.regionserver.SplitRequest:
Successful rollback of failed split of
ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.
{noformat}
{noformat}
2012-05-24 01:47:28,141 ERROR
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
/hbase/unassigned/bd1079bf948c672e493432020dc0e144 already exists and this is
not a retry
2012-05-24 01:47:28,142 INFO org.apache.hadoop.hbase.regionserver.SplitRequest:
Running rollback/cleanup of failed split of
ufdr,011365398471659,1337823505339.bd1079bf948c672e493432020dc0e144.; Failed
create of ephemeral /hbase/unassigned/bd1079bf948c672e493432020dc0e144
java.io.IOException: Failed create of ephemeral
/hbase/unassigned/bd1079bf948c672e493432020dc0e144
at org.apache.hadoop.hbase.regionserver.SplitTransaction.createNodeSplitting(SplitTransaction.java:865)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:239)
at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:450)
at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
{noformat}
Because of the above exceptions, region splitting failed continuously for more
than 5 hours.
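
One possible reading of the logs above, offered as an assumption and not as a confirmed diagnosis: a versioned update was applied on the ZooKeeper side while the client saw a connection loss, the retry then failed with BadVersion, and the znode was still present when the next split attempt tried to create it. The sketch below replays that assumed sequence against a tiny in-memory store. `FakeStore`, `replay`, and the `rollbackDeletesNode` flag are hypothetical names invented for illustration; none of this is HBase or ZooKeeper code.

```java
import java.util.HashMap;
import java.util.Map;

public class ZnodeVersionSketch {

    /** Hypothetical in-memory stand-in for a versioned znode store; not the ZooKeeper API. */
    static class FakeStore {
        private final Map<String, Integer> versions = new HashMap<>();

        /** Creates a node at version 0; fails if the node is already present. */
        void create(String path) {
            if (versions.containsKey(path)) {
                throw new IllegalStateException("NodeExists for " + path);
            }
            versions.put(path, 0);
        }

        /** Conditional update: succeeds only if the expected version matches the stored one. */
        int setData(String path, int expectedVersion) {
            Integer current = versions.get(path);
            if (current == null) {
                throw new IllegalStateException("NoNode for " + path);
            }
            if (current != expectedVersion) {
                throw new IllegalStateException("BadVersion for " + path);
            }
            versions.put(path, current + 1);
            return current + 1;
        }

        void delete(String path) {
            versions.remove(path);
        }
    }

    /**
     * Replays the assumed sequence and returns the error message of the
     * retried update and of the second create ("none" if a step succeeded).
     */
    public static String[] replay(boolean rollbackDeletesNode) {
        FakeStore store = new FakeStore();
        String path = "/hbase/unassigned/bd1079bf948c672e493432020dc0e144";
        String retryError = "none";
        String secondCreateError = "none";

        store.create(path);
        // Assumption: this update is applied, but the client never sees the reply.
        store.setData(path, 0);
        try {
            // Blind retry with the now stale expected version.
            store.setData(path, 0);
        } catch (IllegalStateException e) {
            retryError = e.getMessage();
        }

        // Assumption: whether rollback removes the node decides the next outcome.
        if (rollbackDeletesNode) {
            store.delete(path);
        }
        try {
            // The next split attempt starts by creating the node again.
            store.create(path);
        } catch (IllegalStateException e) {
            secondCreateError = e.getMessage();
        }
        return new String[] { retryError, secondCreateError };
    }

    public static void main(String[] args) {
        String[] result = replay(false);
        System.out.println("retried update: " + result[0]);
        System.out.println("second create:  " + result[1]);
    }
}
```

Calling `replay(true)` shows the second create succeeding once the leftover node is removed. Under the assumptions above, that would be consistent with the repeated "already exists" failures in the last log excerpt.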