Ted Yu created HBASE-15056:
------------------------------

             Summary: Split fails with KeeperException$NoNodeException when 
namespace quota is enabled
                 Key: HBASE-15056
                 URL: https://issues.apache.org/jira/browse/HBASE-15056
             Project: HBase
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Ted Yu


When trying to port HBASE-15044 to branch-1, I found that region split fails 
with KeeperException$NoNodeException when namespace quota is enabled and the 
split would exceed allocated quota.

Here is related test output:
{code}
2015-12-30 09:50:16,764 WARN  [RS:0;10.22.24.71:65256-splits-1451497816754] 
zookeeper.ZKAssign(885): regionserver:65256-0x151f402c21c0001, 
quorum=localhost:57662, baseZNode=/hbase Attempt to transition the 
unassigned node for 17fc99c04a8027b653e9d5ef5d578461 from 
RS_ZK_REQUEST_REGION_SPLIT to RS_ZK_REQUEST_REGION_SPLIT failed, the node 
existed and was in the expected state but then when setting data it no longer 
existed
2015-12-30 09:50:16,866 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] 
zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, 
quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode 
/hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does 
not exist (not necessarily an error)
2015-12-30 09:50:16,866 INFO  [RS:0;10.22.24.71:65256-splits-1451497816754] 
regionserver.SplitRequest(97): Running rollback/cleanup of failed split of 
np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.;
 Failed getting SPLITTING znode on 
np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
java.io.IOException: Failed getting SPLITTING znode on 
np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
  at 
org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:200)
  at 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:381)
  at 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:277)
  at 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:560)
  at 
org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
  at 
org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Data is null, splitting node 
17fc99c04a8027b653e9d5ef5d578461 no longer exists
  at 
org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:166)
  ... 8 more
2015-12-30 09:50:16,869 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] 
zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, 
quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode 
/hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does 
not exist (not necessarily an error)
2015-12-30 09:50:16,869 INFO  [RS:0;10.22.24.71:65256-splits-1451497816754] 
coordination.ZKSplitTransactionCoordination(268): Failed cleanup zk node of 
np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:452)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:381)
  at 
org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.clean(ZKSplitTransactionCoordination.java:261)
  at 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:948)
  at 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:900)
  at 
org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:99)
  at 
org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
{code}
Strangely, no QuotaExceededException is thrown.
In the master branch, the quota check is done in response to 
TransitionCode.READY_TO_SPLIT.
In branch-1, that code path is not executed when useZKForAssignment is 
true (the default case):
{code}
    } else if (services != null && !useZKForAssignment) {
      if (!services.reportRegionStateTransition(TransitionCode.READY_TO_SPLIT,
          parent.getRegionInfo(), hri_a, hri_b)) {
{code}
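
To illustrate the gating described above, here is a minimal, self-contained 
sketch. It is not the HBase code: SplitQuotaSketch, FakeMaster, prepareSplit 
and the returned strings are hypothetical names made up for this example, and 
the first branch is only a placeholder for the ZK-based path. The sketch 
assumes the quota check is reachable only through the READY_TO_SPLIT report, 
so it is never consulted when useZKForAssignment is true.

```java
// Simplified model of the branch-1 gating; all names here are hypothetical.
final class SplitQuotaSketch {
    enum TransitionCode { READY_TO_SPLIT }

    /** Stand-in for the master side, where the namespace quota is checked. */
    static final class FakeMaster {
        private final int maxRegions;
        private int regionCount;

        FakeMaster(int maxRegions, int regionCount) {
            this.maxRegions = maxRegions;
            this.regionCount = regionCount;
        }

        /** Returns false when the split would exceed the region quota. */
        boolean reportRegionStateTransition(TransitionCode code) {
            if (code == TransitionCode.READY_TO_SPLIT) {
                // A split replaces one parent with two daughters: net +1 region.
                if (regionCount + 1 > maxRegions) {
                    return false;
                }
                regionCount++;
            }
            return true;
        }
    }

    /** Mirrors the shape of the quoted condition, not its real surroundings. */
    static String prepareSplit(FakeMaster services, boolean useZKForAssignment) {
        if (useZKForAssignment) {
            // Placeholder for the ZK-based path: the quota check is never reached.
            return "ZK_PATH_NO_QUOTA_CHECK";
        } else if (services != null && !useZKForAssignment) {
            if (!services.reportRegionStateTransition(TransitionCode.READY_TO_SPLIT)) {
                return "REJECTED_BY_MASTER";
            }
            return "ACCEPTED";
        }
        return "NO_SERVICES";
    }

    public static void main(String[] args) {
        // Quota of 2 regions, 2 already present: the split should be refused.
        System.out.println(prepareSplit(new FakeMaster(2, 2), true));
        System.out.println(prepareSplit(new FakeMaster(2, 2), false));
    }
}
```

With the flag set to true, the over-quota split is not rejected by the quota 
check in this model, which matches the missing QuotaExceededException noted 
above. With the flag set to false, the same split is refused.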



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
