Lars Hofhansl created HBASE-10257:
-------------------------------------

             Summary: [0.94] Master aborts due to assignment race
                 Key: HBASE-10257
                 URL: https://issues.apache.org/jira/browse/HBASE-10257
             Project: HBase
          Issue Type: Sub-task
            Reporter: Lars Hofhansl
            Assignee: Lars Hofhansl
             Fix For: 0.94.16


# When a region server attempts to open a region and fails, it takes the respective znode to PENDING_OPEN followed by FAILED_OPEN in quick succession.
# The HMaster now gets two notifications from ZK.
# If the znode transitioned to FAILED_OPEN before the HMaster could react to PENDING_OPEN, there will be two ClosedRegionHandlers running.

That race causes this condition:
{code}
java.lang.IllegalStateException: Unexpected state : testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. state=PENDING_OPEN, ts=1372891751912, server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
        at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
        at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
{code}
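
For illustration only, here is a toy model of how two handlers acting on the same region could produce an exception of this shape. This is not HBase code: the class, the enum, and the rule that only CLOSED or OFFLINE may move to OFFLINE are assumptions made for this sketch. It also assumes that the first handler's reassignment is what leaves the region in PENDING_OPEN before the second handler runs.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model of the race; all names are hypothetical and this is not HBase code. */
final class AssignmentRaceSketch {

    enum State { OFFLINE, PENDING_OPEN, CLOSED }

    /** Minimal stand-in for the master's in-memory view of one region. */
    static final class Region {
        // Assumption: only these states may be forced to OFFLINE before a reassign.
        private static final Set<State> OFFLINABLE =
            EnumSet.of(State.CLOSED, State.OFFLINE);

        private State state;

        Region(State initial) { this.state = initial; }

        State state() { return state; }

        /** Refuses to offline the region from an unexpected state. */
        void setOffline() {
            if (!OFFLINABLE.contains(state)) {
                throw new IllegalStateException(
                    "Unexpected state : state=" + state
                        + " .. Cannot transit it to OFFLINE.");
            }
            state = State.OFFLINE;
        }

        void markPendingOpen() { state = State.PENDING_OPEN; }
    }

    /** What each handler does in this sketch: offline the region, then hand it out again. */
    static void handleClosed(Region region) {
        region.setOffline();
        region.markPendingOpen();
    }

    /** Runs two handlers back to back; returns the second handler's failure message, or null. */
    static String runTwoHandlers() {
        Region region = new Region(State.CLOSED);
        handleClosed(region);      // first handler reassigns; region is now PENDING_OPEN
        try {
            handleClosed(region);  // second handler finds PENDING_OPEN and is rejected
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoHandlers());
    }
}
```

In this model the handlers run sequentially to keep the outcome deterministic. In the real master they run on executor threads, so the interleaving depends on timing.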




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
