[jira] [Commented] (HBASE-4796) Race between SplitRegionHandlers for the same region kills the master

Hadoop QA (Commented) (JIRA) Wed, 16 Nov 2011 12:34:14 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151489#comment-13151489
 ]


Hadoop QA commented on HBASE-4796:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503920/4796.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -163 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 51 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/266//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/266//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/266//console

This message is automatically generated.
                
> Race between SplitRegionHandlers for the same region kills the master
> ---------------------------------------------------------------------
>
>                 Key: HBASE-4796
>                 URL: https://issues.apache.org/jira/browse/HBASE-4796
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: 4796.txt
>
>
> I just saw that multiple SplitRegionHandlers can be created for the same 
> region because of the RS tickling, but it becomes deadly when more than one 
> of them tries to delete the znode at the same time:
> {quote}
> 2011-11-16 02:25:28,778 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_SPLIT, server=sv4r7s38,62023,1321410237387, 
> region=f80b6a904048a99ce88d61420b8906d1
> 2011-11-16 02:25:28,780 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_SPLIT, server=sv4r7s38,62023,1321410237387, 
> region=f80b6a904048a99ce88d61420b8906d1
> 2011-11-16 02:25:28,796 DEBUG 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handling SPLIT 
> event for f80b6a904048a99ce88d61420b8906d1; deleting node
> 2011-11-16 02:25:28,798 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:62003-0x132f043bbde094b Deleting existing unassigned node for 
> f80b6a904048a99ce88d61420b8906d1 that is in expected state RS_ZK_REGION_SPLIT
> 2011-11-16 02:25:28,804 DEBUG 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handling SPLIT 
> event for f80b6a904048a99ce88d61420b8906d1; deleting node
> 2011-11-16 02:25:28,806 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:62003-0x132f043bbde094b Deleting existing unassigned node for 
> f80b6a904048a99ce88d61420b8906d1 that is in expected state RS_ZK_REGION_SPLIT
> 2011-11-16 02:25:28,821 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:62003-0x132f043bbde094b Successfully deleted unassigned node for 
> region f80b6a904048a99ce88d61420b8906d1 in expected state RS_ZK_REGION_SPLIT
> 2011-11-16 02:25:28,821 INFO 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT 
> report); 
> parent=TestTable,0000006304,1321409743253.f80b6a904048a99ce88d61420b8906d1. 
> daughter a=TestTable,0000006304,1321410325564.e0f5d201683bcabe14426817224334b8.
> daughter b=TestTable,0000007054,1321410325564.1b82eeb5d230c47ccc51c08256134839.
> 2011-11-16 02:25:28,829 WARN 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/f80b6a904048a99ce88d61420b8906d1 already deleted, and this 
> is not a retry
> 2011-11-16 02:25:28,830 FATAL org.apache.hadoop.hbase.master.HMaster: Error 
> deleting SPLIT node in ZK for transition ZK node 
> (f80b6a904048a99ce88d61420b8906d1)
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /hbase/unassigned/f80b6a904048a99ce88d61420b8906d1
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>       at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
>       at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:107)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:884)
>       at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:506)
>       at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:453)
>       at 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler.process(SplitRegionHandler.java:95)
>       at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> {quote}
> Stack and I concluded that we just need to handle that NoNodeException, 
> because handleSplitReport only works on in-memory state.
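
The handling proposed in the description could look roughly like the sketch below. It is not the attached patch. The class, the in-memory node set, and the local NoNodeException are hypothetical stand-ins for ZKAssign.deleteNode and KeeperException.NoNodeException, chosen so the example runs without a ZooKeeper ensemble. It only illustrates the idea that a handler which loses the race treats the missing znode as already cleaned up instead of aborting the master.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the HBase code base.
final class SplitNodeCleanupSketch {

    /** Local stand-in for org.apache.zookeeper.KeeperException.NoNodeException. */
    static final class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    // In-memory stand-in for the znodes under /hbase/unassigned.
    private final Set<String> nodes = ConcurrentHashMap.newKeySet();

    void createNode(String path) {
        nodes.add(path);
    }

    /** Mimics a delete call that throws when the node is already gone. */
    void deleteNode(String path) throws NoNodeException {
        if (!nodes.remove(path)) {
            throw new NoNodeException(path);
        }
    }

    /**
     * Deletes the SPLIT node, tolerating the case where another handler
     * for the same region deleted it first.
     *
     * @return true if this call deleted the node, false if it was already gone
     */
    boolean deleteSplitNodeTolerantly(String path) {
        try {
            deleteNode(path);
            return true;
        } catch (NoNodeException e) {
            // Assumption: a missing node means a concurrent handler already
            // finished the cleanup, so this is logged and ignored, not fatal.
            System.out.println("Node " + path + " already deleted; ignoring");
            return false;
        }
    }

    public static void main(String[] args) {
        SplitNodeCleanupSketch zk = new SplitNodeCleanupSketch();
        String path = "/hbase/unassigned/f80b6a904048a99ce88d61420b8906d1";
        zk.createNode(path);
        // Two handlers for the same region both try to delete the node.
        System.out.println("first handler deleted: " + zk.deleteSplitNodeTolerantly(path));
        System.out.println("second handler deleted: " + zk.deleteSplitNodeTolerantly(path));
    }
}
```

In this sketch the second call returns false rather than propagating the exception, which is the behavior the description asks for. Whether the real fix should also avoid creating the duplicate handler in the first place is a separate question that the sketch does not address.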

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
