[ 
https://issues.apache.org/jira/browse/HBASE-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974513#action_12974513
 ] 

Hudson commented on HBASE-3362:
-------------------------------

Integrated in HBase-TRUNK #1697 (See 
[https://hudson.apache.org/hudson/job/HBase-TRUNK/1697/])
    

> If .META. offline between OPENING and OPENED, then wrong server location in 
> .META. is possible
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3362
>                 URL: https://issues.apache.org/jira/browse/HBASE-3362
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> This is a good one.  It happened to me testing OOME in split logging.
> * Balancer moves region to new location, regionserver X.
> * New location regionserver X successfully opens the region and then goes to 
> update .META.
> * At this point, the server carrying .META. crashes.
> * Regionserver X is stuck waiting on .META. to come back online.  It takes so 
> long that the master times out the region-in-transition.
> * Master assigns the region elsewhere to regionserver Y
> * It opens successfully on regionserver Y and then it also parks waiting on 
> .META. coming online
> * .META. comes online
> * The two servers X and Y race to update .META.
> I saw a case where server X's edit went in after server Y's edit, which means 
> that lookups in .META. get the wrong server.  HBCK can detect this situation.
> RegionServer X, when it wakes up, correctly notices that it has lost control 
> of the region but the damage is done -- where damage is the .META. edit.
> Chatting with Jon, he suggested that regionserver X should 'rollback' the 
> .META. edit -- do an explicit delete of what it added.  This would work I 
> think but, chatting more, I'll make a fix that keeps updating the zookeeper 
> OPENING state while the edit goes on in a separate thread.  Our continuous 
> setting of OPENING will make it so region-in-transition does not time out.
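
The fix described in the last paragraph can be sketched roughly as below. This is only an illustration of the idea, not the HBase patch: the class OpeningHeartbeatSketch, the method openWithHeartbeat, and the two Runnable callbacks (one standing in for the .META. edit, one for re-setting the OPENING state in zookeeper) are all hypothetical names, and no real HBase or ZooKeeper API is called.

```java
// Hypothetical sketch: names below are illustrative, not HBase or ZooKeeper APIs.
class OpeningHeartbeatSketch {

    /**
     * Runs metaEdit in a separate thread and, until that thread finishes,
     * calls refreshOpening roughly every periodMs milliseconds so that a
     * watcher of the OPENING state keeps seeing fresh updates.
     * Returns how many times refreshOpening was called (at least once).
     */
    static int openWithHeartbeat(Runnable metaEdit, Runnable refreshOpening, long periodMs) {
        if (periodMs <= 0) {
            // Thread.join(0) would wait forever, so a positive period is required.
            throw new IllegalArgumentException("periodMs must be positive");
        }
        Thread editor = new Thread(metaEdit, "meta-edit");
        editor.start();
        int refreshes = 0;
        try {
            do {
                refreshOpening.run();   // stand-in for re-setting OPENING in zookeeper
                refreshes++;
                editor.join(periodMs);  // wait up to one period for the edit to finish
            } while (editor.isAlive());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop refreshing; the edit thread is left to finish.
            Thread.currentThread().interrupt();
        }
        return refreshes;
    }
}
```

With something like this, a regionserver parked on a slow .META. edit would keep refreshing OPENING, so the master would have no reason to time out the region-in-transition and reassign it while the first open is still in progress.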

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
