[ 
https://issues.apache.org/jira/browse/HBASE-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116981#comment-13116981
 ] 

Ming Ma commented on HBASE-4497:
--------------------------------

Using startcode and timestamp is a good idea. However, I want to confirm 
whether there is a case where it won't work. Given that there is no global 
clock, the timestamp generated by the RS hosting the .META. region at that 
moment might not be unique if the .META. region is later moved to another RS. 
So the concern raised earlier is possible: "startcode and timestamp is what i 
initially thought of. seems like there could be some weird situations. for 
example, what is to say that the server already in META didn't somehow become 
the new assignment destination?". Here is how:

1. For a given region, the .META. table has RS1 as the RS serverName and T1 as 
the timestamp value: {RS1, T1}.
2. .META. is moved to another RS whose clock is behind that of the original RS 
that wrote {RS1, T1}.
3. RS2 starts openRegion first; it has an older ZK node version to check. RS1 
starts openRegion later; it has an up-to-date ZK node version.
4. Both RS2 and RS1 are about to do checkAndPut on the .META. table. Both will 
use {RS1, T1} as the condition for checkAndPut.
5. RS1 updates it first and succeeds. There is a chance that after the update 
the value is still {RS1, T1}, because the new timestamp is generated by an RS 
whose clock is behind.
6. RS2 updates it next and also succeeds, given that {RS1, T1} hasn't changed 
even though RS1 made an update earlier.
7. RS1 has the up-to-date ZK node version, so it continues and succeeds with 
the rest of the open operation. The region is considered OPENED from the AM's 
point of view.
8. RS2 has the older ZK node version, so it will fail later when it tries to 
update the ZK node. The region won't be opened on RS2.
9. In the .META. table, the region is on RS2.
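
The sequence above can be replayed with a small in-memory model. This is only 
a sketch: MetaRowSketch and its checkAndPut are hypothetical stand-ins for the 
.META. row and a value-only compare-and-set, not the HBase client API, and the 
timestamps are made-up numbers.

```java
import java.util.Objects;

// Toy model of one .META. row whose location cell holds {server, timestamp}.
// Hypothetical names; this is not the HBase client API.
class MetaRowSketch {
    private String server;
    private long timestamp;

    MetaRowSketch(String server, long timestamp) {
        this.server = server;
        this.timestamp = timestamp;
    }

    // Value-only compare-and-set: succeeds when the stored pair equals the expected pair.
    synchronized boolean checkAndPut(String expServer, long expTs,
                                     String newServer, long newTs) {
        if (!Objects.equals(server, expServer) || timestamp != expTs) {
            return false;
        }
        server = newServer;
        timestamp = newTs;
        return true;
    }

    synchronized String server() {
        return server;
    }
}

class ValueOnlyRaceSketch {
    // Replays steps 1, 5 and 6 and returns the server recorded in the row at the end.
    static String replay() {
        long t1 = 100L; // assumed value for T1
        MetaRowSketch row = new MetaRowSketch("RS1", t1); // step 1: {RS1, T1}

        // Step 5: RS1 wins, but the lagging clock on the new .META. host yields T1 again,
        // so the stored value is unchanged after a successful write.
        boolean rs1Ok = row.checkAndPut("RS1", t1, "RS1", t1);

        // Step 6: RS2 uses the same condition {RS1, T1} and still succeeds.
        boolean rs2Ok = row.checkAndPut("RS1", t1, "RS2", t1 + 1);

        System.out.println("RS1 put=" + rs1Ok + ", RS2 put=" + rs2Ok
                + ", META -> " + row.server());
        return row.server();
    }

    public static void main(String[] args) {
        replay(); // prints: RS1 put=true, RS2 put=true, META -> RS2
    }
}
```

Both writes pass the check because the first one leaves the compared value 
unchanged, which is exactly the situation in steps 5 and 6.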


Adding support for a version check in checkAndPut should address this scenario.
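
A rough sketch of what such a version check could look like, again with an 
in-memory model. The per-row version counter and the three-argument 
checkAndPut below are assumptions made for illustration; they are not an 
existing HBase signature.

```java
// Toy .META. row that carries a version bumped on every successful write.
// Hypothetical names; this is not the HBase client API.
class VersionedMetaRowSketch {
    private String server;
    private long version = 1L; // independent of any wall clock

    VersionedMetaRowSketch(String server) {
        this.server = server;
    }

    synchronized long version() {
        return version;
    }

    synchronized String server() {
        return server;
    }

    // Succeeds only if both the stored server and the stored version match.
    synchronized boolean checkAndPut(String expServer, long expVersion, String newServer) {
        if (!server.equals(expServer) || version != expVersion) {
            return false;
        }
        server = newServer;
        version++; // the version changes even when the server value does not
        return true;
    }
}

class VersionCheckSketch {
    // Replays the same race with the version-aware check; returns the final server.
    static String replay() {
        VersionedMetaRowSketch row = new VersionedMetaRowSketch("RS1");
        long seen = row.version(); // both RSs read the row before either writes

        boolean rs1Ok = row.checkAndPut("RS1", seen, "RS1"); // succeeds, version moves on
        boolean rs2Ok = row.checkAndPut("RS1", seen, "RS2"); // stale version, rejected

        System.out.println("RS1 put=" + rs1Ok + ", RS2 put=" + rs2Ok
                + ", META -> " + row.server());
        return row.server();
    }

    public static void main(String[] args) {
        replay(); // prints: RS1 put=true, RS2 put=false, META -> RS1
    }
}
```

Because the version advances on every write regardless of clocks, the second 
writer's condition no longer matches even though the server value it compares 
against is the same.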


Regarding the "region assignment ID" approach:

1. I didn't imply it will only be incremented by the Master. I suggested a 
ZK-based AtomicLong that the Master and all RSs can get hold of. So this could 
be considered a global clock.
2. Such an ID could also help track all the region transition events, 
HBASE-4354.
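
To illustrate the property being relied on, here is a minimal sketch in which 
an in-process java.util.concurrent AtomicLong stands in for the ZK-based 
counter. The class and method names are hypothetical, and a real version would 
have to go through ZooKeeper so that the Master and all RSs share one counter.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical "region assignment ID" source. The local AtomicLong is only a
// stand-in for a counter that would live in ZooKeeper.
class AssignmentIdSketch {
    private final AtomicLong counter = new AtomicLong();

    // Every caller gets a value strictly greater than any value handed out before.
    long nextId() {
        return counter.incrementAndGet();
    }

    // Several threads (playing Master and RSs) draw IDs concurrently;
    // returns how many distinct IDs were handed out.
    static int distinctIds(int threads, int perThread) {
        AssignmentIdSketch ids = new AssignmentIdSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    seen.add(ids.nextId());
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctIds(4, 1000)); // prints 4000: no ID is issued twice
    }
}
```

Unlike a timestamp taken from whichever RS hosts .META., such an ID does not 
depend on any server's clock, so two assignments can never carry the same 
value.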


                
> If region opening fails after updating META HBCK reports it as inconsistent 
> and scanning the region throws NSRE
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4497
>                 URL: https://issues.apache.org/jira/browse/HBASE-4497
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Priority: Critical
>
> As per the discussion in the mail chain "HBCK reporting of possible mismatch 
> in RS assignment" this JIRA is created.
> Consider two RS-> RS1 and RS2.
> A region tries to open in RS1. But it takes a while.  The RS1 has still not 
> updated meta and transitioned the node from OPENING to OPENED
> So timeout assigns the region to RS2.  RS2 successfully updates the META and 
> opens the region.
> Now RS1 tries to act on the region by first updating the META and then 
> transiting the node to OPENING to OPENED.
> RS1 transiting the node to OPENING to OPENED will fail.  But the META entry 
> will have RS1 as the latest.
> Now HBCK reports this as an inconsistency and if we try to scan the Region we 
> get NotServingRegionException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
