[
https://issues.apache.org/jira/browse/HBASE-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117881#comment-13117881
]
Ming Ma commented on HBASE-4497:
--------------------------------
1. Agreed, the checkAndPut solution is good enough. I am just trying to find
holes here. :)
2. Does the RS need access to the global counter? If it is only for the region
assignment scenario, I agree there is no such need. I initially thought of it as
a "region operation id", where the RS would also get a new ID whenever the state
changes, for example from OPENING to OPENED. We would use such a counter to
track every region state change in the system.
3. Persistent vs. ephemeral. I thought there would be a way to provide a
reliable ZK-based AtomicLong that survives HBase and ZK restarts. That would
give us a good picture of the event sequence in the system. Performance isn't
that important, given that region state changes happen infrequently.
4. Unique vs. monotonically increasing. For this issue, a unique number seems to
be fine. I thought it might be used in other contexts to track event sequence,
in which case monotonically increasing is better, since comparing two values
indicates their order in time. It doesn't have to be sequential.
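
To make points 3 and 4 concrete, here is a minimal sketch of an id generator that is monotonically increasing but not necessarily sequential. It is an in-memory stand-in only: the class name RegionOperationIds, the lastPersisted seed, and the hint parameter are all hypothetical, and a real version would keep the last value in ZK so it survives restarts.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a persistent, ZK-backed counter.
final class RegionOperationIds {
    private final AtomicLong last;

    // lastPersisted is assumed to be the last id recovered after a restart.
    RegionOperationIds(long lastPersisted) {
        this.last = new AtomicLong(lastPersisted);
    }

    // Returns an id strictly greater than every id handed out before.
    // The hint (for example a timestamp) may push the id forward, so ids
    // are increasing but can have gaps.
    long next(long hint) {
        while (true) {
            long prev = last.get();
            long candidate = Math.max(prev + 1, hint);
            if (last.compareAndSet(prev, candidate)) {
                return candidate;
            }
            // Another thread won the race; retry with the new value.
        }
    }

    // With monotonic ids, comparing two values gives their order in time.
    static boolean happenedBefore(long a, long b) {
        return a < b;
    }
}
```

A merely unique id (say, a random UUID) would be enough to tell two assignment attempts apart, but only the monotonic form lets happenedBefore answer which state change came first.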
> If region opening fails after updating META HBCK reports it as inconsistent
> and scanning the region throws NSRE
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-4497
> URL: https://issues.apache.org/jira/browse/HBASE-4497
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Priority: Critical
>
> As per the discussion in the mail chain "HBCK reporting of possible mismatch
> in RS assignment" this JIRA is created.
> Consider two RSs: RS1 and RS2.
> A region tries to open on RS1, but it takes a while. RS1 has still not
> updated META or transitioned the node from OPENING to OPENED,
> so on timeout the region is assigned to RS2. RS2 successfully updates META and
> opens the region.
> Now RS1 tries to act on the region by first updating META and then
> transitioning the node from OPENING to OPENED.
> The transition from OPENING to OPENED by RS1 will fail, but the META entry
> will have RS1 as the latest location.
> Now HBCK reports this as an inconsistency, and if we try to scan the region we
> get NotServingRegionException.
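
The race in the description, and why a check-and-put style guard closes it, can be sketched with an in-memory map standing in for META. This is an illustration under assumptions, not the actual patch: the class MetaRaceSketch and its methods are hypothetical, and ConcurrentMap.replace(key, expected, value) plays the role of checkAndPut by writing only if the row still holds the value the writer expects.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of META: region name -> server currently recorded.
final class MetaRaceSketch {
    private final ConcurrentMap<String, String> meta = new ConcurrentHashMap<>();

    MetaRaceSketch(String region, String initialServer) {
        meta.put(region, initialServer);
    }

    // Unconditional write: the last writer wins, even if it is stale.
    void blindUpdate(String region, String server) {
        meta.put(region, server);
    }

    // Conditional write: succeeds only if the row still holds 'expected'.
    boolean guardedUpdate(String region, String expected, String server) {
        return meta.replace(region, expected, server);
    }

    String locationOf(String region) {
        return meta.get(region);
    }
}
```

With blindUpdate, RS2 writes first and the slow RS1 then overwrites the row, leaving META pointing at a server that never finished opening the region. With guardedUpdate, both servers expect the pre-assignment value, so RS2 succeeds and the late write from RS1 is rejected.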