[ 
https://issues.apache.org/jira/browse/HBASE-27650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault reopened HBASE-27650:
---------------------------------------

Yes, you're right. Re-opening for 2.4 cherry-pick.

> Merging empty regions corrupts meta cache
> -----------------------------------------
>
>                 Key: HBASE-27650
>                 URL: https://issues.apache.org/jira/browse/HBASE-27650
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>             Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
>
> Let's say you have three regions with start keys A, B, C and all are cached 
> in the meta cache. Region B is empty and not getting any requests, and all 3 
> regions are merged together. The new merged region has start key A.
> A user submits a request for row C1, which would previously have gone to 
> region C. That region no longer exists, but the MetaCache still returns the 
> stale region C entry, so the request goes out to the server, which throws 
> NotServingRegionException. 
> That region C is now removed from the cache, and meta is scanned. The meta 
> scan returns the newly merged region A, which is cached into the MetaCache.
> So now we have a MetaCache where A has been updated with the newly merged 
> RegionInfo, B still exists with the old/deleted RegionInfo, and C has been 
> removed.
> A user submits a request for row C1 again. This _should_ go to region A, but 
> we do cache.floorEntry(C1) which returns the old but still cached region B. 
> We have checks in MetaCache which validate the RegionInfo.getEndKey() against 
> the requested row, and that validation fails because C1 is beyond the endkey 
> of the old region. The cached region B result is ignored and cache returns 
> null. Meta is scanned, and returns the new region A, which is cached again.
> Requests to rows C1+ will still succeed... but they will always require a 
> meta scan because the meta cache will always return that old region B which 
> is invalid and doesn't contain the C1+ rows.
> Currently, the only way this resolves is if a request is sent to region B, 
> which triggers a NotServingRegionException that finally clears region B from 
> the cache. At that point, requests for C1+ correctly resolve to region A in 
> the cache.
> I've created a reproducible test case here: 
> [https://gist.github.com/bbeaudreault/c82ff9f8ad0b9424eb987483ede35c12]
> This problem affects both AsyncTable and branch-2's Table.
>  
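
The floorEntry behavior described above can be reproduced without HBase. The following is only a minimal sketch, not the actual MetaCache code: MetaCacheSketch, Region, getCached and locate are hypothetical names, the cache is a plain ConcurrentSkipListMap keyed by start key, the meta scan is simulated by a counter, and the merged region is assumed to run to the end of the table (empty end key).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class MetaCacheSketch {
    /** Hypothetical stand-in for RegionInfo: [startKey, endKey); empty endKey = end of table. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    // Cache keyed by region start key, looked up with floorEntry as in the description.
    final ConcurrentSkipListMap<String, Region> cache = new ConcurrentSkipListMap<>();
    int metaScans = 0;

    /** Returns the cached region for a row, or null if the floor entry does not cover it. */
    Region getCached(String row) {
        Map.Entry<String, Region> e = cache.floorEntry(row);
        if (e == null) {
            return null;
        }
        return e.getValue().contains(row) ? e.getValue() : null;
    }

    /** Cache lookup that falls back to a simulated meta scan returning the merged region. */
    Region locate(String row, Region merged) {
        Region r = getCached(row);
        if (r != null) {
            return r;
        }
        metaScans++;                          // cache miss: "scan meta"
        cache.put(merged.startKey, merged);   // re-cache region A; stale B is untouched
        return merged;
    }

    public static void main(String[] args) {
        MetaCacheSketch c = new MetaCacheSketch();
        Region merged = new Region("A", "");
        // State after the first failed request: A updated, stale B still cached, C evicted.
        c.cache.put("A", merged);
        c.cache.put("B", new Region("B", "C"));

        for (int i = 0; i < 3; i++) {
            c.locate("C1", merged);           // floorEntry(C1) hits stale B every time
        }
        System.out.println("meta scans with stale B: " + c.metaScans);

        c.cache.remove("B");                  // what a request to region B would eventually do
        c.locate("C1", merged);               // now floorEntry(C1) finds merged region A
        System.out.println("meta scans after evicting B: " + c.metaScans);
    }
}
```

In this sketch every lookup of C1 costs one simulated meta scan while the stale B entry is present, and the count stops growing once B is evicted, which matches the behavior reported in the description.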



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
