[ 
https://issues.apache.org/jira/browse/HBASE-26797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501016#comment-17501016
 ] 

Bryan Beaudreault commented on HBASE-26797:
-------------------------------------------

I wrote an internal acceptance test that points our HBase 1 client at one of 
our HBase 2 clusters and sets things up so that meta contains an orphaned 
rep_barrier row. The test confirms that this change fixes the issue. 
Unfortunately I can't share the code, but it effectively does the following:

 * Create a table with a single CF
 * Set REPLICATION_SCOPE = 1 on the CF, which will result in rep_barrier rows
 * Split the table twice (resulting in 3 regions, with split points '2222' and 
'5555')
 * Run the catalog janitor, which cleans up the parent records, leaving just 
the orphaned rep_barrier rows
 * Merge the first 2 regions, so now there are just 2 regions with split point 
'5555'
 * Run the catalog janitor again, further cleaning up old records
 * Call RegionLocator.getRegionLocation('2222', true)
 ** This fails pre-patch, but succeeds post-patch
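
The final step can be illustrated without a cluster. The following is only a 
toy sketch, not the actual client code or the test described above: meta is 
modeled as a sorted map, the row keys ("t,,3", "t,2222,2", "t,5555,2"), the 
class name, and the method names are all hypothetical, and the 1.x lookup is 
simplified to "take the closest row at or before the lookup key". It shows why 
a lookup for '2222' lands on the orphaned rep_barrier row when the scan is not 
restricted to the catalog family, and why restricting it skips that row.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of hbase:meta: row key -> (column family -> value).
// Row keys, values, and method names are illustrative assumptions.
class MetaLookupSketch {
    static final String CATALOG_FAMILY = "info";
    static final String REP_BARRIER_FAMILY = "rep_barrier";

    // Approximates meta after the merge and the second catalog janitor run.
    // Live rows would normally carry rep_barrier cells too; omitted for brevity.
    static NavigableMap<String, Map<String, String>> sampleMeta() {
        NavigableMap<String, Map<String, String>> meta = new TreeMap<>();
        // Merged region covering ['', '5555').
        meta.put("t,,3", Collections.singletonMap(CATALOG_FAMILY, "merged-region"));
        // Orphan left by the old region that started at '2222'.
        meta.put("t,2222,2", Collections.singletonMap(REP_BARRIER_FAMILY, "barrier"));
        // Region covering ['5555', '').
        meta.put("t,5555,2", Collections.singletonMap(CATALOG_FAMILY, "last-region"));
        return meta;
    }

    // Pre-patch behavior (simplified): read the closest row at or before the
    // lookup key, whatever families it has, and fail if it has no region info.
    static String locateUnfiltered(NavigableMap<String, Map<String, String>> meta,
                                   String lookupKey) throws IOException {
        Map.Entry<String, Map<String, String>> row = meta.floorEntry(lookupKey);
        if (row == null || !row.getValue().containsKey(CATALOG_FAMILY)) {
            throw new IOException("no region info in meta row "
                + (row == null ? "<none>" : row.getKey()));
        }
        return row.getValue().get(CATALOG_FAMILY);
    }

    // Post-patch behavior (simplified): rows without catalog-family cells are
    // invisible to the scan, so the reverse scan continues past the orphan.
    static String locateFiltered(NavigableMap<String, Map<String, String>> meta,
                                 String lookupKey) {
        for (Map<String, String> families
                : meta.headMap(lookupKey, true).descendingMap().values()) {
            String regionInfo = families.get(CATALOG_FAMILY);
            if (regionInfo != null) {
                return regionInfo;
            }
        }
        return null; // no region found
    }

    public static void main(String[] args) {
        NavigableMap<String, Map<String, String>> meta = sampleMeta();
        String lookupKey = "t,2222,9"; // sorts after every row that starts at '2222'
        System.out.println("filtered:   " + locateFiltered(meta, lookupKey));
        try {
            System.out.println("unfiltered: " + locateUnfiltered(meta, lookupKey));
        } catch (IOException e) {
            System.out.println("unfiltered: failed, " + e.getMessage());
        }
    }
}
```

In this model the unfiltered lookup fails on "t,2222,2", while the filtered 
lookup falls through to the merged region, which mirrors the pre-patch and 
post-patch results of the last step.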

> HBase 1.x clients will choke on rep_barrier rows when scanning hbase 2.x meta
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-26797
>                 URL: https://issues.apache.org/jira/browse/HBASE-26797
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>
> In hbase 2.x, support for serial replication included adding a new CF to meta 
> called rep_barrier. When regions are split or merged, these rep_barrier rows 
> will not be cleaned up. Instead there's a ReplicationBarrierCleaner chore 
> which runs every 12 hours. HBase 2.x clients will ignore these rep_barrier 
> rows, per the addFamily call in locateRegionInMeta
> (https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L929).
> Encountering these orphan rep_barrier rows causes the hbase 1.x client to 
> fail when it tries to extract the region location from the meta row
> (https://github.com/apache/hbase/blob/branch-1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java#L1340-L1344).
> This is a non-recoverable exception, so retries will fail and it will 
> eventually bubble up.
> The immediate fix when encountering this is to run {{hbck2 fixMeta}}, but 
> we should fix the hbase 1.x client to similarly filter on the CATALOG_FAMILY 
> to avoid these issues altogether.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
