[ 
https://issues.apache.org/jira/browse/HBASE-20182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16396632#comment-16396632
 ] 

Duo Zhang commented on HBASE-20182:
-----------------------------------

OK I saw the code in CatalogJanitor...

So we need to add a filter when locating a region in meta, so that the scan 
skips split parent regions? And this should go into all active branches.

And is it possible to introduce a new family to store the parents for split and 
merge? I think this would make the CatalogJanitor more lightweight, since it 
would not need to scan the catalog family any more. And then we could completely 
remove the parent's content in the catalog family after a split.

What do you think, [~stack]? I think the latter can be a new issue, and it 
does not need to be done before the 2.0.0 release.

Thanks.



> Can not locate region after split and merge
> -------------------------------------------
>
>                 Key: HBASE-20182
>                 URL: https://issues.apache.org/jira/browse/HBASE-20182
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Duo Zhang
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: HBASE-20182-UT.patch
>
>
> When implementing the serial replication feature in HBASE-20046, I found that 
> when splitting a region, we do not remove the parent region; instead we 
> mark it offline.
> And when locating a region, we scan only one row, so if we land on the 
> offlined region, the lookup fails.
> This does not happen for a split alone, since one of the new daughter regions 
> has the same start row as the parent region and a greater timestamp, so a 
> reverse scan always hits the daughter first.
> But if we also consider merge, then bad things happen. Suppose we have two 
> regions A and B; we split B into C and D, and then merge A and C into E. 
> Ideally the regions should be E and D, but the regions in meta will actually 
> be E, B and D, and they all have different start rows. If you use a row 
> within the range of the old region C, we will always locate B and throw 
> an exception.
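
The scenario in the description can be reproduced with a toy model. The sketch below is not HBase code: the `Region` class, the `locate` function, the `skip_split_parents` flag and the single-letter row keys are all hypothetical, and the real meta row key also carries the table name and region id. It only assumes that a lookup is a reverse scan returning the entry with the greatest start key not exceeding the row, and that an offlined split parent cannot serve requests.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: str                  # inclusive start row
    end: str                    # exclusive end row; "" means no upper bound
    split_parent: bool = False  # offlined parent left in meta after a split

    def contains(self, row: str) -> bool:
        return self.start <= row and (self.end == "" or row < self.end)


def locate(meta, row, skip_split_parents=False):
    """Toy reverse scan over meta entries sorted by start key."""
    starts = [r.start for r in meta]
    i = bisect_right(starts, row) - 1  # greatest start key <= row
    while i >= 0:
        region = meta[i]
        if region.split_parent:
            if not skip_split_parents:
                # One-row scan: we stop at the offlined parent and fail.
                raise LookupError(f"row {row!r} resolved to offline region {region.name}")
            i -= 1  # hypothetical filter: keep scanning backwards
            continue
        if not region.contains(row):
            raise LookupError(f"no online region contains row {row!r}")
        return region
    raise LookupError(f"no region found for row {row!r}")


# A = ["", "b"), B = ["b", ""); split B into C = ["b", "c") and D = ["c", "");
# merge A and C into E = ["", "c"). Meta is left with E, B (offline) and D.
META = [
    Region("E", "", "c"),
    Region("B", "b", "", split_parent=True),
    Region("D", "c", ""),
]
```

With this table, `locate(META, "bb")` raises because the scan stops at B, while `locate(META, "bb", skip_split_parents=True)` skips B and returns E, which is the behaviour the proposed filter is meant to give. Rows outside the old range of C, such as `"d"`, resolve to D either way.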



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
