[ 
https://issues.apache.org/jira/browse/HBASE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272987#comment-13272987
 ] 

Enis Soztutar commented on HBASE-5986:
--------------------------------------

I have implemented approach 1 by adding split daughters to the map returned 
by MetaScanner.allTableRegions(). The problem is that we are then 
returning regions that do not yet exist in the META table, so any 
subsequent getRegion call will fail. 

Thinking a bit more about 3, I think we already guarantee that a split 
parent and its daughters fall into the same META region. Let's say we have two 
regions, region1 and region2, with start keys start_key* and timestamps ts*, 
respectively. 

Before split:
{code} 
<table> <start_key1> <ts1> <encoded_name1>
<table> <start_key2> <ts2> <encoded_name2>
{code} 

Now, if we split region1, daughters will be sorted after region1, and before 
region2:
{code} 
<table> <start_key1> <ts1> <encoded_name1> <offline> <split>
<table> <start_key1> <ts3> <encoded_name3>
<table> <mid_key1> <ts3> <encoded_name4>
<table> <start_key2> <ts2> <encoded_name2>
{code} 

We know this because of the invariants ts3 > ts1 
(SplitTransaction.getDaughterRegionIdTimestamp()) and start_key1 < mid_key1 < 
start_key2. Even if the META table has a region boundary between start_key1 
and start_key2, the daughters will be co-located with the parent. The only 
exception is when the META table is split concurrently with the user table 
and the new META region boundary falls between the parent and the daughters. 
With some effort we can prevent this, but it seems highly unlikely. 
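
The ordering argument above can be sketched in a few lines of Java. This is 
only a simplified model, not HBase's actual META comparator: the MetaRow 
class, the ORDER comparator, and the sample keys and timestamps are all 
hypothetical, and the model assumes rows compare by table name, then start 
key, then region id (timestamp).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MetaRowOrder {
    /** Hypothetical stand-in for a META row key: table, start key, region id (timestamp). */
    static final class MetaRow {
        final String table;
        final String startKey;
        final long regionId;

        MetaRow(String table, String startKey, long regionId) {
            this.table = table;
            this.startKey = startKey;
            this.regionId = regionId;
        }

        @Override
        public String toString() {
            return table + "," + startKey + "," + regionId;
        }
    }

    // Assumed ordering (a simplification): table name, then start key, then region id.
    static final Comparator<MetaRow> ORDER =
        Comparator.comparing((MetaRow r) -> r.table)
                  .thenComparing((MetaRow r) -> r.startKey)
                  .thenComparingLong((MetaRow r) -> r.regionId);

    /** Builds the post-split rows in scrambled order and sorts them with ORDER. */
    static List<MetaRow> sortedAfterSplit() {
        long ts1 = 100, ts2 = 200, ts3 = 300; // invariant: ts3 > ts1
        List<MetaRow> rows = new ArrayList<>(Arrays.asList(
            new MetaRow("t", "m", ts3),   // daughter starting at mid_key1
            new MetaRow("t", "z", ts2),   // region2
            new MetaRow("t", "a", ts3),   // daughter starting at start_key1
            new MetaRow("t", "a", ts1))); // parent (region1)
        rows.sort(ORDER);
        return rows;
    }

    public static void main(String[] args) {
        // Parent first, then both daughters, then region2.
        for (MetaRow row : sortedAfterSplit()) {
            System.out.println(row);
        }
    }
}
```

Under these assumptions the parent sorts first, both daughters follow it, and 
region2 comes last, which matches the layout shown in the second listing.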

So, if my analysis is correct, option 3 seems like the best choice, since it 
will not complicate the meta scan code. The problem is that there is no 
internal API for multi-row transactions other than the coprocessor. Should 
we think about allowing that without coprocessors?
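
To make the intent of option 3 concrete, here is a toy in-memory sketch. It 
is not HBase code and does not use any HBase API: the ToyMeta class, its 
methods, and the row keys are hypothetical, and a single read/write lock 
stands in for whatever mechanism would make the multi-row update atomic. It 
only illustrates that when the parent is marked offline and both daughters 
are added in one step, a scan sees no hole.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ToyMeta {
    /** Toy region covering [startKey, endKey); an empty key means start or end of table. */
    static final class Region {
        final String startKey;
        final String endKey;
        final boolean offline;

        Region(String startKey, String endKey, boolean offline) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.offline = offline;
        }
    }

    private final TreeMap<String, Region> rows = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void put(String rowKey, Region region) {
        lock.writeLock().lock();
        try {
            rows.put(rowKey, region);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Marks the parent offline and adds both daughters under one write lock. */
    void splitAtomically(String parentRow, String midKey,
                         String daughterARow, String daughterBRow) {
        lock.writeLock().lock();
        try {
            Region parent = rows.get(parentRow);
            rows.put(parentRow, new Region(parent.startKey, parent.endKey, true));
            rows.put(daughterARow, new Region(parent.startKey, midKey, false));
            rows.put(daughterBRow, new Region(midKey, parent.endKey, false));
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns online regions in row order; a scan cannot interleave with a split. */
    List<Region> scanOnline() {
        lock.readLock().lock();
        try {
            List<Region> online = new ArrayList<>();
            for (Region r : rows.values()) {
                if (!r.offline) {
                    online.add(r);
                }
            }
            return online;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** True if the online regions do not cover the key space contiguously. */
    static boolean hasHole(List<Region> online) {
        String expectedStart = "";
        for (Region r : online) {
            if (!r.startKey.equals(expectedStart)) {
                return true;
            }
            expectedStart = r.endKey;
        }
        return online.isEmpty() || !expectedStart.isEmpty();
    }

    public static void main(String[] args) {
        ToyMeta meta = new ToyMeta();
        meta.put(",100", new Region("", "z", false));   // region1
        meta.put("z,200", new Region("z", "", false));  // region2
        meta.splitAtomically(",100", "m", ",300", "m,300");
        System.out.println("hole after split: " + hasHole(meta.scanOnline()));
    }
}
```

If the parent were marked offline in one step and the daughters added in a 
later one, a scan between the two steps would return only region2, and 
hasHole would report the gap described in this issue.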

@Lars, does HRegion.mutateRowsWithLock() guarantee that a concurrent scanner 
won't see partial changes? 
                
> Clients can see holes in the META table when regions are being split
> --------------------------------------------------------------------
>
>                 Key: HBASE-5986
>                 URL: https://issues.apache.org/jira/browse/HBASE-5986
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.96.0, 0.94.1
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HBASE-5986-test_v1.patch
>
>
> We found this issue when running large scale ingestion tests for HBASE-5754. 
> The problem is that the .META. table updates are not atomic while splitting a 
> region. In SplitTransaction, there is a time gap between marking the 
> parent offline and adding the daughters to the META table. This can result in 
> clients using MetaScanner or HTable.getStartEndKeys (used by the 
> TableInputFormat) missing regions whose parent has just been made offline but 
> whose daughters are not added yet. 
> This is also related to HBASE-4335. 

