[ 
https://issues.apache.org/jira/browse/HBASE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122445#comment-14122445
 ] 

Jerry He commented on HBASE-11882:
----------------------------------

Thanks, [~ram_krish], [~anoop.hbase], [~tedyu].

Will go back to work on HBASE-11772 now.

> Row level consistency may not be maintained with bulk load and compaction
> -------------------------------------------------------------------------
>
>                 Key: HBASE-11882
>                 URL: https://issues.apache.org/jira/browse/HBASE-11882
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.99.0, 2.0.0
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Critical
>             Fix For: 0.99.0, 2.0.0
>
>         Attachments: HBASE-11882-master-v1.patch, 
> HBASE-11882-master-v2.patch, HBASE-11882-master-v3.patch, 
> TestHRegionServerBulkLoad.java.patch
>
>
> While looking into the TestHRegionServerBulkLoad failure for HBASE-11772, I 
> found the root cause is that row level atomicity may not be maintained when 
> bulk load runs together with compaction.
> TestHRegionServerBulkLoad is used to test bulk load atomicity. The test uses 
> multiple threads to bulk load and scan continuously, and runs compactions 
> periodically. 
> It verifies that row level data is always consistent across column families.
> After HBASE-11591, we added readpoint checks for bulkloaded data using the 
> seqId at the time of bulk load. Now a scanner will not see the data from a 
> bulk load if the scanner's readpoint is earlier than the bulk load seqId.
> Previously, the atomic bulk load result was visible immediately to all 
> scanners.
> The problem is with compaction after bulk load. Compaction does not lock the 
> region, and it is done one store (column family) at a time. It also compacts 
> away the seqId marker of the bulk load.
> Here is an event sequence where the row level consistency is broken.
> 1. A scanner is started to scan a region with cf1 and cf2. The readpoint is 
> 10.
> 2. There is a bulk load that loads into cf1 and cf2. The bulk load seqId is 
> 11. Bulk load is guarded by the region write lock, so it is atomic.
> 3. There is a compaction that compacts cf1. It compacts away the seqId marker 
> of the bulk load.
> 4. The scanner advances to row-1001. It gets the bulk load data for cf1 
> since there is no seqId preventing it. It does not get the bulk load data 
> for cf2 since the scanner's readpoint (10) is less than the bulk load seqId 
> (11).
> Row level consistency is now broken: the scanner sees the bulk load data in 
> cf1 but not in cf2 for the same row.
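
The event sequence above can be reproduced with a toy model. What follows is a
minimal Java sketch, not HBase code: the class and method names
(BulkLoadVisibilitySketch, Store, StoreFile, bulkDataVisibleTo, simulate) are
hypothetical, and it assumes the visibility check is simply "a file is hidden
when its bulk load seqId is greater than the scanner's readpoint", with a
seqId of 0 standing in for "no marker".

```java
import java.util.ArrayList;
import java.util.List;

final class BulkLoadVisibilitySketch {

    /** Toy store file. bulkLoadSeqId == 0 means the file carries no bulk load marker. */
    static final class StoreFile {
        final boolean containsBulkData;
        final long bulkLoadSeqId;

        StoreFile(boolean containsBulkData, long bulkLoadSeqId) {
            this.containsBulkData = containsBulkData;
            this.bulkLoadSeqId = bulkLoadSeqId;
        }
    }

    /** Toy store: one per column family. */
    static final class Store {
        final List<StoreFile> files = new ArrayList<>();

        void bulkLoad(long seqId) {
            files.add(new StoreFile(true, seqId));
        }

        /** Merges all files into one and drops the bulk load marker (the behavior described in step 3). */
        void compact() {
            boolean anyBulkData = false;
            for (StoreFile f : files) {
                anyBulkData |= f.containsBulkData;
            }
            files.clear();
            files.add(new StoreFile(anyBulkData, 0L));
        }

        /** True if a scanner at the given readpoint would see bulk-loaded data in this store. */
        boolean bulkDataVisibleTo(long readPoint) {
            for (StoreFile f : files) {
                if (f.containsBulkData && f.bulkLoadSeqId <= readPoint) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Returns {cf1 before compaction, cf2 before, cf1 after, cf2 after}. */
    static boolean[] simulate() {
        long readPoint = 10L;          // step 1: scanner starts with readpoint 10
        Store cf1 = new Store();
        Store cf2 = new Store();

        cf1.bulkLoad(11L);             // step 2: atomic bulk load into both families, seqId 11
        cf2.bulkLoad(11L);
        boolean cf1Before = cf1.bulkDataVisibleTo(readPoint);
        boolean cf2Before = cf2.bulkDataVisibleTo(readPoint);

        cf1.compact();                 // step 3: only cf1 is compacted; its marker is lost

        boolean cf1After = cf1.bulkDataVisibleTo(readPoint);   // step 4: scanner reads the row
        boolean cf2After = cf2.bulkDataVisibleTo(readPoint);
        return new boolean[] {cf1Before, cf2Before, cf1After, cf2After};
    }

    public static void main(String[] args) {
        boolean[] r = simulate();
        System.out.println("before compaction: cf1=" + r[0] + ", cf2=" + r[1]);
        System.out.println("after compaction:  cf1=" + r[2] + ", cf2=" + r[3]);
    }
}
```

Before the compaction, both families hide the bulk load from the scanner, which
is consistent. After cf1 alone is compacted, the same scanner sees the bulk
load in cf1 but not in cf2, which is the inconsistency the test catches.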



