[ 
https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168727#comment-15168727
 ] 

ramkrishna.s.vasudevan commented on HBASE-15340:
------------------------------------------------

bq. When HBASE-15325 is resolved, there is no data miss, however, the returned 
data may combined from different row-level transactions which is unexpected for 
application. 
Ya got it now.

> Partial row result of scan may return data violates the row-level transaction 
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-15340
>                 URL: https://issues.apache.org/jira/browse/HBASE-15340
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners, Transactions/MVCC
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>
> There are cases the region sever will return partial row result, such as the 
> client set batch for scan or configured size limit reached. In these 
> situations, the client may return data that violates the row-level 
> transaction to the application. The following steps show the problem:
> {code}
> // assume there is a test table 'test_table' with one family 'F' and one 
> region 'region'. 
> // meanwhile there are two region servers 'rsA' and 'rsB'.
> 1. Let 'region' firstly located in 'rsA' and put one row with two columns 
> 'c1' and 'c2' as:
>     > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1'
> 2. Start a client to scan 'test_table', with scan.setBatch(1) and 
> scan.setCaching(1). The client will get one column as : {column='F:c1' and 
> value='value1'} in the first rpc call after scanner created, and the result 
> will be returned to application.
> 3. Before the client issues the next request, the 'region' was moved to 'rsB' 
> and accepted another mutations for the two columns 'c1' and 'c2' as:
>     > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2'
> 4. Then, the client  will receive a RegionMovedException when issuing next 
> request and will retry to open scanner on 'rsB'. The newly opened scanner 
> will higher mvcc than old data so that could read out column as : { 
> column='F:c2' with value='value2'} and return the result to application.
>    Therefore, the application will get data as:
> 'row'    column='F:c1'   value='value1'
> 'row'    column='F:c2',  value='value2'
>    The returned data is combined from two different mutations and violates 
> the row-level transaction.
> {code}
> The reason is that the newly opened scanner after region moved will get a 
> different mvcc. I am not sure whether this result is by design for scan if 
> partial row result is allowed. However, such row result combined from 
> different transactions may make the application have unexpected state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to