[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181663#comment-15181663 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

{quote}
The solution of having a client aware readPnt will solve even that(?)
{quote}
It seems [HBASE-13099|https://issues.apache.org/jira/browse/HBASE-13099] has proposed such a solution: https://issues.apache.org/jira/browse/HBASE-13099?focusedCommentId=14337017=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14337017. However, there are cases the solution can't cover (if I am not wrong). For example:
1. The client holds the readPoint taken when the scanner was created on serverA, and the client has read partial row data from serverA.
2. The region is moved to another server, serverB, before the whole row is returned.
3. Before the client creates a new scanner for the row with the readPoint on serverB, new mutations are applied to the region, including deletes for the row, and a major compaction starts and completes.

The major compaction could delete the cells of the row, because the new server can't get a proper smallestReadPoint for the compaction before all ongoing scan requests have arrived. The client then cannot read the remaining cells of the row after the compaction, which breaks per-row atomicity for the scan.

> Partial row result of scan may return data violates the row-level transaction
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-15340
>                 URL: https://issues.apache.org/jira/browse/HBASE-15340
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners, Transactions/MVCC
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>
> There are cases where the region server will return a partial row result, such as
> when the client sets batch for the scan or the configured size limit is reached.
> In these situations, the client may return data that violates the row-level
> transaction to the application. The following steps show the problem:
> {code}
> // assume there is a test table 'test_table' with one family 'F' and one region 'region'.
> // meanwhile there are two region servers 'rsA' and 'rsB'.
> 1. Let 'region' first be located on 'rsA' and put one row with two columns
>    'c1' and 'c2' as:
>    > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1'
> 2. Start a client to scan 'test_table', with scan.setBatch(1) and
>    scan.setCaching(1). The client will get one column as:
>    {column='F:c1' and value='value1'} in the first rpc call after the scanner
>    is created, and the result will be returned to the application.
> 3. Before the client issues the next request, the 'region' is moved to 'rsB'
>    and accepts another mutation for the two columns 'c1' and 'c2' as:
>    > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2'
> 4. Then, the client will receive a RegionMovedException when issuing the next
>    request and will retry to open a scanner on 'rsB'. The newly opened scanner
>    will have a higher mvcc than the old data, so it could read out the column as:
>    {column='F:c2' with value='value2'} and return the result to the application.
>    Therefore, the application will get data as:
>    'row'    column='F:c1', value='value1'
>    'row'    column='F:c2', value='value2'
>    The returned data is combined from two different mutations and violates
>    the row-level transaction.
> {code}
> The reason is that the newly opened scanner after the region move will get a
> different mvcc. I am not sure whether this result is by design for scan if
> partial row results are allowed. However, such a row result combined from
> different transactions may leave the application in an unexpected state.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
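The four steps in the description can be replayed against a toy model. This is only a sketch in Python and does not use HBase: the `Region` class, its `write_point` counter and the `open_scanner`/`read` helpers are hypothetical names made up for illustration, and the region move is modelled simply as "a new scanner is opened and takes the current read point".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    column: str
    value: str
    seq_id: int  # sequence id shared by all cells of one mutation


class Region:
    """Toy region (hypothetical): cell versions plus an MVCC write point."""

    def __init__(self):
        self.cells = []
        self.write_point = 0

    def put(self, columns):
        # All cells of one put get the same sequence id (row-level atomicity).
        self.write_point += 1
        for column, value in columns.items():
            self.cells.append(Cell(column, value, self.write_point))

    def open_scanner(self):
        # A newly opened scanner takes the current write point as its read point.
        return self.write_point

    def read(self, column, read_point):
        # Newest version of the column that is visible at read_point.
        visible = [c for c in self.cells
                   if c.column == column and c.seq_id <= read_point]
        return max(visible, key=lambda c: c.seq_id).value


region = Region()
region.put({"F:c1": "value1", "F:c2": "value1"})   # step 1

read_point_a = region.open_scanner()               # step 2: scanner on rsA
first = region.read("F:c1", read_point_a)          # batch of one column

region.put({"F:c1": "value2", "F:c2": "value2"})   # step 3: after the move

read_point_b = region.open_scanner()               # step 4: reopened on rsB
second = region.read("F:c2", read_point_b)

torn_row = {"F:c1": first, "F:c2": second}
print(torn_row)
```

In this model the reopened scanner gets a higher read point than the first one, so the two halves of the row come from two different puts, which is the mixed result reported above.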
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168832#comment-15168832 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

{quote}
The solution of having a client aware readPnt will solve even that(?)
{quote}
It seems to work IMO. I will try to find whether there is any discussion about this issue.
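The "client aware readPnt" idea discussed in this thread can be sketched in a few lines of Python. This is not the HBASE-13099 patch or any HBase API: `visible_value` and `client_read_point` are hypothetical names, and the sketch assumes the old cell versions are still present on the new server.

```python
def visible_value(cells, column, read_point):
    """Newest value of `column` among cells with seq_id <= read_point."""
    candidates = [(seq_id, value) for (col, value, seq_id) in cells
                  if col == column and seq_id <= read_point]
    return max(candidates)[1]


# (column, value, seq_id); both cells of one put share a seq_id.
cells = [("F:c1", "value1", 1), ("F:c2", "value1", 1)]

# The client remembers the read point it was given when the scan started.
client_read_point = 1
first = visible_value(cells, "F:c1", client_read_point)

# The region moves and a second put commits with seq_id 2.
cells += [("F:c1", "value2", 2), ("F:c2", "value2", 2)]

# On reopen, the client hands its original read point back instead of
# accepting the new server's current one (which would be 2).
second = visible_value(cells, "F:c2", client_read_point)
row = {"F:c1": first, "F:c2": second}
```

Here both columns come from the first put, so the row is consistent. The sketch only holds while the older versions survive; as noted elsewhere in this thread, a major compaction on the new server could remove them before the scanner is reopened.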
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168829#comment-15168829 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

After [HBASE-11544|https://issues.apache.org/jira/browse/HBASE-11544], the maxScannerResultSize of ClientScanner defaults to 2MB. This makes the server return partial results more easily when the size limit is reached, so this issue can happen even when the user has not set batch for the scan.
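To illustrate how a size limit alone can split a row into partial results, here is a small Python sketch. It is a toy under stated assumptions: `chunk_row` is a hypothetical helper, the cell size is approximated as the length of column plus value, and the real server-side accounting in HBase is different.

```python
def chunk_row(cells, max_result_size):
    """Split one row's cells into partial results capped by approximate size."""
    chunks, current, current_size = [], [], 0
    for column, value in cells:
        cell_size = len(column) + len(value)  # crude stand-in for cell size
        # Close the current partial result if this cell would exceed the cap.
        if current and current_size + cell_size > max_result_size:
            chunks.append(current)
            current, current_size = [], 0
        current.append((column, value))
        current_size += cell_size
    if current:
        chunks.append(current)
    return chunks


row = [("F:c1", "a" * 40), ("F:c2", "b" * 40), ("F:c3", "c" * 40)]
partials = chunk_row(row, max_result_size=100)
```

With the cap at 100 the row comes back in two pieces even though no batch was set, and each piece may be fetched by a separate rpc, which is where a region move in between can matter.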
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168785#comment-15168785 ]

Anoop Sam John commented on HBASE-15340:
----------------------------------------

Yep. This is a known issue then.. The solution of having a client aware readPnt will solve even that (?) That work has to consider compatibility as well: old client -> new RS and the reverse.
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168771#comment-15168771 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

[~anoop.hbase], thanks for your comment, I get your point:). Yes, the case you mentioned will happen. The page https://hbase.apache.org/acid-semantics.html explains the consistency guarantee for scan:
{code}
A scan is not a consistent view of a table. Scans do not exhibit snapshot isolation.
Rather, scans have the following properties:
1. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time) [1]
2. A scan will always reflect a view of the data at least as new as the beginning of the scan. This satisfies the visibility guarantees enumerated below.
   1. For example, if client A writes data X and then communicates via a side channel to client B, any scans started by client B will contain data at least as new as X.
   2. A scan _must_ reflect all mutations committed prior to the construction of the scanner, and _may_ reflect some mutations committed subsequent to the construction of the scanner.
3. Scans must include all data written prior to the scan (except in the case where data is subsequently mutated, in which case it _may_ reflect the mutation)
{code}
It seems the consistency guarantee for scan only covers reading data at least as new as the beginning of the scan; there is no guarantee about whether data written concurrently, or written after the beginning of the scan, is read. At the end of the page:
{code}
[1] A consistent view is not guaranteed intra-row scanning -- i.e. fetching a portion of a row in one RPC then going back to fetch another portion of the row in a subsequent RPC. Intra-row scanning happens when you set a limit on how many values to return per Scan#next (See Scan#setBatch(int)).
{code}
This mentions the problem of this jira: a row-level consistent view is not guaranteed for intra-row scanning. So this is a known problem?
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168742#comment-15168742 ]

Anoop Sam John commented on HBASE-15340:
----------------------------------------

Not just intra row I would say. Even consider a normal Scan. We have writes also in parallel. A row 'r5' (say with only one cell in it) is inserted after the beginning of the scan. So if there is no region move in between, we won't see this row at all. The cell will get removed from the returned result by the seqId check against the readPnt. But if there is a region move in between, we may see it. So it is a question of consistency wrt results as well. Get my point? Just saying.. With intra row results (by setting batch on Scan / result chunking) this becomes a more visible issue.
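The 'r5' case can be shown with the same kind of toy model, again in plain Python rather than HBase code. `scan_rows` is a hypothetical helper, each row is reduced to the seqId of its single cell, and a region move is modelled as rescanning with the current read point.

```python
def scan_rows(rows, read_point):
    """Return row keys whose cell passes the seqId <= readPnt check."""
    return [key for key, seq_id in sorted(rows.items())
            if seq_id <= read_point]


rows = {"r1": 1, "r3": 2, "r7": 3}   # row key -> seqId of its single cell
read_point = 3                        # the scanner is opened here

rows["r5"] = 4                        # inserted after the scan began

# No region move: the original read point filters 'r5' out.
without_move = scan_rows(rows, read_point)

# Region move: the reopened scanner takes the current read point (4 here).
after_move = scan_rows(rows, read_point=4)
```

`without_move` contains only r1, r3 and r7, while `after_move` also contains r5, so whether the application sees the row depends on whether a move happened during the scan.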
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168736#comment-15168736 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

[~anoop.hbase], the intra-row scanning seems to come from [HBASE-1537|https://issues.apache.org/jira/browse/HBASE-1537], so versions after 0.90.0 will have this issue. I will make a patch following the idea and check the result:)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168727#comment-15168727 ]

ramkrishna.s.vasudevan commented on HBASE-15340:
------------------------------------------------

bq. When HBASE-15325 is resolved, there is no data miss, however, the returned data may combined from different row-level transactions which is unexpected for application.

Ya got it now.
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168702#comment-15168702 ]

Anoop Sam John commented on HBASE-15340:
----------------------------------------

And this is an issue in all versions of HBase I think. From day one we have this issue (?)
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168700#comment-15168700 ]

Anoop Sam John commented on HBASE-15340:
----------------------------------------

After seeing an issue around partial results during a region move yesterday, I was thinking about this .. And the solution you mentioned was the first that came to my mind as well :-) Ya, when the client recreates the scanner (because of NSRE or region moved), the ReadPoint MVCC handling gets broken, as the new Scanner will have a new readPnt.
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168678#comment-15168678 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

[~ram_krish], IMO this is a different problem, caused by a region move during a scan. Once [HBASE-15325|https://issues.apache.org/jira/browse/HBASE-15325] is resolved, no data will be missed; however, the returned data may be combined from different row-level transactions, which is unexpected for the application. I think we should also keep the READ_COMMITTED isolation level in this situation?

> Partial row result of scan may return data violates the row-level transaction
> ------------------------------------------------------------------------------
>
>              Key: HBASE-15340
>              URL: https://issues.apache.org/jira/browse/HBASE-15340
>          Project: HBase
>       Issue Type: Bug
>       Components: Scanners, Transactions/MVCC
> Affects Versions: 2.0.0
>         Reporter: Jianwei Cui
>
> There are cases where the region server will return a partial row result, such as
> when the client sets a batch for the scan or the configured size limit is reached.
> In these situations, the client may return data that violates the row-level
> transaction to the application. The following steps show the problem:
> {code}
> // assume there is a test table 'test_table' with one family 'F' and one region 'region'.
> // meanwhile there are two region servers 'rsA' and 'rsB'.
> 1. Let 'region' be located on 'rsA' first, and put one row with two columns
>    'c1' and 'c2' as:
>    > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1'
> 2. Start a client to scan 'test_table' with scan.setBatch(1) and
>    scan.setCaching(1). The client will get one column,
>    {column='F:c1', value='value1'}, in the first rpc call after the scanner is
>    created, and the result will be returned to the application.
> 3. Before the client issues the next request, 'region' is moved to 'rsB'
>    and accepts another mutation for the two columns 'c1' and 'c2' as:
>    > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2'
> 4. Then, the client will receive a RegionMovedException when issuing the next
>    request and will retry by opening a scanner on 'rsB'. The newly opened scanner
>    will have a higher mvcc than the old data, so it can read the column
>    {column='F:c2', value='value2'} and return the result to the application.
>    Therefore, the application will get data as:
>     'row'    column='F:c1', value='value1'
>     'row'    column='F:c2', value='value2'
>    The returned data is combined from two different mutations and violates
>    the row-level transaction.
> {code}
> The reason is that the scanner newly opened after the region move will get a
> different mvcc. I am not sure whether this result is by design for scans when
> partial row results are allowed. However, a row result combined from
> different transactions may leave the application in an unexpected state.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
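
The four steps in the issue description can be replayed in a small in-memory model. This is only a sketch, not HBase code: `TornRowDemo`, `Region`, `Cell`, `putRow`, `openScanner` and `read` are hypothetical names, and `seqId` stands in for the mvcc write number of a mutation. It assumes a newly opened scanner always reads at the latest completed write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not HBase code) of a scan that is reopened after a region move. */
class TornRowDemo {
    /** One versioned cell; seqId stands in for the mvcc number of its mutation. */
    static final class Cell {
        final String column;
        final String value;
        final long seqId;

        Cell(String column, String value, long seqId) {
            this.column = column;
            this.value = value;
            this.seqId = seqId;
        }
    }

    /** Hypothetical in-memory region that keeps every cell version. */
    static final class Region {
        private final List<Cell> cells = new ArrayList<>();
        private long lastSeqId = 0;

        /** Writes both columns under one sequence id, i.e. one row-level transaction. */
        void putRow(String value) {
            lastSeqId++;
            cells.add(new Cell("F:c1", value, lastSeqId));
            cells.add(new Cell("F:c2", value, lastSeqId));
        }

        /** Assumption: a newly opened scanner reads at the latest completed write. */
        long openScanner() {
            return lastSeqId;
        }

        /** Returns the newest version of the column visible at readPoint, or null. */
        Cell read(String column, long readPoint) {
            Cell best = null;
            for (Cell c : cells) {
                if (c.column.equals(column) && c.seqId <= readPoint
                        && (best == null || c.seqId > best.seqId)) {
                    best = c;
                }
            }
            return best;
        }
    }

    /** Replays steps 1-4 and returns the two cells handed to the application. */
    static List<Cell> scanAcrossMove() {
        Region region = new Region();
        region.putRow("value1");                           // step 1
        long readPointOnRsA = region.openScanner();        // step 2: scanner on rsA
        Cell first = region.read("F:c1", readPointOnRsA);  // batch of one column
        region.putRow("value2");                           // step 3: put after the move
        long readPointOnRsB = region.openScanner();        // step 4: scanner reopened on rsB
        Cell second = region.read("F:c2", readPointOnRsB);
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        for (Cell c : scanAcrossMove()) {
            System.out.println("'row' column='" + c.column + "', value='" + c.value
                    + "' (seqId=" + c.seqId + ")");
        }
    }
}
```

In this model the two returned cells carry different sequence ids, which is the mixed-transaction row the issue describes.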
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168642#comment-15168642 ]

ramkrishna.s.vasudevan commented on HBASE-15340:
------------------------------------------------

Is this the same as https://issues.apache.org/jira/browse/HBASE-15325? That issue also talks about partial row results when the region moves.
[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction
[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168630#comment-15168630 ]

Jianwei Cui commented on HBASE-15340:
-------------------------------------

A direct solution is to make ClientScanner record the readPoint when the scanner for the region is first opened; scanners opened later for the same region then use the same readPoint if a RegionMovedException happens. Any suggestions?
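
The proposal to record the readPoint in ClientScanner can be illustrated with the same kind of toy model. This is a sketch under stated assumptions, not the actual ClientScanner: `PinnedReadPointDemo`, `Region`, `Cell` and `PinningScanner` are hypothetical names, and the toy region keeps every cell version and accepts a read point supplied by the client, which a real region server may not do (for example after a major compaction).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not HBase code) of a client that reuses its first read point. */
class PinnedReadPointDemo {
    /** One versioned cell; seqId stands in for the mvcc number of its mutation. */
    static final class Cell {
        final String column;
        final String value;
        final long seqId;

        Cell(String column, String value, long seqId) {
            this.column = column;
            this.value = value;
            this.seqId = seqId;
        }
    }

    /** Hypothetical in-memory region that keeps every cell version. */
    static final class Region {
        private final List<Cell> cells = new ArrayList<>();
        private long lastSeqId = 0;

        /** Writes both columns under one sequence id, i.e. one row-level transaction. */
        void putRow(String value) {
            lastSeqId++;
            cells.add(new Cell("F:c1", value, lastSeqId));
            cells.add(new Cell("F:c2", value, lastSeqId));
        }

        /** Read point the server would hand to a newly opened scanner. */
        long openScanner() {
            return lastSeqId;
        }

        /** Returns the newest version of the column visible at readPoint, or null. */
        Cell read(String column, long readPoint) {
            Cell best = null;
            for (Cell c : cells) {
                if (c.column.equals(column) && c.seqId <= readPoint
                        && (best == null || c.seqId > best.seqId)) {
                    best = c;
                }
            }
            return best;
        }
    }

    /** Hypothetical client-side scanner that remembers the read point of its first open. */
    static final class PinningScanner {
        private long pinnedReadPoint = -1; // -1 means "never opened"

        /** Opens or reopens a scanner; only the first open decides the read point. */
        long open(Region region) {
            long serverReadPoint = region.openScanner();
            if (pinnedReadPoint < 0) {
                pinnedReadPoint = serverReadPoint;
            }
            return pinnedReadPoint;
        }
    }

    /** Same interleaving as in the issue, but the reopened scanner reuses the read point. */
    static List<Cell> scanAcrossMove() {
        Region region = new Region();
        PinningScanner scanner = new PinningScanner();
        region.putRow("value1");
        Cell first = region.read("F:c1", scanner.open(region));  // first open pins the read point
        region.putRow("value2");                                 // put after the region move
        Cell second = region.read("F:c2", scanner.open(region)); // reopen reuses the pinned value
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        for (Cell c : scanAcrossMove()) {
            System.out.println("'row' column='" + c.column + "', value='" + c.value
                    + "' (seqId=" + c.seqId + ")");
        }
    }
}
```

With the pinned read point, both cells in this model come from the first put, so the row stays consistent as long as the older versions are still available on the server.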