[
https://issues.apache.org/jira/browse/PHOENIX-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Poonia updated PHOENIX-4906:
---------------------------------
Summary: Abnormal query result due to merging regions of a salted table
(was: Abnormal query result due to Phoenix plan error)
> Abnormal query result due to merging regions of a salted table
> --------------------------------------------------------------
>
> Key: PHOENIX-4906
> URL: https://issues.apache.org/jira/browse/PHOENIX-4906
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0, 4.14.0
> Reporter: JeongMin Ju
> Priority: Critical
> Attachments: SaltingWithRegionMergeIT.java,
> ScanRanges_intersectScan.png, TestSaltingWithRegionMerge.java,
> initial_salting_region.png, merged-region.png
>
>
> For a salted table, a query over the entire data set gets a different plan
> depending on the form of the query, and as a result some queries return
> incorrect data.
> {code:java}
> // The actual schema of my table is different, but please ignore that.
> create table if not exists test.test_table (
> rk1 varchar not null,
> rk2 varchar not null,
> column1 varchar
> constraint pk primary key (rk1, rk2)
> )
> ...
> SALT_BUCKETS=16...
> ;
> {code}
>
> I created a table with 16 salted regions and then wrote a lot of data.
> HBase automatically split the regions, and I then merged regions to balance
> data between the region servers.
> After that, running the queries below shows that a different plan is
> created depending on the WHERE clause.
> * query1
> select count(*) from test.test_table;
> {code:java}
> +------------------------------------------------------------------------------------------------------+----------------+---------------+
> | PLAN                                                                                                 | EST_BYTES_READ | EST_ROWS_READ |
> +------------------------------------------------------------------------------------------------------+----------------+---------------+
> | CLIENT 1851-CHUNK 5005959292 ROWS 1944546675532 BYTES PARALLEL 11-WAY FULL SCAN OVER TEST:TEST_TABLE | 1944546675532  | 5005959292    |
> | SERVER FILTER BY FIRST KEY ONLY                                                                      | 1944546675532  | 5005959292    |
> | SERVER AGGREGATE INTO SINGLE ROW                                                                     | 1944546675532  | 5005959292    |
> +------------------------------------------------------------------------------------------------------+----------------+---------------+
> {code}
> * query2
> select count(*) from test.test_table where rk2 = 'aa';
> {code}
> +------------------------------------------------------------------------------------------------------------------+----------------+---------------+
> | PLAN                                                                                                             | EST_BYTES_READ | EST_ROWS_READ |
> +------------------------------------------------------------------------------------------------------------------+----------------+---------------+
> | CLIENT 1846-CHUNK 4992196444 ROWS 1939177965768 BYTES PARALLEL 11-WAY RANGE SCAN OVER TEST:TEST_TABLE [0] - [15] | 1939177965768  | 4992196444    |
> | SERVER FILTER BY FIRST KEY ONLY AND RK2 = 'aa'                                                                   | 1939177965768  | 4992196444    |
> | SERVER AGGREGATE INTO SINGLE ROW                                                                                 | 1939177965768  | 4992196444    |
> +------------------------------------------------------------------------------------------------------------------+----------------+---------------+
> {code}
> Since rk2, used in the WHERE clause of query2, is the second column of the
> PK, query2 should be a full scan like query1.
> However, as you can see, query2 is planned as a range scan and generates
> five fewer chunks than query1 (1846 vs. 1851).
> I added logging and printed the start key and end key of each Scan object
> generated by the plan, and found that query2 is missing 5 chunks.
> All five missing chunks are in regions where the originally generated
> region boundary was not preserved by the merge operation.
> !initial_salting_region.png!
> After merging regions
> !merged-region.png!
> The code that causes the problem is the following.
> When a select query is executed, the
> [org.apache.phoenix.iterate.BaseResultIterators#getParallelScans|https://github.com/apache/phoenix/blob/v4.11.0-HBase-1.2/phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java#L743-L744]
> method creates a Scan object based on the GuidePost in the statistics table.
> In the case of a GuidePost that contains a region boundary, it is split into
> two Scan objects. The code used here is
> [org.apache.phoenix.compile.ScanRanges#intersectScan|https://github.com/apache/phoenix/blob/v4.11.0-HBase-1.2/phoenix-core/src/main/java/org/apache/phoenix/compile/ScanRanges.java#L299-L303].
> !ScanRanges_intersectScan.png!
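
To make the splitting step described above concrete, here is a minimal sketch of how guideposts and region boundaries could be combined into scan chunks. It is not the Phoenix implementation: keys are plain strings for readability, an empty string stands for the open start or end of the table, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not taken from BaseResultIterators.
final class ChunkSplitSketch {

    // Returns [start, end) pairs. Every guidepost and every region boundary
    // becomes a cut point, so a guidepost interval that contains a region
    // boundary ends up as two chunks.
    static List<String[]> chunks(List<String> guidePosts, List<String> regionBoundaries) {
        TreeSet<String> cuts = new TreeSet<>(guidePosts);
        cuts.addAll(regionBoundaries);

        List<String[]> out = new ArrayList<>();
        String prev = ""; // "" = start of table
        for (String cut : cuts) {
            if (cut.isEmpty()) {
                continue; // ignore the open start key of the first region
            }
            out.add(new String[] {prev, cut});
            prev = cut;
        }
        out.add(new String[] {prev, ""}); // "" = end of table
        return out;
    }

    public static void main(String[] args) {
        // Guideposts at d, h, p; one region boundary at k (inside h..p).
        for (String[] c : chunks(Arrays.asList("d", "h", "p"), Arrays.asList("k"))) {
            System.out.println("[" + c[0] + ", " + c[1] + ")");
        }
    }
}
```

In this toy input the guidepost interval from `h` to `p` contains the region boundary `k`, so it is emitted as two chunks, one per region.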
> For a salted table, this code strips the salt (prefix) byte and compares
> only the remainder of the key.
> I am not sure whether this behavior is a bug or intended.
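
The following toy example shows one way a comparison on salt-stripped keys could go wrong once a region spans two salt buckets. It is only a guess at the mechanism, not the code in `ScanRanges#intersectScan`: the class and method names are hypothetical, the salt is assumed to be the first byte of the row key, and both keys are assumed to be non-empty.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from ScanRanges.
final class SaltStripSketch {

    // Lexicographic comparison of unsigned bytes, the order HBase uses for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Drops the leading salt byte (assumption: salt occupies exactly one byte).
    static byte[] stripSalt(byte[] key) {
        return Arrays.copyOfRange(key, 1, key.length);
    }

    // A range [start, stop) looks empty when start >= stop.
    static boolean looksEmptyWithSalt(byte[] start, byte[] stop) {
        return compare(start, stop) >= 0;
    }

    static boolean looksEmptyAfterStrip(byte[] start, byte[] stop) {
        return compare(stripSalt(start), stripSalt(stop)) >= 0;
    }

    public static void main(String[] args) {
        // A merged region that starts in bucket 2 and ends in bucket 3.
        byte[] start = {2, (byte) 'm'};
        byte[] stop = {3, (byte) 'c'};
        System.out.println("empty with salt:   " + looksEmptyWithSalt(start, stop));
        System.out.println("empty after strip: " + looksEmptyAfterStrip(start, stop));
    }
}
```

With the salt byte in place the range is ordered correctly, but after stripping it the start remainder sorts after the stop remainder, so the range appears empty. If a scan were dropped on that basis, it would match the missing chunks described above.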
> In this case I merged the regions manually, but the same situation is likely
> to occur through HBase's region normalizer.
> I hope other users have not merged regions manually or set the table
> property NORMALIZATION_ENABLED to true in their production clusters. If you
> have, check whether the initial salted region boundaries are still intact.
> If a boundary has disappeared, you are likely seeing wrong data.
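
The boundary check suggested above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the region start keys are collected separately (for example from HBase's `RegionLocator.getStartKeys()`), the salt is the first byte of the row key, and the original boundary for bucket `i` is a key whose first byte is `i` and whose remaining bytes, if any, are all zero. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Phoenix or HBase.
final class SaltBoundaryCheck {

    // Assumption: a bucket boundary is the salt byte followed only by zero bytes (or nothing).
    static boolean isBucketBoundary(byte[] key, int bucket) {
        if (key.length == 0 || (key[0] & 0xff) != bucket) {
            return false;
        }
        for (int i = 1; i < key.length; i++) {
            if (key[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Returns the buckets (1 .. saltBuckets-1) for which no region starts exactly at the bucket boundary.
    static List<Integer> missingBoundaries(List<byte[]> regionStartKeys, int saltBuckets) {
        List<Integer> missing = new ArrayList<>();
        for (int bucket = 1; bucket < saltBuckets; bucket++) {
            boolean found = false;
            for (byte[] key : regionStartKeys) {
                if (isBucketBoundary(key, bucket)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(bucket);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Toy table with 4 salt buckets where the boundary for bucket 2 was lost in a merge.
        List<byte[]> startKeys = Arrays.asList(
                new byte[] {}, new byte[] {1}, new byte[] {1, (byte) 'm'}, new byte[] {3});
        System.out.println("missing bucket boundaries: " + missingBoundaries(startKeys, 4));
    }
}
```

A non-empty result means at least one region spans more than one salt bucket, which is the situation in which the missing chunks were observed.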
--
This message was sent by Atlassian Jira
(v8.20.10#820010)