[ https://issues.apache.org/jira/browse/PHOENIX-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Poonia updated PHOENIX-4906:
---------------------------------
    Summary: Abnormal query result due to merging regions of a salted table  (was: Abnormal query result due to Phoenix plan error)

> Abnormal query result due to merging regions of a salted table
> --------------------------------------------------------------
>
>                 Key: PHOENIX-4906
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4906
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0, 4.14.0
>            Reporter: JeongMin Ju
>            Priority: Critical
>         Attachments: SaltingWithRegionMergeIT.java, ScanRanges_intersectScan.png, TestSaltingWithRegionMerge.java, initial_salting_region.png, merged-region.png
>
>
> For a salted table, a query over the entire data set gets a different plan depending on the form of the query, and as a result some queries return wrong data.
> {code:java}
> // The schema of the table I actually used is different, but please ignore that.
> create table if not exists test.test_table (
>    rk1 varchar not null,
>    rk2 varchar not null,
>    column1 varchar
>    constraint pk primary key (rk1, rk2)
> )
> ...
> SALT_BUCKETS=16...
> ;
> {code}
>
> I created a table with 16 salt regions and then wrote a lot of data.
> HBase automatically split the regions, and I merged regions manually to balance data between the region servers.
> Then, when I run the queries below, a different plan is created depending on the WHERE clause.
> * query1
> select count\(*) from test.test_table;
> {code:java}
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | PLAN                                                                                                  | EST_BYTES_READ  | EST_ROWS_READ  |
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | CLIENT 1851-CHUNK 5005959292 ROWS 1944546675532 BYTES PARALLEL 11-WAY FULL SCAN OVER TEST:TEST_TABLE  | 1944546675532   | 5005959292     |
> | SERVER FILTER BY FIRST KEY ONLY                                                                       | 1944546675532   | 5005959292     |
> | SERVER AGGREGATE INTO SINGLE ROW                                                                      | 1944546675532   | 5005959292     |
> +-------------------------------------------------------------------------------------------------------+-----------------+----------------+
> {code}
> * query2
> select count\(*) from test.test_table where rk2 = 'aa';
> {code}
> +-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | PLAN                                                                                                              | EST_BYTES_READ  | EST_ROWS_READ  |
> +-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> | CLIENT 1846-CHUNK 4992196444 ROWS 1939177965768 BYTES PARALLEL 11-WAY RANGE SCAN OVER TEST:TEST_TABLE [0] - [15]  | 1939177965768   | 4992196444     |
> | SERVER FILTER BY FIRST KEY ONLY AND RK2 = 'aa'                                                                    | 1939177965768   | 4992196444     |
> | SERVER AGGREGATE INTO SINGLE ROW                                                                                  | 1939177965768   | 4992196444     |
> +-------------------------------------------------------------------------------------------------------------------+-----------------+----------------+
> {code}
> Since rk2, used in the WHERE clause of query2, is the second column of the PK, query2 must be a full scan like query1.
> However, as you can see, query2 is planned as a range scan, and it generates five fewer chunks than query1 (1846 vs. 1851).
> I added logging and printed the start key and end key of each Scan object generated by the plan, and found that five chunks are missing from query2.
> All five missing chunks were in regions where the originally created salt region boundary was not preserved through the merge operation.
> !initial_salting_region.png!
> After merging regions:
> !merged-region.png!
> The code that causes the problem is the following.
> When a select query is executed, the [org.apache.phoenix.iterate.BaseResultIterators#getParallelScans|https://github.com/apache/phoenix/blob/v4.11.0-HBase-1.2/phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java#L743-L744] method creates Scan objects based on the GuidePosts in the statistics table. When a GuidePost range contains a region boundary, it is split into two Scan objects. The code used here is [org.apache.phoenix.compile.ScanRanges#intersectScan|https://github.com/apache/phoenix/blob/v4.11.0-HBase-1.2/phoenix-core/src/main/java/org/apache/phoenix/compile/ScanRanges.java#L299-L303].
> !ScanRanges_intersectScan.png!
> For a salted table, this code compares keys using the remainder after stripping the salt (prefix) byte.
> I cannot tell whether this behavior is a bug or intended.
> In this case I merged the regions manually, but the same situation is likely to arise through HBase's Normalizer.
> I hope other users have not merged regions manually or set the table property NORMALIZATION_ENABLED to true in their production clusters. If you have, check whether the initial salt region boundaries are still intact. If a boundary value has disappeared, you are seeing wrong data.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
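
The boundary check suggested at the end of the report can be sketched roughly as follows. This is a minimal illustration, not code from Phoenix or from the attached tests: the class `SaltBoundaryCheck` and the method `missingSaltBoundaries` are hypothetical names, and the sketch assumes the table was originally pre-split on the one-byte keys 0x01 through SALT_BUCKETS - 1. In a real cluster the region start keys would come from HBase (for example via `RegionLocator.getStartKeys()`); here they are hard-coded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Phoenix or HBase.
final class SaltBoundaryCheck {

    // Returns every salt byte in 1 .. saltBuckets-1 for which no region starts
    // exactly at the one-byte key {salt}.
    // Assumption: the salted table was initially pre-split on those one-byte keys.
    public static List<Integer> missingSaltBoundaries(List<byte[]> regionStartKeys, int saltBuckets) {
        List<Integer> missing = new ArrayList<>();
        for (int salt = 1; salt < saltBuckets; salt++) {
            byte[] expected = new byte[] {(byte) salt};
            boolean found = false;
            for (byte[] startKey : regionStartKeys) {
                if (Arrays.equals(startKey, expected)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(salt);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up start keys for a 16-bucket table.
        List<byte[]> startKeys = new ArrayList<>();
        startKeys.add(new byte[0]); // first region has an empty start key
        for (int salt = 1; salt < 16; salt++) {
            if (salt != 3) {
                startKeys.add(new byte[] {(byte) salt});
            }
        }
        // A region created by a split inside bucket 2 and later merged across
        // the start of bucket 3, so the {0x03} boundary no longer exists.
        startKeys.add(new byte[] {2, 'm'});

        System.out.println(missingSaltBoundaries(startKeys, 16)); // prints [3]
    }
}
```

An empty result would mean every initial salt boundary is still a region start key. Any salt byte in the result marks a region that now spans two salt buckets, which is the situation the report associates with the missing chunks.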