[ 
https://issues.apache.org/jira/browse/PHOENIX-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754428#comment-17754428
 ] 

Istvan Toth edited comment on PHOENIX-6673 at 8/15/23 6:13 AM:
---------------------------------------------------------------

I agree, your second test case above implies that we're not always taking into 
account the trailing variable length fields of the PK when calculating the 
split point, and this causes the problem.

If you are up to it, please add a test case for this issue (you can use my 
attached patch as a starting point), and fix the split generation to take 
all PK fields into account.

Ideally, we would somehow add this logic to the split policy, so that we also 
close the possibility of specifying a bad split point, but this is secondary to 
fixing SchemaUtil.processSplits().


was (Author: stoty):
I agree, your second test case above implies that we're not always taking into 
account the unspecified rows - at least trailing variable length fields - of 
the PK when calculating the split point, and this causes the problem.

If you are up to it, please add a test case for this issue (you can use my 
attached patch as a starting point), and fix the split generation to take 
all PK fields into account.

Ideally, we would somehow add this logic to the split policy, so that we also 
close the possibility of specifying a bad split point, but this is secondary to 
fixing SchemaUtil.processSplits().

> Local indexing broken by manually splitting table at arbitrary point
> --------------------------------------------------------------------
>
>                 Key: PHOENIX-6673
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6673
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.2.0
>            Reporter: Istvan Toth
>            Priority: Major
>         Attachments: PHOENIX-6673-repro.patch
>
>
> While working on PHOENIX-6587, I found that splitting tables with local 
> indexes at certain points breaks the local indexing code and leads to 
> incorrect query results.
> When a table is pre-split by Phoenix, or automatically split by HBase, then 
> split points always have a minimum length that is equal to the possible 
> minimum length of the table rowkey. 
> The automatic split always happens at an existing rowkey, and 
> SchemaUtil.processSplits() has code that approximates the same behaviour for 
> pre-split tables.
> However, it is still possible to split the table manually from HBase at 
> points that do not satisfy the above requirement, which breaks local indexing.
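
The minimum-length requirement described above can be illustrated with a 
small sketch. This is not the actual SchemaUtil.processSplits() code: the 
class name, the method names and the minRowKeyLength parameter are all 
hypothetical, and the sketch assumes the minimum rowkey length has already 
been computed from the PK columns. It only shows the general idea of 
zero-padding a short split point up to that length.

```java
import java.util.Arrays;

public class SplitPointPadding {

    // Hypothetical check: is the split point at least as long as the
    // shortest rowkey the table can contain?
    static boolean satisfiesMinLength(byte[] splitPoint, int minRowKeyLength) {
        return splitPoint.length >= minRowKeyLength;
    }

    // Hypothetical helper: pad a too-short split point with trailing zero
    // bytes. An empty key is left alone, since HBase uses the empty key to
    // mark the start/end of the table.
    static byte[] padToMinLength(byte[] splitPoint, int minRowKeyLength) {
        if (splitPoint.length == 0 || splitPoint.length >= minRowKeyLength) {
            return splitPoint;
        }
        // Arrays.copyOf fills the added positions with zero bytes.
        return Arrays.copyOf(splitPoint, minRowKeyLength);
    }

    public static void main(String[] args) {
        byte[] shortSplit = {0x41};            // one byte, shorter than the minimum
        int minRowKeyLength = 4;               // assumed value for illustration
        byte[] padded = padToMinLength(shortSplit, minRowKeyLength);
        System.out.println(satisfiesMinLength(shortSplit, minRowKeyLength));
        System.out.println(Arrays.toString(padded));
    }
}
```

A manual split issued directly from HBase would bypass any such padding, 
which is why the comment above suggests also enforcing the rule in the split 
policy.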


