[
https://issues.apache.org/jira/browse/PHOENIX-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089124#comment-17089124
]
Daniel Wong edited comment on PHOENIX-5833 at 4/21/20, 11:01 PM:
-----------------------------------------------------------------
h3. Error Explanation
So the error was in the separator but not because it was missing.
Prior to the consideration of the andExpresion we have the following 2 key
ranges.
Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *)
Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_'
Essentially as part of the AndExpression we generate the 2 key ranges.
Span A,B -> ('abc\x00def' - *)
Span C -> ['RRS_' - 'RRS'')
When writing the final combined minRange in ScanRanges we start by writing,
'abc\x00def'
Note it is not the final key so we append the asc separator.
'abc\x00def\x00'
Then finally we note that it is exclusive so we call next byte
'abc\x00def\x01'
Here we already now have a key range that will no longer select the row we
intended whose PK was 'abc\x00def\x00RRS_ZYTDT'
h3. Fix
Short term. I think when we truncate a range in the andExpression key
extraction we may need to switch from exclusive to inclusive, as the bound that
was exclusive was covering all 3 keys. The inclusiveness cannot apply to only
2 of those 3 keys.
was (Author: dbwong):
h3. Error Explanation
So the error was in the separator but not because it was missing.
Prior to the consideration of the andExpresion we have the following 2 key
ranges.
Span A,B,C -> ('abc\x00def\x00RRSQ_IMKKL' - *)
Span C -> ['RRS_' - 'RRS'') //note ' is the character after '_'
Essentially as part of the AndExpression we generate the 2 key ranges.
Span A,B -> ('abc\x00def' - *)
Span C -> ['RRS_' - 'RRS'')
When writing the final combined minRange in ScanRanges we start by writing,
'abc\x00def'
Note it is not the final key so we append the asc separator.
'abc\x00def\x00'
Then finally we note that it is exclusive so we call next byte
'abc\x00def\x01'
Here we already now have a key range that will no longer select the row we
intended whose PK was 'abc\x00def\x00RRS_ZYTDT'
h3. Fix
Short term. I think we we truncate a range in the andExpression key extraction
we may need to switch from exclusive to inclusive, as the bound that was
exclusive was covering all 3 keys. The inclusiveness cannot apply to only 2 of
those 3 keys.
> Incorrect results with LIKE operator
> ------------------------------------
>
> Key: PHOENIX-5833
> URL: https://issues.apache.org/jira/browse/PHOENIX-5833
> Project: Phoenix
> Issue Type: Bug
> Components: core
> Affects Versions: 4.15.0
> Reporter: Bharath Vissapragada
> Priority: Critical
>
> Phoenix version: 4.15-HBase-1.5
> -- Create a test table and populate a couple of rows.
> {noformat}
> create table repro_bug(a varchar(10) not null, b varchar(10) not null, c
> varchar(10) not null constraint pk primary key(a, b, c));
> upsert into repro_bug values('abc', 'def', 'RRSQ_IMKKL');
> upsert into repro_bug values('abc', 'def', 'RRS_ZYTDT');
> select * from repro_bug;
> +------+------+-------------+
> | A | B | C |
> +------+------+-------------+
> | abc | def | RRSQ_IMKKL |
> | abc | def | RRS_ZYTDT |
> +------+------+-------------+
> {noformat}
> -- Query 1 - Look for rows where C has a certain prefix - Returns correct
> result
> {noformat}
> select A, B, C from REPRO_BUG where C like 'RRS\\_%';
> +------+------+------------+
> | A | B | C |
> +------+------+------------+
> | abc | def | RRS_ZYTDT |
> +------+------+------------+
> {noformat}
> -- Query 2 - Look for rows where (a, b, c) > first row - Returns correct
> result
> {noformat}
> select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL')
> +------+------+------------+
> | A | B | C |
> +------+------+------------+
> | abc | def | RRS_ZYTDT |
> +------+------+------------+
> {noformat}
> -- Query 3 - Combine the filters from Query 1 and Query2 - Returns incorrect
> result.. Ideally it should return the same row as above.
> {noformat}
> select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def', 'RRSQ_IMKKL')
> AND C like 'RRS\\_%';
> +----+----+----+
> | A | B | C |
> +----+----+----+
> +----+----+----+
> {noformat}
> -- Explain for the above incase someone is interested.
> {noformat}
> explain select A, B, C from REPRO_BUG where (A, B, C) > ('abc', 'def',
> 'RRSQ_IMKKL') AND C like 'RRS\\_%';
> +----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
> | PLAN
> | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
> +----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER REPRO_BUG
> ['abcdef'] - [*] | null | null | null |
> | SERVER FILTER BY FIRST KEY ONLY AND C LIKE 'RRS\_%'
> | null | null | null |
> +----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
> 2 rows selected (0.003 seconds)
> {noformat}
> I'm trying to poke around in the code to figure out the issue but my
> understanding of Phoenix is limited at this point. So creating a bug report
> incase someone can figure this out quickly.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)