[
https://issues.apache.org/jira/browse/PHOENIX-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705309#comment-17705309
]
Istvan Toth edited comment on PHOENIX-6916 at 3/27/23 6:11 PM:
---------------------------------------------------------------
-This in fact is two different issues.-
-The Date issue is that we truncate the trailing zeros from the range key for
fixed length desc fields.-
The other, more fundamental issue that the reverse rowkey logic doesn't seem to
work for variable length fields, and it has probably been always broken. (it is
broken as far back as 4.2, I haven't checked even older releases)
was (Author: stoty):
This in fact is two different issues.
The Date issue is that we truncate the trailing zeros from the range key for
fixed length desc fields.
The other, more fundamental issue that the reverse rowkey logic doesn't seem to
work for variable length fields, and it has probably been always broken. (it is
broken as far back as 4.2, I haven't checked even older releases)
> Degenerate scan generated for timestamp with desc global index
> --------------------------------------------------------------
>
> Key: PHOENIX-6916
> URL: https://issues.apache.org/jira/browse/PHOENIX-6916
> Project: Phoenix
> Issue Type: Bug
> Reporter: Istvan Toth
> Assignee: Istvan Toth
> Priority: Critical
>
> For some non-empty timestamp ranges, Phoenix generates a query plan with a
> degenerate scan if DESC index is created on the timestamp field:
> {noformat}
> create table data (id varchar primary key, ts timestamp);
> create index data_idx on data(ts desc);
> 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP
> '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00';
> +-------------------------------+----------------+---------------+-------------+
> | PLAN | EST_BYTES_READ | EST_ROWS_READ |
> EST_INFO_TS |
> +-------------------------------+----------------+---------------+-------------+
> | DEGENERATE SCAN OVER DATA_IDX | null | null | null
> |
> +-------------------------------+----------------+---------------+-------------+
> 1 row selected (0.012 seconds)
> 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP
> '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:50:00';
> +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> | PLAN
> | EST_BYTES_READ | EST_ROWS_READ |
> EST_INFO_TS |
> +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DATA_IDX
> [~1,677,160,200,000] - [~1,677,159,000,000] | null | null
> | null |
> | SERVER FILTER BY FIRST KEY ONLY
> | null | null | null
> |
> +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> 2 rows selected (0.009 seconds)
> 0: jdbc:phoenix:localhost:59231> explain select /*+NO_INDEX*/ id, ts from
> data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP
> '2023-02-23 13:40:00';
> +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> | PLAN
> | EST_BYTES_READ | EST_ROWS_READ |
> EST_INFO_TS |
> +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER DATA
> | null | null | null
> |
> | SERVER FILTER BY (TS >= TIMESTAMP '2023-02-23 13:30:00.000' AND TS <
> TIMESTAMP '2023-02-23 13:40:00.000') | null | null | null
> |
> +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+
> {noformat}
> Actually, the problem is much more basic:
> {noformat}
> create table ascpk (k varchar primary key);
> create table descpk (k varchar primary key desc);
> upsert into ascpk values ('a');
> upsert into ascpk values ('aa');
> upsert into ascpk values ('aaa');
> upsert into ascpk values ('aaab');
> //Same for descpk
> select * from ascpk;
> +------+
> | K |
> +------+
> | a |
> | aa |
> | aaa |
> | aaab |
> +------+
> 4 rows selected (0.035 seconds)
> select * from descpk;
> +------+
> | K |
> +------+
> | aaab |
> | aaa |
> | aa |
> | a |
> +------+
> select * from ascpk where k between 'a' and 'aaa';
> +-----+
> | K |
> +-----+
> | a |
> | aa |
> | aaa |
> +-----+
> 3 rows selected (0.026 seconds) select * from descpk where k between 'a' and
> 'aaa';
> +---+
> | K |
> +---+
> +---+
> No rows selected (0.022 seconds) {noformat}
> The inversion logic for descending keys completely breaks down when we try to
> use it for keyranges where end key is a prefix of the start key.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)