[ https://issues.apache.org/jira/browse/PHOENIX-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Istvan Toth updated PHOENIX-6916: --------------------------------- Description: For some non-empty timestamp ranges, Phoenix generates a query plan with a degenerate scan if DESC index is created on the timestamp field: {noformat} create table data (id varchar primary key, ts timestamp); create index data_idx on data(ts desc); 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00'; +-------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +-------------------------------+----------------+---------------+-------------+ | DEGENERATE SCAN OVER DATA_IDX | null | null | null | +-------------------------------+----------------+---------------+-------------+ 1 row selected (0.012 seconds) 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:50:00'; +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DATA_IDX [~1,677,160,200,000] - [~1,677,159,000,000] | null | null | null | | SERVER FILTER BY FIRST KEY ONLY | null | null | null | +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ 2 rows selected (0.009 seconds) 0: jdbc:phoenix:localhost:59231> explain select /*+NO_INDEX*/ id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00'; +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER DATA | null | null | null | | SERVER FILTER BY (TS >= TIMESTAMP '2023-02-23 13:30:00.000' AND TS < TIMESTAMP '2023-02-23 13:40:00.000') | null | null | null | +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ {noformat} Actually, the problem is much more basic: {noformat} create table ascpk (k varchar primary key); create table descpk (k varchar primary key desc); upsert into ascpk values ('a'); upsert into ascpk values ('aa'); upsert into ascpk values ('aaa'); upsert into ascpk values ('aaab'); //Same for descpk select * from ascpk; +------+ | K | +------+ | a | | aa | | aaa | | aaab | +------+ 4 rows selected (0.035 seconds) select * from descpk; +------+ | K | +------+ | aaab | | aaa | | aa | | a | +------+ select * from ascpk where k between 'a' and 'aaa'; +-----+ | K | +-----+ | a | | aa | | aaa | +-----+ 3 rows selected (0.026 seconds) select * from descpk where k between 'a' and 'aaa'; +---+ | K | +---+ +---+ No rows selected (0.022 seconds) {noformat} The inversion logic for descending keys completely breaks down when we try to use it for keyranges where start key is a prefix of the end key. was: For some non-empty timestamp ranges, Phoenix generates a query plan with a degenerate scan if DESC index is created on the timestamp field: {noformat} create table data (id varchar primary key, ts timestamp); create index data_idx on data(ts desc); 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00'; +-------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +-------------------------------+----------------+---------------+-------------+ | DEGENERATE SCAN OVER DATA_IDX | null | null | null | +-------------------------------+----------------+---------------+-------------+ 1 row selected (0.012 seconds) 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:50:00'; +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DATA_IDX [~1,677,160,200,000] - [~1,677,159,000,000] | null | null | null | | SERVER FILTER BY FIRST KEY ONLY | null | null | null | +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ 2 rows selected (0.009 seconds) 0: jdbc:phoenix:localhost:59231> explain select /*+NO_INDEX*/ id, ts from data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00'; +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS | +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER DATA | null | null | null | | SERVER FILTER BY (TS >= TIMESTAMP '2023-02-23 13:30:00.000' AND TS < TIMESTAMP '2023-02-23 13:40:00.000') | null | null | null | +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ {noformat} Actually, the problem is much more basic: {noformat} create table ascpk (k varchar primary key); create table descpk (k varchar primary key desc); upsert into ascpk values ('a'); upsert into ascpk values ('aa'); upsert into ascpk values ('aaa'); upsert into ascpk values ('aaab'); //Same for descpk select * from ascpk; +------+ | K | +------+ | a | | aa | | aaa | | aaab | +------+ 4 rows selected (0.035 seconds) select * from descpk; +------+ | K | +------+ | aaab | | aaa | | aa | | a | +------+ select * from ascpk where k between 'a' and 'aaa'; +-----+ | K | +-----+ | a | | aa | | aaa | +-----+ 3 rows selected (0.026 seconds) select * from descpk where k between 'a' and 'aaa'; +---+ | K | +---+ +---+ No rows selected (0.022 seconds) {noformat} The inversion logic for descending keys completely breaks down when we try to use it for keyranges where end key is a prefix of the start key. > Cannot handle ranges where start is a prefix of end for desc columns > -------------------------------------------------------------------- > > Key: PHOENIX-6916 > URL: https://issues.apache.org/jira/browse/PHOENIX-6916 > Project: Phoenix > Issue Type: Bug > Reporter: Istvan Toth > Assignee: Istvan Toth > Priority: Critical > > For some non-empty timestamp ranges, Phoenix generates a query plan with a > degenerate scan if DESC index is created on the timestamp field: > {noformat} > create table data (id varchar primary key, ts timestamp); > create index data_idx on data(ts desc); > 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP > '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:40:00'; > +-------------------------------+----------------+---------------+-------------+ > | PLAN | EST_BYTES_READ | EST_ROWS_READ | > EST_INFO_TS | > +-------------------------------+----------------+---------------+-------------+ > | DEGENERATE SCAN OVER DATA_IDX | null | null | null > | > +-------------------------------+----------------+---------------+-------------+ > 1 row selected (0.012 seconds) > 0: jdbc:phoenix:> explain select id, ts from data where ts >= TIMESTAMP > '2023-02-23 13:30:00' and ts < TIMESTAMP '2023-02-23 13:50:00'; > +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | > EST_INFO_TS | > +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER DATA_IDX > [~1,677,160,200,000] - [~1,677,159,000,000] | null | null > | null | > | SERVER FILTER BY FIRST KEY ONLY > | null | null | null > | > +----------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > 2 rows selected (0.009 seconds) > 0: jdbc:phoenix:localhost:59231> explain select /*+NO_INDEX*/ id, ts from > data where ts >= TIMESTAMP '2023-02-23 13:30:00' and ts < TIMESTAMP > '2023-02-23 13:40:00'; > +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > | PLAN > | EST_BYTES_READ | EST_ROWS_READ | > EST_INFO_TS | > +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > | CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN FULL SCAN OVER DATA > | null | null | null > | > | SERVER FILTER BY (TS >= TIMESTAMP '2023-02-23 13:30:00.000' AND TS < > TIMESTAMP '2023-02-23 13:40:00.000') | null | null | null > | > +---------------------------------------------------------------------------------------------------------------+----------------+---------------+-------------+ > {noformat} > Actually, the problem is much more basic: > {noformat} > create table ascpk (k varchar primary key); > create table descpk (k varchar primary key desc); > upsert into ascpk values ('a'); > upsert into ascpk values ('aa'); > upsert into ascpk values ('aaa'); > upsert into ascpk values ('aaab'); > //Same for descpk > select * from ascpk; > +------+ > | K | > +------+ > | a | > | aa | > | aaa | > | aaab | > +------+ > 4 rows selected (0.035 seconds) > select * from descpk; > +------+ > | K | > +------+ > | aaab | > | aaa | > | aa | > | a | > +------+ > select * from ascpk where k between 'a' and 'aaa'; > +-----+ > | K | > +-----+ > | a | > | aa | > | aaa | > +-----+ > 3 rows selected (0.026 seconds) select * from descpk where k between 'a' and > 'aaa'; > +---+ > | K | > +---+ > +---+ > No rows selected (0.022 seconds) {noformat} > The inversion logic for descending keys completely breaks down when we try to > use it for keyranges where start key is a prefix of the end key. -- This message was sent by Atlassian Jira (v8.20.10#820010)