Viraj Jasani created PHOENIX-7489:
-------------------------------------
Summary: Add all partition ids internally if CDC query only
includes timestamp range
Key: PHOENIX-7489
URL: https://issues.apache.org/jira/browse/PHOENIX-7489
Project: Phoenix
Issue Type: Sub-task
Reporter: Viraj Jasani
Since PHOENIX-7425 introduced the partitioned CDC Index to eliminate salting, it
is important to include PARTITION_ID() in addition to PHOENIX_ROW_TIMESTAMP() in
the WHERE clause of a CDC query. Before PHOENIX-7425, providing only
PHOENIX_ROW_TIMESTAMP() was sufficient because it was the rowkey prefix of the
CDC Index table. That is no longer the case.
If the user provides only PHOENIX_ROW_TIMESTAMP() in the WHERE clause, the query
results in a full table scan over the CDC Index. Providing both PARTITION_ID()
and PHOENIX_ROW_TIMESTAMP() results in a range scan.
Not all clients may be aware of all the unique partition ids present in the CDC
Index. Hence, even if a client provides only the timestamp range in the CDC
query, the list of partition ids should be retrieved internally and used
alongside the timestamp range so that the query runs as an efficient range scan.
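As a rough illustration of the intended rewrite, the sketch below combines a
list of partition ids with a client-supplied timestamp predicate. It is not
Phoenix's implementation: the class CdcPredicateSketch and the method
withPartitionIds are hypothetical names, the partition ids are assumed to have
been retrieved already, and the use of PARTITION_ID() with an IN list is an
assumption about how the combined predicate might be written.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch only; not part of Phoenix.
final class CdcPredicateSketch {

    // Builds a WHERE predicate that adds the given partition ids to a
    // timestamp-only predicate. With no partition ids, the timestamp
    // predicate is returned unchanged.
    static String withPartitionIds(List<String> partitionIds, String timestampPredicate) {
        if (partitionIds.isEmpty()) {
            return timestampPredicate;
        }
        // Quote each id as a SQL string literal, doubling embedded single quotes.
        String inList = partitionIds.stream()
                .map(id -> "'" + id.replace("'", "''") + "'")
                .collect(Collectors.joining(", "));
        // Parenthesize the original predicate so its own AND/OR structure is preserved.
        return "PARTITION_ID() IN (" + inList + ") AND (" + timestampPredicate + ")";
    }
}
```

For example, passing the ids p1 and p2 with the predicate
PHOENIX_ROW_TIMESTAMP() >= ? yields
PARTITION_ID() IN ('p1', 'p2') AND (PHOENIX_ROW_TIMESTAMP() >= ?).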