Viraj Jasani created PHOENIX-7489:
-------------------------------------

             Summary: Add all partition ids internally if CDC query only includes timestamp range
                 Key: PHOENIX-7489
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7489
             Project: Phoenix
          Issue Type: Sub-task
            Reporter: Viraj Jasani

Since PHOENIX-7425 introduced the partitioned CDC Index to eliminate salting, a CDC query needs to include PARTITION_ID() in addition to PHOENIX_ROW_TIMESTAMP() in its WHERE clause. Before PHOENIX-7425, providing only PHOENIX_ROW_TIMESTAMP() was sufficient because it was the rowkey prefix of the CDC Index table. That is no longer the case.

If the user provides only PHOENIX_ROW_TIMESTAMP() in the WHERE clause, the query results in a full table scan over the CDC Index. Providing both PARTITION_ID() and PHOENIX_ROW_TIMESTAMP() results in a range scan.

Not all clients may be aware of all the unique partition ids present in the CDC Index. Hence, even if a client provides only the timestamp range in the CDC query, the list of partition ids should be retrieved internally and used alongside the timestamp range so that the query still runs as an efficient range scan.
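
To illustrate the difference between the two access patterns, here is a toy model in Python. It is not Phoenix code. It assumes the CDC Index rowkey is ordered as (partition id, timestamp), and the partition ids, timestamps, and function names are all made up for the example. The sketch compares how many rows are examined by a timestamp-only filter versus one range scan per partition id, where the partition ids are looked up internally rather than supplied by the client.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for the CDC Index: rows sorted by (partition_id, timestamp).
# The ids and timestamps are invented for illustration only.
INDEX = sorted([
    ("p1", 100), ("p1", 200), ("p1", 300),
    ("p2", 150), ("p2", 250),
    ("p3", 50), ("p3", 275), ("p3", 400),
])


def full_scan(index, ts_start, ts_end):
    """Timestamp-only filter: every row has to be examined."""
    examined = 0
    hits = []
    for row in index:
        examined += 1
        if ts_start <= row[1] <= ts_end:
            hits.append(row)
    return hits, examined


def distinct_partition_ids(index):
    """Hypothetical internal lookup of all partition ids in the index."""
    return sorted({pid for pid, _ in index})


def partitioned_range_scan(index, partition_ids, ts_start, ts_end):
    """One range scan per partition id over [ts_start, ts_end]."""
    examined = 0
    hits = []
    for pid in partition_ids:
        # Binary search is possible because partition_id is the key prefix.
        lo = bisect_left(index, (pid, ts_start))
        hi = bisect_right(index, (pid, ts_end))
        examined += hi - lo
        hits.extend(index[lo:hi])
    return hits, examined


if __name__ == "__main__":
    print(full_scan(INDEX, 200, 300))
    pids = distinct_partition_ids(INDEX)
    print(partitioned_range_scan(INDEX, pids, 200, 300))
```

In this model both functions return the same rows, but the per-partition variant only touches rows inside each partition's timestamp window. How Phoenix would actually retrieve the partition ids and build the scan ranges is left open by this issue.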