virajjasani opened a new pull request, #2070: URL: https://github.com/apache/phoenix/pull/2070
Jira: PHOENIX-7489

Since [PHOENIX-7425](https://issues.apache.org/jira/browse/PHOENIX-7425) introduced the partitioned CDC Index to eliminate salting, it is important to include PARTITION_ID() in addition to PHOENIX_ROW_TIMESTAMP() in the WHERE clause of the CDC query. Before [PHOENIX-7425](https://issues.apache.org/jira/browse/PHOENIX-7425), providing only PHOENIX_ROW_TIMESTAMP() was sufficient because it was the rowkey prefix of the CDC Index table. That is no longer the case. If the user provides only PHOENIX_ROW_TIMESTAMP() in the WHERE clause, the query results in a full table scan over the CDC Index. Providing both PARTITION_ID() and PHOENIX_ROW_TIMESTAMP() results in a range scan.

Not all clients may be aware of every unique partition id present in the CDC Index. Hence, even if a client provides only the timestamp range with the CDC query, the list of partition ids should be retrieved internally and used alongside the timestamp range so that the query runs as an efficient range scan.
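To illustrate the query shape described above, here is a minimal sketch of a helper that builds a parameterized CDC query constrained by both partition ids and a timestamp range. The function name `build_cdc_query`, the CDC object name `MY_CDC`, the use of `SELECT *`, and the half-open timestamp bounds are assumptions made for illustration; they are not taken from the pull request, and the exact SQL that Phoenix generates internally may differ.

```python
from typing import Sequence


def build_cdc_query(cdc_name: str, partition_ids: Sequence[str]) -> str:
    """Build a parameterized CDC query restricted by partition id and timestamp.

    Hypothetical helper: bind values are expected in the order
    (partition ids..., start timestamp, end timestamp).
    """
    if not partition_ids:
        # Without partition ids the predicate would cover only the timestamp,
        # which is the full-scan case the PR description warns about.
        raise ValueError("at least one partition id is required")

    # One bind placeholder per partition id.
    placeholders = ", ".join("?" for _ in partition_ids)
    return (
        f"SELECT * FROM {cdc_name} "
        f"WHERE PARTITION_ID() IN ({placeholders}) "
        "AND PHOENIX_ROW_TIMESTAMP() >= ? "
        "AND PHOENIX_ROW_TIMESTAMP() < ?"
    )
```

For example, `build_cdc_query("MY_CDC", ["p1", "p2"])` yields a statement with two partition id placeholders followed by the two timestamp placeholders. A client that does not know the partition ids would omit the `PARTITION_ID()` predicate, which is the case this change is meant to handle internally.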