[ https://issues.apache.org/jira/browse/PHOENIX-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani reassigned PHOENIX-7489:
-------------------------------------

    Assignee:     (was: Viraj Jasani)

> Add all partition ids internally to optimize full CDC Index scan queries
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-7489
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7489
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Viraj Jasani
>            Priority: Major
>
> Since PHOENIX-7425 introduced the partitioned CDC Index to eliminate salting,
> it is important to include PARTITION_ID() in addition to
> PHOENIX_ROW_TIMESTAMP() in the WHERE clause of the CDC query. Before
> PHOENIX-7425, providing only PHOENIX_ROW_TIMESTAMP() was sufficient because it
> was the rowkey prefix of the CDC Index table. That is no longer the case.
> If the user provides only PHOENIX_ROW_TIMESTAMP() in the WHERE clause, the
> query results in a full table scan over the CDC Index. Providing both
> PARTITION_ID() and PHOENIX_ROW_TIMESTAMP() results in a range scan.
> Not all clients may be aware of all the unique partition ids present in the
> CDC Index. Hence, even if a client provides only the timestamp range with the
> CDC query, the list of partition ids should be retrieved internally and used
> alongside the timestamp range to achieve efficient range scan performance.
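
The idea in the description can be sketched roughly as follows. This is only an illustration of turning one timestamp-only request into one bounded range per partition id; it is not Phoenix's implementation. The class and method names (CdcPartitionRanges, PartitionRange, expand) are hypothetical, and the sketch assumes the partition ids have already been looked up somewhere.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names come from the Phoenix code base.
class CdcPartitionRanges {

    // One bounded scan: a single partition id plus a half-open time range [startTs, endTs).
    static final class PartitionRange {
        final String partitionId;
        final long startTs;
        final long endTs;

        PartitionRange(String partitionId, long startTs, long endTs) {
            this.partitionId = partitionId;
            this.startTs = startTs;
            this.endTs = endTs;
        }
    }

    // Expands a timestamp-only request into one range per known partition id,
    // so that each range can be served by a range scan instead of a full scan.
    static List<PartitionRange> expand(List<String> partitionIds, long startTs, long endTs) {
        if (startTs >= endTs) {
            throw new IllegalArgumentException("startTs must be less than endTs");
        }
        List<PartitionRange> ranges = new ArrayList<>(partitionIds.size());
        for (String id : partitionIds) {
            ranges.add(new PartitionRange(id, startTs, endTs));
        }
        return ranges;
    }
}
```

With two partition ids and the range [100, 200), expand returns two ranges, one per partition, each carrying the same timestamp bounds. An empty partition list yields no ranges at all, which a real implementation would need to treat deliberately.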