[
https://issues.apache.org/jira/browse/HUDI-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
satish updated HUDI-5318:
-------------------------
Fix Version/s: 0.13.0
(was: 0.12.2)
> Clustering scheduling currently lists all partitions in the table even when
> PARTITION_SELECTED is set
> ----------------------------------------------------------------------------------------
>
> Key: HUDI-5318
> URL: https://issues.apache.org/jira/browse/HUDI-5318
> Project: Apache Hudi
> Issue Type: Bug
> Components: clustering
> Reporter: Qijun Fu
> Assignee: Qijun Fu
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.13.0
>
>
> Currently, PartitionAwareClusteringPlanStrategy lists all partitions in the
> table whether or not PARTITION_SELECTED is set. Listing all partitions in the
> dataset is a very expensive operation when the number of partitions is huge.
> We can skip listing all partitions when PARTITION_SELECTED is set, so that
> clustering scheduling can benefit a lot from partition pruning.
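
A minimal sketch of the idea described above, not the actual Hudi patch. The
class name, the method candidatePartitions, and the Supplier that stands in
for the expensive full-table partition listing are all hypothetical and are
not part of Hudi's API. The sketch only shows the control flow: when an
explicit list of selected partitions is present, the full listing is never
invoked.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class PartitionPruningSketch {

    /**
     * Hypothetical helper (not Hudi's API): returns the partitions to consider
     * for clustering. The expensive full listing is only triggered when no
     * partitions were explicitly selected.
     */
    static List<String> candidatePartitions(List<String> selectedPartitions,
                                            Supplier<List<String>> listAllPartitions) {
        if (selectedPartitions != null && !selectedPartitions.isEmpty()) {
            // Selected partitions are configured: skip the full listing entirely.
            return selectedPartitions;
        }
        // Nothing selected: fall back to listing every partition in the table.
        return listAllPartitions.get();
    }

    public static void main(String[] args) {
        // Counts how often the (simulated) expensive listing is invoked.
        AtomicInteger fullListings = new AtomicInteger();
        Supplier<List<String>> expensiveListing = () -> {
            fullListings.incrementAndGet();
            return Arrays.asList("2022/12/01", "2022/12/02", "2022/12/03");
        };

        // With a selection, the supplier is never called.
        List<String> pruned = candidatePartitions(
                Collections.singletonList("2022/12/02"), expensiveListing);
        System.out.println(pruned + " fullListings=" + fullListings.get());

        // Without a selection, the full listing runs once.
        List<String> all = candidatePartitions(
                Collections.emptyList(), expensiveListing);
        System.out.println(all + " fullListings=" + fullListings.get());
    }
}
```

Passing the listing as a Supplier keeps it lazy, so the cost of enumerating
every partition is paid only on the fallback path.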