[ https://issues.apache.org/jira/browse/HUDI-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Y Ethan Guo updated HUDI-8780:
------------------------------
Fix Version/s: 1.1.0
> RFC-83 Incremental Table Service
> --------------------------------
>
> Key: HUDI-8780
> URL: https://issues.apache.org/jira/browse/HUDI-8780
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Yue Zhang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.1.0
>
>
> In Hudi, scheduling Compaction and Clustering scans all partitions of the
> table by default. When a table has many historical partitions, such as the
> 640,000 in our production environment, this scanning and planning becomes
> very inefficient. For Flink, it often leads to checkpoint timeouts, which
> in turn cause data delays.
> Cleaning already supports processing only incremental partitions.
> This RFC draws on the design of Incremental Clean to generalize
> incremental partition processing to all table services, such as
> Clustering and Compaction.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)