[ https://issues.apache.org/jira/browse/HUDI-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Y Ethan Guo updated HUDI-8780:
------------------------------
    Fix Version/s: 1.1.0

> RFC-83 Incremental Table Service
> --------------------------------
>
>                 Key: HUDI-8780
>                 URL: https://issues.apache.org/jira/browse/HUDI-8780
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Yue Zhang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.1.0
>
>
> In Hudi, when scheduling Compaction and Clustering, the default behavior is 
> to scan all partitions under the current table. When there are many 
> historical partitions, such as 640,000 in our production environment, this 
> scanning and planning operation becomes very inefficient. For Flink, it often 
> leads to checkpoint timeouts, resulting in data delays. 
> Cleaning, by contrast, can already be restricted to incremental 
> partitions.
> This RFC will draw on the design of Incremental Clean to generalize the 
> capability of processing incremental partitions to all table services, such 
> as Clustering and Compaction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
