Nicholas Jiang created HUDI-6158:
------------------------------------
Summary: Strengthen Flink clustering commit and rollback strategy
Key: HUDI-6158
URL: https://issues.apache.org/jira/browse/HUDI-6158
Project: Apache Hudi
Issue Type: Improvement
Components: flink
Reporter: Nicholas Jiang
Assignee: Nicholas Jiang
Fix For: 0.14.0
`ClusteringCommitSink` could strengthen its commit and rollback strategy in two
ways:
* Commit: Introduce `clusteringPlanCache` to cache the clustering plan for each
instant. `clusteringPlanCache` stores the mapping of instant_time ->
clusteringPlan.
* Rollback: Update `commitBuffer` so that it stores the mapping of
instant_time -> file_ids -> event. Use a map to collect the events because
rolling back intermediate clustering tasks generates corrupt events.
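
The two structures above can be sketched roughly as follows. This is a minimal,
self-contained illustration and not the actual `ClusteringCommitSink` code: the
`Plan` and `Event` classes, `loadPlan`, `onEvent`, and `reset` are hypothetical
stand-ins, and it assumes that a "corrupt" event is a stale event from a
rolled-back task attempt that a retried attempt later replaces under the same
file_ids key.

```java
import java.util.HashMap;
import java.util.Map;

class ClusteringCommitSketch {
    // Hypothetical stand-in for a clustering plan: only the number of
    // file groups that must report before the instant can be committed.
    static final class Plan {
        final int expectedFileGroups;
        Plan(int expectedFileGroups) { this.expectedFileGroups = expectedFileGroups; }
    }

    // Hypothetical stand-in for a clustering commit event.
    static final class Event {
        final String instant;
        final String fileIds;
        Event(String instant, String fileIds) {
            this.instant = instant;
            this.fileIds = fileIds;
        }
    }

    // instant_time -> clusteringPlan
    private final Map<String, Plan> clusteringPlanCache = new HashMap<>();
    // instant_time -> file_ids -> event
    private final Map<String, Map<String, Event>> commitBuffer = new HashMap<>();
    // Counts plan loads so the effect of the cache is observable.
    int planLoads = 0;

    // Hypothetical loader; a real sink would read the plan from the timeline.
    Plan loadPlan(String instant) {
        planLoads++;
        return new Plan(2);
    }

    // Buffers the event and returns true once every file group of the
    // instant has reported. A later event with the same file_ids overwrites
    // the earlier one, so a retried task cannot be counted twice.
    boolean onEvent(Event event) {
        Map<String, Event> events =
            commitBuffer.computeIfAbsent(event.instant, k -> new HashMap<>());
        events.put(event.fileIds, event);
        Plan plan = clusteringPlanCache.computeIfAbsent(event.instant, this::loadPlan);
        return events.size() == plan.expectedFileGroups;
    }

    // Number of distinct file_ids buffered for the instant.
    int bufferedCount(String instant) {
        Map<String, Event> events = commitBuffer.get(instant);
        return events == null ? 0 : events.size();
    }

    // Clears both structures for the instant after a commit or rollback.
    void reset(String instant) {
        commitBuffer.remove(instant);
        clusteringPlanCache.remove(instant);
    }
}
```

With a list-based buffer, a retried task would contribute two entries for the
same file_ids; keying the inner map by file_ids keeps only the latest one, and
the plan cache avoids reloading the plan on every incoming event.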
--
This message was sent by Atlassian Jira
(v8.20.10#820010)