hbgstc123 commented on code in PR #8568:
URL: https://github.com/apache/hudi/pull/8568#discussion_r1177699120


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringCommitSink.java:
##########
@@ -179,7 +179,7 @@ private void doCommit(String instant, HoodieClusteringPlan clusteringPlan, List<
         TableServiceType.CLUSTER, writeMetadata.getCommitMetadata().get(), table, instant);
 
     // whether to clean up the input base parquet files used for clustering
-    if (!conf.getBoolean(FlinkOptions.CLEAN_ASYNC_ENABLED)) {
+    if (!conf.getBoolean(FlinkOptions.CLEAN_ASYNC_ENABLED) && !isCleaning) {
       LOG.info("Running inline clean");

Review Comment:
   Multiple cleaning tasks may be running against the same clean instant, 
which is unnecessary.
   Concurrent cleans can also leave a tmp file such as 
`.hoodie/20230425173418352.clean.tmp` in the timeline: the slower clean task 
fails to rename it to the final `20230425173418352.clean` because an earlier 
clean task has already created that file.
   We could also fix the rename issue in `HoodieActiveTimeline.transitionState` 
by deleting the tmp file when the rename is unsuccessful.
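
   A minimal sketch of the "delete the tmp file when the rename fails" idea, 
assuming a local filesystem and plain `java.nio.file` rather than the Hadoop 
`FileSystem` API that Hudi uses. The class `TmpRenameSketch` and the method 
`renameOrCleanUp` are hypothetical names for illustration only; this is not the 
code in `HoodieActiveTimeline.transitionState`.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hudi: illustrates cleaning up a leftover
// tmp file when another task has already created the final instant file.
public class TmpRenameSketch {

  /**
   * Tries to move tmp to finalFile.
   *
   * @return true if this caller completed the rename; false if finalFile
   *         already existed, in which case the tmp file is deleted instead.
   */
  static boolean renameOrCleanUp(Path tmp, Path finalFile) throws IOException {
    try {
      // Without REPLACE_EXISTING, Files.move fails if the target exists.
      Files.move(tmp, finalFile);
      return true;
    } catch (FileAlreadyExistsException e) {
      // An earlier task won the race: remove the stale tmp file so it does
      // not linger in the timeline directory.
      Files.deleteIfExists(tmp);
      return false;
    }
  }
}
```

   With this shape, the slower task returns `false` and leaves no `.clean.tmp` 
behind, while the content written by the earlier task stays untouched. The 
exists-check and rename are not atomic here, so a real fix would still need to 
respect the guarantees of the underlying storage.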


