zhuanshenbsj1 commented on code in PR #9878:
URL: https://github.com/apache/hudi/pull/9878#discussion_r1363170585


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSink.java:
##########
@@ -106,17 +106,26 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context context) {
       // bootstrap
       final DataStream<HoodieRecord> hoodieRecordDataStream =
           Pipelines.bootstrap(conf, rowType, dataStream, context.isBounded(), overwrite);
+
       // write pipeline
       pipeline = Pipelines.hoodieStreamWrite(conf, hoodieRecordDataStream);
-      // compaction
+
+      // insert cluster mode
+      if (OptionsResolver.isInsertClusterMode(conf)) {
+        return Pipelines.clean(conf, pipeline);
+      }
+
+      // upsert mode
       if (OptionsResolver.needsAsyncCompaction(conf)) {
         // use synchronous compaction for bounded source.
         if (context.isBounded()) {
           conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
         }
         return Pipelines.compact(conf, pipeline);
-      } else {
+      } else if (OptionsResolver.isLazyFailedWritesCleanPolicy(conf)) {
         return Pipelines.clean(conf, pipeline);
+      } else {
+        return Pipelines.dummySink(pipeline);

Review Comment:
   Similar to clustering, cleaning is performed wherever merging is performed (inline or offline).
   
   ```
         if (OptionsResolver.needsAsyncCompaction(conf)) { // 1
           // use synchronous compaction for bounded source.
           if (context.isBounded()) {
             conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
           }
           return Pipelines.compact(conf, pipeline);
         } else if (OptionsResolver.isLazyFailedWritesCleanPolicy(conf)) { // 2.1
           return Pipelines.clean(conf, pipeline);
         } else { // 2.2
           return Pipelines.dummySink(pipeline);
         }
   
   ```
   
   1. If Flink online asynchronous merging is turned on, the cluster/compaction commit operator performs the clean.
   2. If Flink online asynchronous merging is turned off, there are two cases:
    2.1 Lazy cleaning is enabled: the clean operator must be added so that rollback is handled.
    2.2 Lazy cleaning is disabled: rollback does not need to be considered, and clean is called after the offline task completes.
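
The cases above can be summarized as a small decision function. This is only a dependency-free sketch: `SinkTailSketch`, `Tail`, and `chooseTail` are hypothetical names, not Hudi APIs, and the three booleans stand in for the `OptionsResolver` checks quoted in the diff. It also includes the insert cluster mode branch from the diff, which returns the clean pipeline before the upsert cases are considered.

```java
// Hypothetical model of the branching in getSinkRuntimeProvider.
// None of these names exist in Hudi; the booleans are stand-ins for
// OptionsResolver.isInsertClusterMode, needsAsyncCompaction and
// isLazyFailedWritesCleanPolicy.
final class SinkTailSketch {

  /** Which operator terminates the write pipeline. */
  enum Tail { COMPACT, CLEAN, DUMMY_SINK }

  static Tail chooseTail(boolean insertClusterMode,
                         boolean needsAsyncCompaction,
                         boolean lazyFailedWritesCleanPolicy) {
    if (insertClusterMode) {
      // Insert cluster mode: the diff attaches the clean operator directly.
      return Tail.CLEAN;
    }
    if (needsAsyncCompaction) {
      // Case 1: the compaction commit operator also takes care of cleaning.
      return Tail.COMPACT;
    }
    if (lazyFailedWritesCleanPolicy) {
      // Case 2.1: a clean operator is needed so failed writes get rolled back.
      return Tail.CLEAN;
    }
    // Case 2.2: nothing to roll back online; the offline job cleans later.
    return Tail.DUMMY_SINK;
  }

  public static void main(String[] args) {
    System.out.println(chooseTail(false, true, false));  // prints COMPACT
    System.out.println(chooseTail(false, false, true));  // prints CLEAN
    System.out.println(chooseTail(false, false, false)); // prints DUMMY_SINK
    System.out.println(chooseTail(true, false, false));  // prints CLEAN
  }
}
```

The sketch leaves out the bounded-source adjustment (switching `COMPACTION_ASYNC_ENABLED` off), since that changes how compaction runs rather than which branch is taken.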



