zhuanshenbsj1 commented on code in PR #9878:
URL: https://github.com/apache/hudi/pull/9878#discussion_r1363170585
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSink.java:
##########
@@ -106,17 +106,26 @@ public SinkRuntimeProvider getSinkRuntimeProvider(Context
context) {
// bootstrap
final DataStream<HoodieRecord> hoodieRecordDataStream =
Pipelines.bootstrap(conf, rowType, dataStream, context.isBounded(),
overwrite);
+
// write pipeline
pipeline = Pipelines.hoodieStreamWrite(conf, hoodieRecordDataStream);
- // compaction
+
+ // insert cluster mode
+ if (OptionsResolver.isInsertClusterMode(conf)) {
+ return Pipelines.clean(conf, pipeline);
+ }
+
+ // upsert mode
if (OptionsResolver.needsAsyncCompaction(conf)) {
// use synchronous compaction for bounded source.
if (context.isBounded()) {
conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
}
return Pipelines.compact(conf, pipeline);
- } else {
+ } else if (OptionsResolver.isLazyFailedWritesCleanPolicy(conf)) {
return Pipelines.clean(conf, pipeline);
+ } else {
+ return Pipelines.dummySink(pipeline);
Review Comment:
As with clustering, cleaning is performed wherever merging is performed
(inline or offline).
```
if (OptionsResolver.needsAsyncCompaction(conf)) { // 1
  // use synchronous compaction for bounded source.
  if (context.isBounded()) {
    conf.setBoolean(FlinkOptions.COMPACTION_ASYNC_ENABLED, false);
  }
  return Pipelines.compact(conf, pipeline);
} else if (OptionsResolver.isLazyFailedWritesCleanPolicy(conf)) { // 2.1
  return Pipelines.clean(conf, pipeline);
} else { // 2.2
  return Pipelines.dummySink(pipeline);
}
```
1. If Flink online asynchronous merging is turned on, the
clustering/compaction commit operator performs the clean.
2. If Flink online asynchronous merging is turned off, there are two
cases:
2.1 If lazy cleaning is enabled, the clean operator must be added so that
rollback is handled.
2.2 If lazy cleaning is disabled, rollback does not need to be considered.
Clean is called after the offline task finishes.
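
To make the three cases easy to check, here is a minimal stand-alone sketch of
the same branching. It is not Hudi code: `SinkTailSketch`, `Tail`, and
`choose` are hypothetical names, and the two booleans only stand in for
`OptionsResolver.needsAsyncCompaction(conf)` and
`OptionsResolver.isLazyFailedWritesCleanPolicy(conf)`.

```java
// Hypothetical stand-alone model of the branch in the snippet above; not Hudi code.
final class SinkTailSketch {

  /** Which operator ends the write pipeline (illustrative names only). */
  enum Tail { COMPACT, CLEAN, DUMMY_SINK }

  /**
   * @param asyncCompaction       stand-in for OptionsResolver.needsAsyncCompaction(conf)
   * @param lazyFailedWritesClean stand-in for OptionsResolver.isLazyFailedWritesCleanPolicy(conf)
   */
  static Tail choose(boolean asyncCompaction, boolean lazyFailedWritesClean) {
    if (asyncCompaction) {
      return Tail.COMPACT;      // case 1: the commit operator also cleans
    } else if (lazyFailedWritesClean) {
      return Tail.CLEAN;        // case 2.1: clean operator needed for rollback
    } else {
      return Tail.DUMMY_SINK;   // case 2.2: the offline job cleans later
    }
  }

  public static void main(String[] args) {
    // Print the chosen pipeline tail for every combination of the two flags.
    for (boolean async : new boolean[] {true, false}) {
      for (boolean lazy : new boolean[] {true, false}) {
        System.out.println("async=" + async + ", lazy=" + lazy + " -> " + choose(async, lazy));
      }
    }
  }
}
```

Note that when asynchronous compaction is on, the lazy-clean flag has no
effect on the result, which matches case 1 taking priority in the snippet.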