usberkeley commented on code in PR #11359:
URL: https://github.com/apache/hudi/pull/11359#discussion_r1620794903


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/OptionsResolver.java:
##########
@@ -370,7 +370,7 @@ public static ConflictResolutionStrategy getConflictResolutionStrategy(Configura
    * Returns whether to commit even when current batch has no data, for flink defaults false
    */
   public static boolean allowCommitOnEmptyBatch(Configuration conf) {
-    return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), false);
+    return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), HoodieWriteConfig.ALLOW_EMPTY_COMMIT.defaultValue());

Review Comment:
   With the default returned by OptionsResolver#allowCommitOnEmptyBatch
   corrected to "true", StreamWriteOperatorCoordinator submits an empty commit
   or delta commit when a checkpoint completes and the batch has no data. When
   the test then queries the latest commit, that commit is empty, so the query
   returns an empty result and the unit test fails.
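   
   The failure mode described above can be illustrated with a minimal sketch.
   It does not use Hudi's timeline API: LatestCommitSketch, Commit, and
   readLatestCommit are hypothetical names, and the timeline is modeled as a
   plain list on the assumption that the test reads only the most recent commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for illustration only; not Hudi's timeline API.
class LatestCommitSketch {

  // One commit: an instant time plus the records it wrote.
  // An empty commit carries an empty record list.
  static class Commit {
    final String instant;
    final List<String> records;

    Commit(String instant, List<String> records) {
      this.instant = instant;
      this.records = records;
    }
  }

  // Assumption: the test reads only what the most recent commit wrote.
  static List<String> readLatestCommit(List<Commit> timeline) {
    if (timeline.isEmpty()) {
      return Collections.emptyList();
    }
    return timeline.get(timeline.size() - 1).records;
  }

  public static void main(String[] args) {
    List<Commit> timeline = new ArrayList<>();

    // A checkpoint with data produces a normal commit.
    timeline.add(new Commit("001", List.of("id1", "id2")));
    System.out.println(readLatestCommit(timeline)); // prints [id1, id2]

    // A later checkpoint with no data, with empty commits allowed,
    // appends an empty commit that becomes the latest one.
    timeline.add(new Commit("002", List.of()));
    System.out.println(readLatestCommit(timeline)); // prints []
  }
}
```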
   
   Modification plan:
   When creating a Hudi table, set hoodie.allow.empty.commit = false
   
   Alternative considered:
   We could instead change the default value of "hoodie.allow.empty.commit" to
   "false", but I do not think that is a good fix. The official documentation
   and the code both give "true" as the default, and submitting an empty commit
   by default is very important in Flink because it lets the entire life cycle
   be tracked. I therefore did not adopt this alternative.
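   
   To make the effect of the change and of the proposed test fix concrete, here
   is a minimal sketch of the lookup. It is not the Hudi or Flink API: a plain
   Map stands in for Flink's Configuration, the class and constant names are
   hypothetical, and the default of true is taken from the statement above about
   the documented and coded default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the resolver; a Map replaces Flink's Configuration.
class EmptyCommitDefaultSketch {

  static final String ALLOW_EMPTY_COMMIT_KEY = "hoodie.allow.empty.commit";

  // Assumption: mirrors the default of "true" stated in the review comment.
  static final boolean ALLOW_EMPTY_COMMIT_DEFAULT = true;

  // After the change: fall back to the option's own default instead of a
  // hard-coded false when the key is not set.
  static boolean allowCommitOnEmptyBatch(Map<String, String> conf) {
    String value = conf.get(ALLOW_EMPTY_COMMIT_KEY);
    return value == null ? ALLOW_EMPTY_COMMIT_DEFAULT : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Map<String, String> tableOptions = new HashMap<>();

    // Key not set: the option's default applies, so empty commits are allowed.
    System.out.println(allowCommitOnEmptyBatch(tableOptions)); // prints true

    // Proposed test fix: set the option explicitly when creating the table.
    tableOptions.put(ALLOW_EMPTY_COMMIT_KEY, "false");
    System.out.println(allowCommitOnEmptyBatch(tableOptions)); // prints false
  }
}
```

   Setting the option explicitly in the test keeps the shared default untouched
   while making the test independent of it.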



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
