usberkeley commented on code in PR #11359:
URL: https://github.com/apache/hudi/pull/11359#discussion_r1620794903
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/OptionsResolver.java:
##########
@@ -370,7 +370,7 @@ public static ConflictResolutionStrategy getConflictResolutionStrategy(Configura
* Returns whether to commit even when current batch has no data, for flink defaults false
*/
public static boolean allowCommitOnEmptyBatch(Configuration conf) {
- return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), false);
+ return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), HoodieWriteConfig.ALLOW_EMPTY_COMMIT.defaultValue());
Review Comment:
After correcting the default return value of
OptionsResolver#allowCommitOnEmptyBatch to "true",
StreamWriteOperatorCoordinator submits an empty Commit or DeltaCommit when a
checkpoint completes. When the program then queries the latest commit, that
commit is empty, so the returned result is also empty and the unit test
fails.
Modification plan:
When creating the Hudi table in the failing test, set
hoodie.allow.empty.commit = false.
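
A minimal sketch of how the table-level setting interacts with the new default. It does not use the real Flink or Hudi classes: a plain Map stands in for Flink's Configuration, and the class name AllowEmptyCommitSketch and its constants are hypothetical, with the key and the "true" default taken from this comment.

```java
import java.util.HashMap;
import java.util.Map;

public class AllowEmptyCommitSketch {
    // Key and default as described in this review comment (assumed, not read from Hudi).
    public static final String ALLOW_EMPTY_COMMIT_KEY = "hoodie.allow.empty.commit";
    public static final boolean ALLOW_EMPTY_COMMIT_DEFAULT = true;

    // Same shape as OptionsResolver#allowCommitOnEmptyBatch after the change:
    // fall back to the option's own default instead of a hard-coded false.
    public static boolean allowCommitOnEmptyBatch(Map<String, String> conf) {
        String value = conf.get(ALLOW_EMPTY_COMMIT_KEY);
        return value == null ? ALLOW_EMPTY_COMMIT_DEFAULT : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // No explicit setting: empty commits are allowed, so a checkpoint
        // without data would produce an empty commit.
        Map<String, String> defaults = new HashMap<>();
        System.out.println("default: " + allowCommitOnEmptyBatch(defaults));

        // Test table options with the explicit override from the modification plan.
        Map<String, String> testTable = new HashMap<>();
        testTable.put(ALLOW_EMPTY_COMMIT_KEY, "false");
        System.out.println("test table: " + allowCommitOnEmptyBatch(testTable));
    }
}
```

With the override in place, the test table keeps the old behavior (no empty commits), while every other table picks up the documented default.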
Other solutions:
We could change the default value of "hoodie.allow.empty.commit" to "false",
but I personally do not think that is good enough. The default value in both
the official documentation and the code is "true", and submitting an empty
commit by default is very important in Flink because it makes it possible to
track the entire life cycle. Therefore, I do not adopt this solution.