usberkeley commented on code in PR #11359:
URL: https://github.com/apache/hudi/pull/11359#discussion_r1620734871
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/OptionsResolver.java:
##########
@@ -370,7 +370,7 @@ public static ConflictResolutionStrategy
getConflictResolutionStrategy(Configura
* Returns whether to commit even when current batch has no data, for flink defaults false
*/
public static boolean allowCommitOnEmptyBatch(Configuration conf) {
- return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), false);
+ return conf.getBoolean(HoodieWriteConfig.ALLOW_EMPTY_COMMIT.key(), HoodieWriteConfig.ALLOW_EMPTY_COMMIT.defaultValue());
Review Comment:
TestHoodieFlinkQuickstart passed before this change and fails after it.
The reason:
With the default return value of OptionsResolver#allowCommitOnEmptyBatch
corrected to "true", StreamWriteOperatorCoordinator submits an empty commit or
delta commit when a checkpoint completes. When the test then queries the latest
commit, that commit is empty, so the returned result is also empty and the unit
test fails.
Proposed fix:
When creating the Hudi table, set hoodie.allow.empty.commit = false.
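
To make the effect of the default concrete, here is a minimal sketch in plain
Java. It is not Hudi or Flink code: the class EmptyCommitDefaultSketch, its two
methods, and the Map-based options are hypothetical stand-ins for the real
Configuration lookup. Only the key hoodie.allow.empty.commit and the defaults
(false before the change, true after) are taken from this discussion.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the option lookup; not the real Hudi/Flink classes. */
public class EmptyCommitDefaultSketch {
    // Key and default value as described in this review thread.
    public static final String ALLOW_EMPTY_COMMIT_KEY = "hoodie.allow.empty.commit";
    public static final boolean ALLOW_EMPTY_COMMIT_DEFAULT = true;

    /** Models the lookup before the change: hard-coded fallback of false. */
    public static boolean allowCommitOnEmptyBatchOld(Map<String, String> options) {
        String value = options.get(ALLOW_EMPTY_COMMIT_KEY);
        return value == null ? false : Boolean.parseBoolean(value);
    }

    /** Models the lookup after the change: falls back to the option's own default. */
    public static boolean allowCommitOnEmptyBatchNew(Map<String, String> options) {
        String value = options.get(ALLOW_EMPTY_COMMIT_KEY);
        return value == null ? ALLOW_EMPTY_COMMIT_DEFAULT : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // Table options with nothing set explicitly.
        Map<String, String> options = new HashMap<>();
        System.out.println("old default: " + allowCommitOnEmptyBatchOld(options));
        System.out.println("new default: " + allowCommitOnEmptyBatchNew(options));

        // The proposed fix: set the option explicitly when creating the table.
        options.put(ALLOW_EMPTY_COMMIT_KEY, "false");
        System.out.println("explicit false: " + allowCommitOnEmptyBatchNew(options));
    }
}
```

With no explicit option, the old lookup disables empty commits and the new one
enables them. Setting the option to false on the table restores the old
behaviour for that table only and leaves the global default untouched.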
Alternative considered:
Change the default value of hoodie.allow.empty.commit to false. I do not think
this is good enough: the default in both the official documentation and the
code is true, and submitting empty commits by default is very important in
Flink because it allows the entire life cycle to be tracked. Therefore, I did
not adopt this solution.