vinothchandar commented on a change in pull request #2296:
URL: https://github.com/apache/hudi/pull/2296#discussion_r570476603
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
##########
@@ -173,6 +173,10 @@ public boolean commitStats(String instantTime,
List<HoodieWriteStat> stats, Opti
   public boolean commitStats(String instantTime, List<HoodieWriteStat> stats, Option<Map<String, String>> extraMetadata,
                              String commitActionType, Map<String, List<String>> partitionToReplaceFileIds) {
+ // Skip the empty commit
+ if (stats.isEmpty()) {
Review comment:
I think there was an explicit ask to allow empty commits before. Take
DeltaStreamer, which stores its Kafka checkpoint offsets in the commit
metadata. If we don't commit when stats are empty, the checkpoint will never
advance. For example, the transformer in DeltaStreamer could filter out all
records read in a batch, which leads to an empty commit even though the Kafka
offsets have advanced. So it's not good to do this, IMO.
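
To make the concern concrete, here is a minimal sketch. It is not Hudi code: `CheckpointSketch`, `Timeline`, `commit`, `lastCheckpoint`, `runFilteredBatches` and the `"checkpoint"` key are all hypothetical stand-ins, and the timeline is just an in-memory list. It only illustrates how skipping a commit with empty stats also drops the checkpoint carried in that commit's extra metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class CheckpointSketch {
    // Hypothetical metadata key; the real key used by the ingestion tool may differ.
    static final String CHECKPOINT_KEY = "checkpoint";

    // In-memory stand-in for a commit timeline: one metadata map per completed commit.
    static final class Timeline {
        private final List<Map<String, String>> commits = new ArrayList<>();
        private final boolean skipEmptyCommits;

        Timeline(boolean skipEmptyCommits) {
            this.skipEmptyCommits = skipEmptyCommits;
        }

        // Mirrors the shape of the proposed change: return early when stats are empty.
        boolean commit(List<String> writeStats, Map<String, String> extraMetadata) {
            if (skipEmptyCommits && writeStats.isEmpty()) {
                return false; // the extra metadata, including the checkpoint, is never persisted
            }
            commits.add(new HashMap<>(extraMetadata));
            return true;
        }

        // The next ingestion round resumes from the checkpoint of the latest commit that has one.
        Optional<String> lastCheckpoint() {
            for (int i = commits.size() - 1; i >= 0; i--) {
                String checkpoint = commits.get(i).get(CHECKPOINT_KEY);
                if (checkpoint != null) {
                    return Optional.of(checkpoint);
                }
            }
            return Optional.empty();
        }
    }

    // Two rounds: the first writes data, the second reads up to offset 200
    // but the transformer filters out every record, so there are no write stats.
    static String runFilteredBatches(boolean skipEmptyCommits) {
        Timeline timeline = new Timeline(skipEmptyCommits);
        timeline.commit(Collections.singletonList("stat-1"),
                Collections.singletonMap(CHECKPOINT_KEY, "100"));
        timeline.commit(Collections.<String>emptyList(),
                Collections.singletonMap(CHECKPOINT_KEY, "200"));
        return timeline.lastCheckpoint().orElse("none");
    }

    public static void main(String[] args) {
        System.out.println(runFilteredBatches(false)); // prints 200: checkpoint advanced
        System.out.println(runFilteredBatches(true));  // prints 100: checkpoint stuck
    }
}
```

With empty commits skipped, the next round would resume from offset 100 and re-read the same fully filtered range again.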