nsivabalan commented on code in PR #10915:
URL: https://github.com/apache/hudi/pull/10915#discussion_r1568072805
##########
hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java:
##########
@@ -114,6 +114,24 @@ public Map<HoodieFileGroupId, List<HoodieCDCFileSplit>> extractCDCFileSplits() {
     ValidationUtils.checkState(commits != null, "Empty commits");
     Map<HoodieFileGroupId, List<HoodieCDCFileSplit>> fgToCommitChanges = new HashMap<>();
+
Review Comment:
So, I tried making a fix: first collect all replacedFileIds, then ignore
those file IDs while parsing the other commits' metadata.
That got me past the parsing issue, but the assertions on record counts
are still failing. Someone who has worked on CDC is probably better placed to
fix this.
Test case of interest:
TestCDCDataFrameSuite.testMORDataSourceWrite
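For anyone picking this up, here is a rough sketch of the two-pass idea described above: gather the replaced file IDs first, then skip them while walking the commits. It is not the code from the attempted fix and it does not use Hudi's classes. `CommitInfo`, `collectReplacedFileIds`, and `fileIdsToParse` are hypothetical names made up for illustration, and the file IDs are plain strings rather than `HoodieFileGroupId`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplacedFileIdSketch {

  /** Hypothetical stand-in for one commit's metadata; NOT Hudi's HoodieCommitMetadata. */
  static final class CommitInfo {
    final String instantTime;
    final List<String> writtenFileIds;   // file groups this commit wrote to
    final List<String> replacedFileIds;  // file groups this commit replaced (empty for a normal commit)

    CommitInfo(String instantTime, List<String> writtenFileIds, List<String> replacedFileIds) {
      this.instantTime = instantTime;
      this.writtenFileIds = writtenFileIds;
      this.replacedFileIds = replacedFileIds;
    }
  }

  /** Pass 1: gather every file ID replaced by any commit in the range. */
  static Set<String> collectReplacedFileIds(List<CommitInfo> commits) {
    Set<String> replaced = new HashSet<>();
    for (CommitInfo commit : commits) {
      replaced.addAll(commit.replacedFileIds);
    }
    return replaced;
  }

  /** Pass 2: list "instant:fileId" entries to parse, skipping replaced file IDs. */
  static List<String> fileIdsToParse(List<CommitInfo> commits) {
    Set<String> replaced = collectReplacedFileIds(commits);
    List<String> result = new ArrayList<>();
    for (CommitInfo commit : commits) {
      for (String fileId : commit.writtenFileIds) {
        if (!replaced.contains(fileId)) {
          result.add(commit.instantTime + ":" + fileId);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<CommitInfo> commits = Arrays.asList(
        new CommitInfo("001", Arrays.asList("fg-1", "fg-2"), Collections.<String>emptyList()),
        new CommitInfo("002", Arrays.asList("fg-3"), Arrays.asList("fg-1")));
    System.out.println(fileIdsToParse(commits));
  }
}
```

Note that, as written, this sketch drops every write to a replaced file group, including writes made before the replacement happened (commit 001's write to fg-1 in the example).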