Sagar Sumit created HUDI-2496:
---------------------------------
Summary: Inserts are precombined even with dedup disabled
Key: HUDI-2496
URL: https://issues.apache.org/jira/browse/HUDI-2496
Project: Apache Hudi
Issue Type: Bug
Reporter: Sagar Sumit
Test case by [~xushiyan] : https://github.com/apache/hudi/pull/3723/files
RCA by [~shivnarayan] :
HoodieMergeHandle stores incoming records in a hashmap keyed by record key,
so incoming records that share a record key collapse into a single entry.
As a result, duplicates in the 1st batch remain intact, but for the 2nd
batch only unique records are considered, and these are later concatenated
with the 1st batch.
https://github.com/apache/hudi/blob/36be28712196ff4427c41b0aa885c7fcd7356d7f/hudi-[…]-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java
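
A minimal sketch of the behaviour described in the RCA, not the actual Hudi code. The names MergeDedupSketch, Rec, writeFirstBatch, mergeSecondBatch and keyToNewRecords are illustrative assumptions. The 1st batch is written as-is, while the 2nd batch passes through a map keyed by record key before being appended. Which duplicate survives (last one wins) is an artifact of this sketch and not a claim about Hudi.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeDedupSketch {

    // Hypothetical stand-in for a record: a record key plus a payload.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    // 1st batch: records are written unchanged, so duplicate keys survive.
    static List<Rec> writeFirstBatch(List<Rec> incoming) {
        return new ArrayList<>(incoming);
    }

    // 2nd batch: incoming records are first loaded into a map keyed by
    // record key, so records sharing a key overwrite each other. The
    // surviving records are then appended to what is already in the file.
    static List<Rec> mergeSecondBatch(List<Rec> existing, List<Rec> incoming) {
        Map<String, Rec> keyToNewRecords = new LinkedHashMap<>();
        for (Rec r : incoming) {
            keyToNewRecords.put(r.key, r); // implicit dedup happens here
        }
        List<Rec> result = new ArrayList<>(existing);
        result.addAll(keyToNewRecords.values());
        return result;
    }

    public static void main(String[] args) {
        List<Rec> file = writeFirstBatch(Arrays.asList(
                new Rec("a", "1"), new Rec("a", "2")));
        System.out.println("after 1st batch: " + file);

        file = mergeSecondBatch(file, Arrays.asList(
                new Rec("b", "1"), new Rec("b", "2"), new Rec("b", "3")));
        System.out.println("after 2nd batch: " + file);
    }
}
```

In this sketch the file ends up with 3 records instead of the 5 that were inserted: both "a" records from the 1st batch are kept, but the three "b" records in the 2nd batch are reduced to one.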
--
This message was sent by Atlassian Jira
(v8.3.4#803005)