[ https://issues.apache.org/jira/browse/HUDI-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Linleicheng closed HUDI-6892.
-----------------------------
Fix Version/s: 1.0.0
Resolution: Fixed
> ExternalSpillableMap may cause data duplication when flink compaction
> ---------------------------------------------------------------------
>
> Key: HUDI-6892
> URL: https://issues.apache.org/jira/browse/HUDI-6892
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Linleicheng
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Steps to reproduce:
> 1. Fill the in-memory map with records until this.inMemoryMap.size() %
> NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0.
> 2. Insert a record with key1 into ExternalSpillableMap. This triggers a size
> estimate; make sure currentInMemoryMapSize is still greater than or equal to
> maxInMemorySizeInBytes afterwards. The record is spilled to disk.
> 3. Put key1 again with a smaller record, so that the new size estimate brings
> currentInMemoryMapSize below maxInMemorySizeInBytes. This time the record is
> put into the in-memory map.
>
> key1 now exists both on disk and in memory, so the final iteration over the
> map returns it twice (data duplication).
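
The reproduction steps above can be modelled with a small, self-contained Java sketch. This is not the Hudi implementation: ToySpillableMap, its fields, the evictStaleDiskCopy flag, and the size estimate (simply the length of the latest key plus value) are all hypothetical simplifications chosen to show the put-to-disk, then put-to-memory sequence. The guard shown is one possible way to avoid the duplicate and is not necessarily the change made for this issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SpillDuplicationSketch {

    // Hypothetical toy model of a map that spills to "disk" when memory is full.
    static class ToySpillableMap {
        final Map<String, String> inMemory = new HashMap<>();
        final Map<String, String> onDisk = new HashMap<>(); // stand-in for the disk-backed map
        final long maxInMemoryBytes;
        final boolean evictStaleDiskCopy; // assumption: one possible guard against duplicates

        ToySpillableMap(long maxInMemoryBytes, boolean evictStaleDiskCopy) {
            this.maxInMemoryBytes = maxInMemoryBytes;
            this.evictStaleDiskCopy = evictStaleDiskCopy;
        }

        void put(String key, String value) {
            // Simplified estimate: re-estimated on every put from the latest record only.
            long estimatedPayload = key.length() + value.length();
            long currentInMemorySize = inMemory.size() * estimatedPayload;

            if (currentInMemorySize < maxInMemoryBytes || inMemory.containsKey(key)) {
                if (evictStaleDiskCopy) {
                    onDisk.remove(key); // drop an older spilled copy of the same key
                }
                inMemory.put(key, value);
            } else {
                onDisk.put(key, value);
            }
        }

        // Iteration concatenates both stores, as a spillable map would.
        List<String> keys() {
            List<String> all = new ArrayList<>(inMemory.keySet());
            all.addAll(onDisk.keySet());
            return all;
        }
    }

    // Runs the three reproduction steps and counts how often key1 is returned.
    static int countKey1(boolean evictStaleDiskCopy) {
        ToySpillableMap map = new ToySpillableMap(40, evictStaleDiskCopy);
        // Step 1: fill the in-memory map.
        for (int i = 0; i < 4; i++) {
            map.put("k" + i, "aaaaaaaa");
        }
        // Step 2: a large record for key1 pushes the estimate over the limit, so it spills.
        map.put("key1", "a-large-value-for-k1");
        // Step 3: a smaller record for key1 lowers the estimate, so it lands in memory.
        map.put("key1", "x");
        return Collections.frequency(map.keys(), "key1");
    }

    public static void main(String[] args) {
        System.out.println("without guard: key1 seen " + countKey1(false) + " time(s)");
        System.out.println("with guard:    key1 seen " + countKey1(true) + " time(s)");
    }
}
```

In this toy model the unguarded map returns key1 once from each store, while the guarded variant returns it once in total.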