[ 
https://issues.apache.org/jira/browse/HUDI-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Linleicheng updated HUDI-6892:
------------------------------
    Affects Version/s:     (was: 0.14.0)
             Priority: Critical  (was: Major)

> ExternalSpillableMap may cause data duplication when flink compaction
> ---------------------------------------------------------------------
>
>                 Key: HUDI-6892
>                 URL: https://issues.apache.org/jira/browse/HUDI-6892
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Linleicheng
>            Priority: Critical
>              Labels: pull-request-available
>
> To reproduce:
> 1. Fill the in-memory map with records until this.inMemoryMap.size() % 
> NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0.
> 2. Insert a record with key1 into ExternalSpillableMap. This triggers a size 
> estimate; make sure currentInMemoryMapSize is still greater than or 
> equal to maxInMemorySizeInBytes afterwards.
>    The record is spilled to disk.
> 3. Put key1 again with a smaller record, so that currentInMemoryMapSize 
> drops below maxInMemorySizeInBytes when it is put into 
> ExternalSpillableMap.
>    This time the record is put into the in-memory map.
>
> key1 now exists in both the in-memory map and the disk map, so the final 
> iteration over the map returns duplicate data.
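
The reproduction steps above can be sketched with a small toy model. This is a minimal sketch, not Hudi's ExternalSpillableMap: the class ToySpillableMap, the helper copiesOfKey1, the simplified size estimate (in-memory record count times the incoming record's length), and the evictStaleCopy flag are all assumptions made for illustration. It only shows how routing a put by a re-estimated size, without checking where the key already lives, can leave one key in both stores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SpillDuplicationSketch {

    /** Toy stand-in for a spillable map. Hypothetical; not Hudi's implementation. */
    static class ToySpillableMap {
        private final Map<String, String> inMemoryMap = new HashMap<>();
        // A plain HashMap stands in for the on-disk store.
        private final Map<String, String> diskMap = new HashMap<>();
        private final long maxInMemorySizeInBytes;
        private final boolean evictStaleCopy;

        ToySpillableMap(long maxInMemorySizeInBytes, boolean evictStaleCopy) {
            this.maxInMemorySizeInBytes = maxInMemorySizeInBytes;
            this.evictStaleCopy = evictStaleCopy;
        }

        void put(String key, String value) {
            // Assumed, simplified estimate: record count times incoming record size.
            long currentInMemoryMapSize = (long) inMemoryMap.size() * value.length();
            if (currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
                if (evictStaleCopy) {
                    // One possible mitigation: drop any copy spilled earlier.
                    diskMap.remove(key);
                }
                inMemoryMap.put(key, value);
            } else {
                diskMap.put(key, value);
            }
        }

        /** Iterates both stores back to back, so a key held in both appears twice. */
        List<String> keys() {
            List<String> all = new ArrayList<>(inMemoryMap.keySet());
            all.addAll(diskMap.keySet());
            return all;
        }
    }

    /** Replays steps 1 to 3 and counts how often key1 shows up during iteration. */
    static long copiesOfKey1(boolean evictStaleCopy) {
        ToySpillableMap map = new ToySpillableMap(40, evictStaleCopy);
        for (int i = 0; i < 4; i++) {
            map.put("filler" + i, "v");      // step 1: fill the in-memory map
        }
        map.put("key1", "0123456789");       // step 2: 4 * 10 = 40 >= 40, spilled
        map.put("key1", "01234");            // step 3: 4 * 5 = 20 < 40, kept in memory
        return map.keys().stream().filter("key1"::equals).count();
    }

    public static void main(String[] args) {
        System.out.println("copies without eviction: " + copiesOfKey1(false)); // 2
        System.out.println("copies with eviction: " + copiesOfKey1(true));     // 1
    }
}
```

In this sketch, removing the stale disk copy before the in-memory put is enough to keep key1 unique. Whether the actual fix for this issue takes that approach is not stated in this message.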



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
