[
https://issues.apache.org/jira/browse/FLINK-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959556#comment-16959556
]
Yu Li commented on FLINK-12692:
-------------------------------
[~wind_ljy] Sorry for the late response, just noticed. We will try best to
deliver this feature in 1.10 release. Btw, do you plan to try this out in your
product environment?
> Support disk spilling in HeapKeyedStateBackend
> ----------------------------------------------
>
> Key: FLINK-12692
> URL: https://issues.apache.org/jira/browse/FLINK-12692
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / State Backends
> Reporter: Yu Li
> Assignee: Yu Li
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {{HeapKeyedStateBackend}} is one of the two {{KeyedStateBackends}} in Flink,
> since state lives as Java objects on the heap and the de/serialization only
> happens during state snapshot and restore, it outperforms
> {{RocksDBKeyedStateBackend}} when all data could reside in memory.
> However, along with the advantage, {{HeapKeyedStateBackend}} also has its
> shortcomings, and the most painful one is the difficulty to estimate the
> maximum heap size (Xmx) to set, and we will suffer from GC impact once the
> heap memory is not enough to hold all state data. There’re several
> (inevitable) causes for such scenario, including (but not limited to):
> * Memory overhead of Java object representation (tens of times of the
> serialized data size).
> * Data flood caused by burst traffic.
> * Data accumulation caused by source malfunction.
> To resolve this problem, we propose a solution to support spilling state data
> to disk before heap memory is exhausted. We will monitor the heap usage and
> choose the coldest data to spill, and reload them when heap memory is
> regained after data removing or TTL expiration, automatically.
> More details please refer to the design doc and mailing list discussion.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)