[
https://issues.apache.org/jira/browse/FLINK-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jark Wu updated FLINK-17099:
----------------------------
Fix Version/s: (was: 1.12.0)
1.13.0
> Refactoring State TTL solution in Group Agg、Deduplication、TopN operators
> replace Timer with StateTtlConfig
> ----------------------------------------------------------------------------------------------------------
>
> Key: FLINK-17099
> URL: https://issues.apache.org/jira/browse/FLINK-17099
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Runtime
> Affects Versions: 1.9.0, 1.10.0
> Reporter: dalongliu
> Assignee: dalongliu
> Priority: Major
> Fix For: 1.13.0
>
>
> At the moment, there are two ways to clean up state:
> 1) Registering a processing-time timer and cleaning up entries when the
> timer fires.
> - pros: can clean up multiple states at the same time (consistent state)
> - cons: timer space grows with the number of keys, which may lead to OOM
> (heap timers).
> - used in Group Aggregation, Over Aggregation, TopN
> 2) Using the {{StateTtlConfig}} provided by DataStream [1].
> - pros: decouples the state TTL logic from record processing, easy to
> program (take a look at the old planner's NonWindowJoin, which bundles the
> TTL timestamp with records in MapState).
> - cons: can't clean up multiple states at the same time.
> - used in Stream-Stream Joins.
> Although the timer solution can clean up multiple states at the same time,
> it can lead to OOM when there are a great many state keys. Moreover,
> StateTtlConfig is already used for stream-stream joins and will be used in
> more operators. Therefore, to unify the state TTL solution, simplify the
> implementation, and improve code readability, we should refactor state
> cleanup to use StateTtlConfig instead of processing-time timers in the
> Group Aggregation, Deduplication, and TopN operators.
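
The trade-off described above can be illustrated with a small plain-Java model. This is only a sketch and not Flink code: the names StateCleanupSketch, TimerCleanup, and TtlState are hypothetical, time is passed in explicitly as a long, and TTL expiry is modelled as a lazy check on read, which is an assumption made for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of the two cleanup strategies (hypothetical names, not Flink APIs). */
class StateCleanupSketch {

    /** Strategy 1: one timer per key; when it fires, all states of that key are cleared together. */
    static class TimerCleanup {
        final Map<String, Long> countState = new HashMap<>();
        final Map<String, Long> sumState = new HashMap<>();
        // key -> firing time; this map grows with the number of keys
        final Map<String, Long> timers = new HashMap<>();

        void update(String key, long value, long now, long retentionMs) {
            countState.merge(key, 1L, Long::sum);
            sumState.merge(key, value, Long::sum);
            timers.put(key, now + retentionMs); // (re-)register the cleanup timer
        }

        /** Fires every timer that is due and clears both states for its key. */
        void advanceTo(long now) {
            timers.entrySet().removeIf(timer -> {
                if (timer.getValue() > now) {
                    return false; // not due yet
                }
                countState.remove(timer.getKey());
                sumState.remove(timer.getKey());
                return true;
            });
        }
    }

    /** Strategy 2: each state keeps its own last-write timestamp and expires independently. */
    static class TtlState<V> {
        private final long ttlMs;
        private final Map<String, V> values = new HashMap<>();
        private final Map<String, Long> lastWrite = new HashMap<>();

        TtlState(long ttlMs) {
            this.ttlMs = ttlMs;
        }

        void put(String key, V value, long now) {
            values.put(key, value);
            lastWrite.put(key, now);
        }

        /** Returns null if the entry is missing or older than the TTL (lazy expiry on read). */
        V get(String key, long now) {
            Long ts = lastWrite.get(key);
            if (ts == null) {
                return null;
            }
            if (now - ts >= ttlMs) {
                values.remove(key);
                lastWrite.remove(key);
                return null;
            }
            return values.get(key);
        }
    }
}
```

In this model, TimerCleanup holds one timer entry per key (the memory cost noted above) but always drops countState and sumState together. Two TtlState instances written at different times need no timer map, yet one can expire while the other is still readable, which is the "can't clean up multiple states at the same time" drawback.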
--
This message was sent by Atlassian Jira
(v8.3.4#803005)