[jira] [Updated] (FLINK-17099) Refactoring State TTL solution in Group Agg、Deduplication、TopN operators replace Timer with StateTtlConfig

Jark Wu (Jira) Mon, 16 Nov 2020 05:42:22 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jark Wu updated FLINK-17099:
----------------------------
    Fix Version/s:     (was: 1.12.0)
                   1.13.0

> Refactoring State TTL solution in Group Agg、Deduplication、TopN operators 
> replace Timer with StateTtlConfig
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-17099
>                 URL: https://issues.apache.org/jira/browse/FLINK-17099
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: dalongliu
>            Assignee: dalongliu
>            Priority: Major
>             Fix For: 1.13.0
>
>
> At the moment, there are two ways to clean up state.
> 1) Registering a processing-time timer and cleaning up entries when the timer 
> fires.
>  - pros: can clean up multiple states at the same time (state stays consistent)
>  - cons: timer space grows with the number of keys, which may lead to OOM (heap 
> timer).
>  - used in Group Aggregation, Over Aggregation, TopN
> 2) Using the {{StateTtlConfig}} provided by DataStream [1].
>  - pros: decouples the state TTL logic from record processing and is easy to 
> program (compare the old planner's NonWindowJoin, which bundles the TTL 
> timestamp with records in MapState).
>  - cons: can't clean up multiple states at the same time.
>  - used in Stream-Stream Joins.
> Although the timer solution can clean up multiple states at the same time, 
> it can lead to OOM when there are a great many state keys. 
> Besides, StateTtlConfig is already used in the stream-stream join case and 
> will be used in more operators. Therefore, in order to unify the state TTL 
> solution, simplify the implementation, and improve code readability, we 
> should refactor state cleanup to use StateTtlConfig instead of 
> processing-time timers in the Group Aggregation, Deduplication, and TopN 
> operators.
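
The trade-off described above can be illustrated with a small toy model. This is only a sketch in plain Java and does not use or reproduce any Flink code: the class names (StateCleanupSketch, TimerCleanup, TtlState) and the count/sum states are hypothetical, and time is passed in explicitly as a number instead of being read from a clock. It assumes that the timer approach keeps one cleanup entry per key and that the TTL approach stores a last-write timestamp next to each state entry and checks it lazily on read.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the two cleanup strategies. Hypothetical names, not Flink code. */
public class StateCleanupSketch {

    /** Approach 1: one timer per key; when it fires, all states of that key are cleared together. */
    static final class TimerCleanup {
        private final long retentionMillis;
        private final Map<String, Long> timers = new HashMap<>(); // key -> cleanup time
        final Map<String, Long> counts = new HashMap<>();
        final Map<String, Long> sums = new HashMap<>();

        TimerCleanup(long retentionMillis) {
            this.retentionMillis = retentionMillis;
        }

        void update(String key, long value, long now) {
            counts.merge(key, 1L, Long::sum);
            sums.merge(key, value, Long::sum);
            // Every live key owns a timer entry, so timer space grows with the key count.
            timers.put(key, now + retentionMillis);
        }

        /** Fires all due timers and clears both states for each expired key. */
        void onTimer(long now) {
            timers.entrySet().removeIf(e -> {
                if (e.getValue() > now) {
                    return false;
                }
                counts.remove(e.getKey());
                sums.remove(e.getKey());
                return true;
            });
        }

        int timerCount() {
            return timers.size();
        }
    }

    /** Approach 2: each state tracks its own last-write time and expires lazily on read. */
    static final class TtlState<V> {
        private final long ttlMillis;
        private final Map<String, V> values = new HashMap<>();
        private final Map<String, Long> lastWrite = new HashMap<>();

        TtlState(long ttlMillis) {
            this.ttlMillis = ttlMillis;
        }

        void put(String key, V value, long now) {
            values.put(key, value);
            lastWrite.put(key, now);
        }

        /** Returns null if the entry is absent or expired; an expired entry is dropped. */
        V get(String key, long now) {
            Long written = lastWrite.get(key);
            if (written == null) {
                return null;
            }
            if (now - written >= ttlMillis) {
                values.remove(key);
                lastWrite.remove(key);
                return null;
            }
            return values.get(key);
        }
    }

    public static void main(String[] args) {
        // Timer approach: both states for key "a" disappear in the same callback.
        TimerCleanup timerBased = new TimerCleanup(100);
        timerBased.update("a", 5, 0);
        timerBased.onTimer(100);
        System.out.println("timer: counts=" + timerBased.counts + " sums=" + timerBased.sums);

        // TTL approach: no timers, but the two states expire independently.
        TtlState<Long> count = new TtlState<>(100);
        TtlState<Long> sum = new TtlState<>(100);
        count.put("a", 1L, 0);
        sum.put("a", 5L, 0);
        count.put("a", 2L, 60); // only the count state is refreshed
        System.out.println("ttl: count=" + count.get("a", 120) + " sum=" + sum.get("a", 120));
    }
}
```

In this model the timer variant clears both states for a key in one callback but holds one timer entry per live key, while the TTL variant holds no timers but lets the two states of the same key expire at different times. Operators moved to StateTtlConfig would presumably need to tolerate that kind of partial expiry.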



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
