[ 
https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346026#comment-17346026
 ] 

Andy commented on FLINK-21301:
------------------------------

Hi [~jark] Thanks for your reply.

I agree that things would be simpler with a fine-grained state TTL, where 
different operators can have different TTLs.

However, IMO, the allowed lateness of a window operator is related to state 
TTL, but it is not quite the same thing. If allowed lateness is enabled, the 
state TTL behavior of the window operator is affected. But should setting a 
state TTL affect the allowed-lateness behavior of the window operator? I mean 
that allowed lateness affects not only state TTL but also trigger behavior and 
emit behavior. It would be better to introduce a new configuration option or 
new syntax to handle allowed lateness.

What do you think?

> Decouple window aggregate allow lateness with state ttl configuration
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21301
>                 URL: https://issues.apache.org/jira/browse/FLINK-21301
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>            Reporter: Andy
>            Priority: Major
>              Labels: auto-unassigned
>             Fix For: 1.14.0
>
>
> Currently, the state retention time config also affects the state cleanup 
> behavior of window aggregate, which is unexpected for most users.
> E.g., in the following example, a user would set `MinIdleStateRetentionTime` 
> to 1 day to clean up state in `deduplicate`. However, it also affects the 
> cleanup behavior of the window aggregate. For example, the state for 
> 2021-01-04 data would be cleaned up at 2021-01-06 instead of 2021-01-05. 
> {code:sql}
> SELECT
>  DATE_FORMAT(tumble_end(ROWTIME ,interval '1' DAY),'yyyy-MM-dd') as stat_time,
>  count(1) first_phone_num
> FROM (
>  SELECT 
>  ROWTIME,
>  user_id,
>  row_number() over(partition by user_id, pdate order by ROWTIME ) as rn
>  FROM source_kafka_biz_shuidi_sdb_crm_call_record 
> ) cal 
> where rn =1
> group by tumble(ROWTIME,interval '1' DAY);{code}
> It would be better to decouple window aggregate allowed lateness from 
> `MinIdleStateRetentionTime`.
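
The date arithmetic in the description can be illustrated with a small standalone sketch. It does not use any Flink API. It assumes that the window state is cleaned up at window end plus allowed lateness, and that the coupling described above amounts to the 1-day retention time being used as the allowed lateness. The helper names `tumble_window_end` and `cleanup_time` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def tumble_window_end(event_time, size):
    """End of the epoch-aligned tumbling window that contains event_time."""
    elapsed = event_time - EPOCH
    # Floor to the window start, then step one full window forward.
    return EPOCH + (elapsed // size + 1) * size


def cleanup_time(event_time, size, allowed_lateness):
    """Assumed cleanup rule: window end plus allowed lateness."""
    return tumble_window_end(event_time, size) + allowed_lateness


one_day = timedelta(days=1)
event = datetime(2021, 1, 4, 13, 30)  # any record from 2021-01-04

# What most users expect: no lateness, state dropped when the window closes.
expected = cleanup_time(event, one_day, timedelta(0))

# Assumed coupled behavior: the 1-day retention acts as allowed lateness.
coupled = cleanup_time(event, one_day, one_day)

print(expected.date(), coupled.date())  # 2021-01-05 2021-01-06
```

Under these assumptions the 1-day tumbling window for 2021-01-04 closes at 2021-01-05 00:00, and the extra day of retention moves the cleanup to 2021-01-06, which matches the dates given in the description.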



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
