[
https://issues.apache.org/jira/browse/FLINK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003433#comment-17003433
]
Jark Wu commented on FLINK-15393:
---------------------------------
Hi [~hailong wang], I think we should use {{ReturnExpiredIfNotCleanedUp}} here.
State TTL in a regular join exists to keep the state size bounded, at the cost
of some accuracy. The TTL is not part of the query semantics: the more state we
clean up, the less accurate the result becomes. That means we should still join
expired data if it has not been cleaned up yet.
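
To make the difference between the two visibility settings concrete, here is a minimal, self-contained sketch. It is a toy simulation and not Flink code: {{TtlVisibilitySketch}}, {{TtlMap}}, {{cleanUp}} and the explicit {{now}} parameter are hypothetical names introduced only for illustration, and the expiry rule (age >= TTL) is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of TTL state visibility. NOT Flink's implementation; all names are hypothetical.
public class TtlVisibilitySketch {

    enum Visibility { RETURN_EXPIRED_IF_NOT_CLEANED_UP, NEVER_RETURN_EXPIRED }

    // A stored value together with the time it was last written.
    static final class Entry {
        final String value;
        final long writtenAt;

        Entry(String value, long writtenAt) {
            this.value = value;
            this.writtenAt = writtenAt;
        }
    }

    static final class TtlMap {
        private final Map<String, Entry> entries = new HashMap<>();
        private final long ttlMillis;
        private final Visibility visibility;

        TtlMap(long ttlMillis, Visibility visibility) {
            this.ttlMillis = ttlMillis;
            this.visibility = visibility;
        }

        // Writing (re)starts the TTL, similar in spirit to an on-create-and-write update type.
        void put(String key, String value, long now) {
            entries.put(key, new Entry(value, now));
        }

        // Reads never delete anything; they only decide whether an expired entry is visible.
        Optional<String> get(String key, long now) {
            Entry e = entries.get(key);
            if (e == null) {
                return Optional.empty();
            }
            boolean expired = now - e.writtenAt >= ttlMillis;
            if (expired && visibility == Visibility.NEVER_RETURN_EXPIRED) {
                return Optional.empty();
            }
            return Optional.of(e.value);
        }

        // Stands in for background cleanup (e.g. a compaction filter) that runs at some later time.
        void cleanUp(long now) {
            entries.values().removeIf(e -> now - e.writtenAt >= ttlMillis);
        }
    }

    public static void main(String[] args) {
        TtlMap lenient = new TtlMap(1000L, Visibility.RETURN_EXPIRED_IF_NOT_CLEANED_UP);
        TtlMap strict = new TtlMap(1000L, Visibility.NEVER_RETURN_EXPIRED);
        lenient.put("k", "left-row", 0L);
        strict.put("k", "left-row", 0L);

        // At t=5000 the entry is expired but cleanup has not run yet.
        System.out.println(lenient.get("k", 5000L)); // still visible, so it can still be joined
        System.out.println(strict.get("k", 5000L));  // hidden although it is physically present

        // Once cleanup has run, both settings behave the same.
        lenient.cleanUp(5000L);
        System.out.println(lenient.get("k", 5000L));
    }
}
```

In this model the lenient setting keeps joining a row until cleanup physically removes it, which matches the argument above: TTL only bounds state size, so hiding rows earlier than necessary just loses more results.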
As for temporal join, it can clean up its state on time, so it doesn't need
state TTL (the community has not yet agreed on how to limit the size of
temporal table state).
> Change 'ReturnExpiredIfNotCleanedUp' to 'NeverReturnExpired' in
> JoinRecordStateViews#createTtlConfig
> ------------------------------------------------------------------------------------------------------
>
> Key: FLINK-15393
> URL: https://issues.apache.org/jira/browse/FLINK-15393
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Runtime
> Affects Versions: 1.10.0
> Reporter: hailong wang
> Priority: Major
> Fix For: 1.11.0
>
>
> In the Blink planner, regular joins use TTL state and rely on RocksDB to clean
> up expired data. The StateTtlConfig is:
> {code:java}
> StateTtlConfig
>     .newBuilder(Time.milliseconds(retentionTime))
>     .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
>     .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
>     .build();
> {code}
> I think we should use StateTtlConfig.StateVisibility.NeverReturnExpired to
> avoid joining expired data.
> BTW, only regular joins use the RocksDB filter to clean up expired data. Should
> we replace timers with the RocksDB filter in other SQL operators such as
> TemporalTableJoin?
>