[ 
https://issues.apache.org/jira/browse/BEAM-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959226#comment-15959226
 ] 

Jingsong Lee commented on BEAM-1723:
------------------------------------

I see {{CachedIdDeduplicator}} in the direct runner. It uses a {{LoadingCache}}
to deduplicate. The {{expireAfterAccess}} is 10 minutes and the {{maximumSize}}
is 100_000. Do these two values need to be parameterized?

Do these caches need to be snapshotted in the Flink runner (for fault tolerance)?
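
To make the parameterization question concrete, here is a rough sketch of a
deduplicator whose expiry and size bound are constructor arguments. It is not
the direct runner's {{CachedIdDeduplicator}} and it does not use Guava; the
class name {{BoundedIdDeduplicator}}, the method {{isFirstSeen}}, and the
explicit {{nowMillis}} argument are all hypothetical, and the sketch assumes
timestamps are passed in non-decreasing order.

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; not Beam's CachedIdDeduplicator. */
class BoundedIdDeduplicator {
  private final long expireAfterAccessMillis;
  // Access-ordered map: the least recently accessed id is always at the head.
  private final LinkedHashMap<String, Long> lastAccess;

  BoundedIdDeduplicator(Duration expireAfterAccess, final int maximumSize) {
    this.expireAfterAccessMillis = expireAfterAccess.toMillis();
    this.lastAccess = new LinkedHashMap<String, Long>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        // Enforce the size bound by dropping the least recently accessed id.
        return size() > maximumSize;
      }
    };
  }

  /**
   * Returns true if the id has not been seen within the expiry window
   * (and has not been evicted by the size bound), false for a duplicate.
   * Assumes nowMillis never decreases between calls.
   */
  boolean isFirstSeen(String id, long nowMillis) {
    // Drop expired ids from the head; stop at the first one still live.
    Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMillis - entry.getValue() < expireAfterAccessMillis) {
        break;
      }
      it.remove();
    }
    // put() refreshes the access time and returns the previous value, if any.
    return lastAccess.put(id, nowMillis) == null;
  }
}
```

With something like this, the two values could come from pipeline options
instead of constants. For fault tolerance, the contents of the map are exactly
the state that would have to be included in a snapshot; otherwise ids seen
before a restore would be treated as new.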

> FlinkRunner should deduplicate when an UnboundedSource requires Deduping
> ------------------------------------------------------------------------
>
>                 Key: BEAM-1723
>                 URL: https://issues.apache.org/jira/browse/BEAM-1723
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Thomas Groh
>
> UnboundedSource implementations can require deduping, and the FlinkRunner 
> currently logs a warning that this is not supported.
> https://github.com/apache/beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/UnboundedSourceWrapper.java#L139


