[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064561#comment-16064561
 ] 

Aljoscha Krettek commented on BEAM-2140:
----------------------------------------

To quickly summarise what's happening: The input watermark of the ProcessFn 
goes to {{+Inf}}, the processing-time timer is dropped as late and we don't 
process some data because of this. 

 * [~jkff] is operating under the assumption that a watermark hold holds the 
input watermark and therefore should keep the ProcessFn from being shut down 
until the processing-time timer fires
 * [~kenn] is operating under the assumption that a watermark holds the output 
watermark. An incoming +Inf watermark therefore causes an operation to shutdown 
(because future timers and events can be considered late and dropped). 

The splittable DoFn implementation in Flink (seemingly) works, except in this 
edge case. It seems we have to re-consider what is considered late and how 
watermark holds work to resolve this. What do you think?

The proposal of registering timers with an output watermark hold is essentially 
what the ProcessFn is doing manually, i.e. it's registering a timer and adding 
a watermark hold.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
>                 Key: BEAM-2140
>                 URL: https://issues.apache.org/jira/browse/BEAM-2140
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to