[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065626#comment-16065626
 ] 

Eugene Kirpichov commented on BEAM-2140:
----------------------------------------

_we can't advance the watermark as though it was non-splittable in the 
unbounded case_ - why is that / why is it a bad thing that the watermark of the 
PCollection being fed into the SDF would not advance? E.g. imagine it's a 
Create.of(pubsub topic name) + ParDo(read pubsub forever) - is it important to 
advance the watermark of the Create.of()?

Alternatively, imagine it's: read filepatterns from pubsub + 
TextIO.readAll().watchForNewFiles().watchFilesForNewEntries(), which has 
several SDFs in this. Would there be a problem with advancing the watermark of 
the PCollection of filepatterns only after the watch termination conditions of 
TextIO.readAll() are hit and this filepattern is no longer watched?

Alternatively - worst case I guess: read Pubsub topic names from Kafka, and 
read each topic forever. I'd assume that the user would be interested in 
advancement of the watermark of the PCollection of pubsub records rather than 
the PCollection of Pubsub topic names? I'm not sure the Pubsub topic names in 
Kafka would even need to have meaningful timestamps (rather than infinite past).

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
>                 Key: BEAM-2140
>                 URL: https://issues.apache.org/jira/browse/BEAM-2140
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to