[
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065626#comment-16065626
]
Eugene Kirpichov commented on BEAM-2140:
----------------------------------------
_we can't advance the watermark as though it was non-splittable in the
unbounded case_ - why is that / why is it a bad thing that the watermark of the
PCollection being fed into the SDF would not advance? E.g. imagine it's a
Create.of(pubsub topic name) + ParDo(read pubsub forever) - is it important to
advance the watermark of the Create.of()?
Alternatively, imagine it's: read filepatterns from pubsub +
TextIO.readAll().watchForNewFiles().watchFilesForNewEntries(), which has
several SDFs in this. Would there be a problem with advancing the watermark of
the PCollection of filepatterns only after the watch termination conditions of
TextIO.readAll() are hit and this filepattern is no longer watched?
Alternatively - worst case I guess: read Pubsub topic names from Kafka, and
read each topic forever. I'd assume that the user would be interested in
advancement of the watermark of the PCollection of pubsub records rather than
the PCollection of Pubsub topic names? I'm not sure the Pubsub topic names in
Kafka would even need to have meaningful timestamps (rather than infinite past).
> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled
> the tests to unblock the open PR for BEAM-1763.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)