[ 
https://issues.apache.org/jira/browse/BEAM-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890915#comment-15890915
 ] 

Eugene Kirpichov commented on BEAM-849:
---------------------------------------

Sorry, I don't understand the question. This use case - representing reading 
from a file, including a continually growing file, as a file-reading splittable 
ParDo applied to a PCollection of filenames - is one of the main motivating use 
cases of splittable DoFn https://s.apache.org/splittable-do-fn ; all I'm saying 
here is that, even if the file eventually stops growing forever and we would 
like the pipeline to terminate in that case, this makes sense regardless of 
whether the runner is a "batch" or a "streaming" runner. And a user may want to 
choose to run this pipeline using their favorite runner in "streaming" mode for 
various reasons; one being that a "streaming" runner might process the data 
from the file immediately as it appears, whereas a "batch" runner might think 
it's ok to wait for the whole file to appear and then process it. This is of 
course runner-specific; my main point is that there is no such thing as 
"unbounded pipelines" in the Beam model, and semantics of waitForFinish() 
should be defined in terms of the Beam model and treated equally by all runners 
regardless of whether the runner has a distinction between "batch/streaming" 
modes.

> Redesign PipelineResult API
> ---------------------------
>
>                 Key: BEAM-849
>                 URL: https://issues.apache.org/jira/browse/BEAM-849
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Pei He
>
> Current state: 
> Jira https://issues.apache.org/jira/browse/BEAM-443 addresses 
> waitUntilFinish() and cancel(). 
> However, there are additional work around PipelineResult: 
> need clearly defined contract and verification across all runners 
> need to revisit how to handle metrics/aggregators 
> need to be able to get logs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to