[jira] [Commented] (BEAM-849) Redesign PipelineResult API

Etienne Chauchot (JIRA) Wed, 10 May 2017 01:22:35 -0700

    [ 
https://issues.apache.org/jira/browse/BEAM-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004274#comment-16004274
 ]


Etienne Chauchot commented on BEAM-849:
---------------------------------------

 Here are the differences I have observed in streaming pipelines termination:

- direct runner: when the output watermarks of all of its PCollections progress 
to +infinity

- apex runner: when the output watermarks of all of its PCollections progress 
to +infinity

- dataflow runner: when the output watermarks of all of its PCollections 
progress to +infinity

- spark runner: streaming pipelines do not terminate unless timeout is set in 
pipelineResult.waitUntilFinish()

- flink runner: streaming pipelines do not terminate unless timeout is set in 
pipelineResult.waitUntilFinish() (thanks to Aljoscha for timeout support PR 
https://github.com/apache/beam/pull/2915#pullrequestreview-37090326)


=> Is the direct/apex/dataflow behavior the correct "beam model" behavior?


I know that, at least for spark (mails in this thread), there is no easy way to 
know that we're done reading a source, so it might be very difficult (at least 
for this runner) to unify toward +infinity behavior if it is chosen as the 
standard behavior.


> Redesign PipelineResult API
> ---------------------------
>
>                 Key: BEAM-849
>                 URL: https://issues.apache.org/jira/browse/BEAM-849
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Pei He
>
> Current state: 
> Jira https://issues.apache.org/jira/browse/BEAM-443 addresses 
> waitUntilFinish() and cancel(). 
> However, there are additional work around PipelineResult: 
> need clearly defined contract and verification across all runners 
> need to revisit how to handle metrics/aggregators 
> need to be able to get logs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (BEAM-849) Redesign PipelineResult API

Reply via email to