[ 
https://issues.apache.org/jira/browse/BEAM-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409032#comment-17409032
 ] 

Ning commented on BEAM-10708:
-----------------------------

Thanks for asking. Switching to "FnApiRunner" makes it feasible to deprecate 
the runner API roundtrip.

Our roundtrip *does nothing but make copies of pipelines*. 
{color:#505F79}It's useful (to avoid corrupting the __main__ module in the 
REPL env) in the interactive scenario where the user 
applies-transforms-then-inspects-output one by one, but not needed for 
scenarios where the user creates-pipelines-then-executes one by one (or any 
other non-interactive use case). Deep copying pipelines is not a feature 
because it's not needed.{color}

To deprecate it, instead of making copies of pipelines, we make copies of 
runner API protos.
Theoretically, the runner API is the SDK-and-runner-independent definition of 
a Beam pipeline. Every runner implementation should be able to accept such a 
proto for execution.
For DirectRunner, the right approach is to use its FnApiRunner implementation, 
which accepts a runner API proto through "run_via_runner_api".
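
As a rough illustration of the difference, here is a toy model, not Beam code. Plain dicts stand in for runner API protos, and every name in it (KNOWN_PYTHON_URNS, from_proto, run_via_proto, the "toy:" URNs) is made up for this sketch. It only shows why copying the proto sidesteps the step that fails when part of the pipeline is not written in Python.

```python
import copy

# Toy model only: plain dicts stand in for runner API protos, and every
# name here (KNOWN_PYTHON_URNS, from_proto, run_via_proto) is hypothetical.
KNOWN_PYTHON_URNS = {"toy:create", "toy:map"}


def from_proto(proto):
    """The 'read back' half of a roundtrip: rebuild SDK-side objects."""
    rebuilt = []
    for transform in proto["transforms"]:
        if transform["urn"] not in KNOWN_PYTHON_URNS:
            # A transform defined in another SDK cannot be rebuilt in Python.
            raise ValueError("cannot rebuild transform: " + transform["urn"])
        rebuilt.append(transform["urn"])
    return rebuilt


def run_via_proto(proto):
    """Toy runner that consumes the proto directly, without rebuilding."""
    return [transform["urn"] for transform in proto["transforms"]]


original = {"transforms": [{"urn": "toy:create"}, {"urn": "toy:external:sql"}]}

# Copy the proto instead of roundtripping it through SDK objects.
proto_copy = copy.deepcopy(original)
proto_copy["transforms"].append({"urn": "toy:map"})  # original is untouched

# The roundtrip fails on the non-Python transform ...
try:
    from_proto(original)
    roundtrip_ok = True
except ValueError:
    roundtrip_ok = False

# ... while a runner that accepts the proto can still execute the copy.
executed = run_via_proto(proto_copy)
```

In this sketch the copy can be modified and executed without ever reconstructing Python-side objects, which is the property we want from copying runner API protos.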

I hope DataflowRunner could also support "run_via_runner_api" to truly support 
the mixing and matching of SDKs and runners envisioned 
[here|https://docs.google.com/document/d/1XYzb1Fnt2sam7u2MsGFaZp-2qSIGxUn66VLer-bcXAk/edit#heading=h.p6lvszfbmyj6].
But to productionize a pipeline from notebooks to Dataflow, we could have 
other workarounds. Making copies of pipelines is not needed.

> InteractiveRunner cannot execute pipeline with cross-language transform
> -----------------------------------------------------------------------
>
>                 Key: BEAM-10708
>                 URL: https://issues.apache.org/jira/browse/BEAM-10708
>             Project: Beam
>          Issue Type: Bug
>          Components: cross-language
>            Reporter: Brian Hulette
>            Assignee: Ning
>            Priority: P2
>          Time Spent: 30h 50m
>  Remaining Estimate: 0h
>
> The InteractiveRunner crashes when given a pipeline that includes a 
> cross-language transform.
> Here's the example I tried to run in a jupyter notebook:
> {code:python}
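> import apache_beam as beam
> from apache_beam.runners.interactive import interactive_beam
> from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
> from apache_beam.transforms.sql import SqlTransform
>
> p = beam.Pipeline(InteractiveRunner())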
> pc = (p | SqlTransform("""SELECT
>             CAST(1 AS INT) AS `id`,
>             CAST('foo' AS VARCHAR) AS `str`,
>             CAST(3.14  AS DOUBLE) AS `flt`"""))
> df = interactive_beam.collect(pc)
> {code}
> The problem occurs when 
> [pipeline_fragment.py|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L66]
>  creates a copy of the pipeline by [writing it to proto and reading it 
> back|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L120].
>  Reading it back fails because some of the pipeline is not written in Python.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
