Yes, for sure. Support for this is already available in some runners (like the Python Universal Local Runner and Flink) and is actively being added to others (e.g. Dataflow). There are still some rough edges, however: currently one must run a separate expansion service to define a pipeline step in an alternative environment (e.g. by registering your transforms and running something like https://github.com/apache/beam/blob/release-2.14.0/sdks/python/apache_beam/runners/portability/expansion_service_test.py). We'd like to make this process a lot smoother, and feedback would be appreciated.
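
To make that concrete, here is a rough, untested sketch against the 2.14
Python SDK, modeled on the test file linked above. The URN, port, and
transform are invented for illustration, and exact signatures (notably
ExternalTransform's and from_runner_api_parameter's) have shifted between
releases, so treat this as a sketch rather than a recipe:

    # Expansion service side: register a transform under a URN and serve
    # expansion requests over gRPC.
    import time
    from concurrent import futures

    import grpc

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.portability.api import beam_expansion_api_pb2_grpc
    from apache_beam.runners.portability import expansion_service
    from apache_beam.transforms import ptransform

    TO_UPPER_URN = 'my:custom:to_upper'  # made-up URN for illustration

    @ptransform.PTransform.register_urn(TO_UPPER_URN, None)
    class ToUpperTransform(ptransform.PTransform):
      def expand(self, pcoll):
        # Expanded in this service's environment, so it can rely on
        # dependencies the submitting environment doesn't have.
        return pcoll | beam.Map(lambda s: s.upper())

      def to_runner_api_parameter(self, unused_context):
        return TO_UPPER_URN, None

      @staticmethod
      def from_runner_api_parameter(unused_parameter, unused_context):
        # Two-argument form as of 2.14; later SDKs changed this signature.
        return ToUpperTransform()

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=2))
    beam_expansion_api_pb2_grpc.add_ExpansionServiceServicer_to_server(
        expansion_service.ExpansionServiceServicer(PipelineOptions()),
        server)
    server.add_insecure_port('localhost:8097')
    server.start()
    # Keep the service up while pipelines expand against it.
    try:
      while True:
        time.sleep(3600)
    except KeyboardInterrupt:
      server.stop(0)

The pipeline then references the step by URN and expansion service address:

    from apache_beam.transforms.external import ExternalTransform

    with beam.Pipeline() as p:
      (p
       | beam.Create(['a', 'b'])
       | ExternalTransform('my:custom:to_upper', None, 'localhost:8097')
       | beam.Map(print))

The register-and-serve half is the part we'd like to fold away, so that
pipeline authors only deal with the ExternalTransform (or something
higher-level still).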
On Sat, Jul 20, 2019 at 7:57 PM Chad Dombrova <[email protected]> wrote:
>
> Hi all,
> I'm interested to know if others on the list would find value in the
> ability to use multiple environments (e.g. docker images) within a single
> pipeline, using some mechanism to identify the environment(s) that a
> transform should use. It would be quite useful for us, since our
> transforms can have conflicting python requirements, or worse, conflicting
> interpreter requirements. Currently to solve this we have to break the
> pipeline up into multiple pipelines and use pubsub to communicate between
> them, which is not ideal.
>
> -chad
