Yes, for sure. Support for this is already available in some runners (like the Python Universal Local Runner and Flink) and is actively being added to others (e.g. Dataflow). There are still some rough edges, however: currently one must run a separate expansion service to define a pipeline step in an alternative environment (e.g. by registering your transforms and running something like https://github.com/apache/beam/blob/release-2.14.0/sdks/python/apache_beam/runners/portability/expansion_service_test.py). We'd like to make this process a lot smoother, and feedback would be appreciated.
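
To make that concrete, here is a rough, untested sketch against the 2.14
Python SDK, modeled on the test file linked above. The URN, port, and
transform are invented for illustration, and exact signatures (notably
ExternalTransform's and from_runner_api_parameter's) have shifted between
releases, so treat this as a sketch rather than a recipe:

    # Expansion service side: register a transform under a URN and serve
    # expansion requests over gRPC.
    import time
    from concurrent import futures

    import grpc

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.portability.api import beam_expansion_api_pb2_grpc
    from apache_beam.runners.portability import expansion_service
    from apache_beam.transforms import ptransform

    TO_UPPER_URN = 'my:custom:to_upper'  # made-up URN for illustration

    @ptransform.PTransform.register_urn(TO_UPPER_URN, None)
    class ToUpperTransform(ptransform.PTransform):
      def expand(self, pcoll):
        # Expanded in this service's environment, so it can rely on
        # dependencies the submitting environment doesn't have.
        return pcoll | beam.Map(lambda s: s.upper())

      def to_runner_api_parameter(self, unused_context):
        return TO_UPPER_URN, None

      @staticmethod
      def from_runner_api_parameter(unused_parameter, unused_context):
        # Two-argument form as of 2.14; later SDKs changed this signature.
        return ToUpperTransform()

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=2))
    beam_expansion_api_pb2_grpc.add_ExpansionServiceServicer_to_server(
        expansion_service.ExpansionServiceServicer(PipelineOptions()),
        server)
    server.add_insecure_port('localhost:8097')
    server.start()
    # Keep the service up while pipelines expand against it.
    try:
      while True:
        time.sleep(3600)
    except KeyboardInterrupt:
      server.stop(0)

The pipeline then references the step by URN and expansion service address:

    from apache_beam.transforms.external import ExternalTransform

    with beam.Pipeline() as p:
      (p
       | beam.Create(['a', 'b'])
       | ExternalTransform('my:custom:to_upper', None, 'localhost:8097')
       | beam.Map(print))

The register-and-serve half is the part we'd like to fold away, so that
pipeline authors only deal with the ExternalTransform (or something
higher-level still).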
On Sat, Jul 20, 2019 at 7:57 PM Chad Dombrova <[email protected]> wrote:
>
> Hi all,
> I'm interested to know if others on the list would find value in the
> ability to use multiple environments (e.g. docker images) within a single
> pipeline, using some mechanism to identify the environment(s) that a
> transform should use. It would be quite useful for us, since our
> transforms can have conflicting python requirements, or worse, conflicting
> interpreter requirements. Currently to solve this we have to break the
> pipeline up into multiple pipelines and use pubsub to communicate between
> them, which is not ideal.
>
> -chad
