Seems like a good place to promote this PR that adds documentation for
cross-language transforms :)
https://github.com/apache/beam/pull/13317

This covers the following for both the Java and Python SDKs.
* Creating new cross-language transforms - primary audience will be
transform authors who wish to make existing Java/Python transforms
available to other SDKs.
* Using cross-language transforms - primary audience will be pipeline
authors who wish to use existing cross-language transforms with or without
language-specific wrappers.

Also, this introduces the term "Multi-Language Pipelines" to denote
pipelines that use cross-language transforms (and hence utilize more than
one SDK language).
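
As a rough illustration of what such a multi-language pipeline can look
like, here is a minimal, untested sketch of a Python pipeline that applies
Java's SqlTransform through the Python wrapper in apache_beam.transforms.sql.
The sample input, the query, and the helper name build_and_run are made up
for illustration. It assumes apache_beam is installed, a Java runtime is
available so the default expansion service can start, and the runner you
pick supports cross-language transforms.

```python
# Hypothetical query for illustration; PCOLLECTION names the single input.
QUERY = """
    SELECT word, COUNT(*) AS word_count
    FROM PCOLLECTION
    GROUP BY word
"""


def build_and_run(words):
    """Count words with Java's SqlTransform from a Python pipeline."""
    # Imported here so this module can be loaded without Beam installed.
    import apache_beam as beam
    from apache_beam.transforms.sql import SqlTransform

    with beam.Pipeline() as p:
        _ = (
            p
            | "Create" >> beam.Create(words)
            # SqlTransform needs schema'd input, so wrap each word in a Row.
            | "ToRow" >> beam.Map(lambda w: beam.Row(word=w))
            # External transform: expanded by a Java expansion service.
            | "Sql" >> SqlTransform(QUERY)
            | "Format" >> beam.Map(lambda r: f"{r.word}: {r.word_count}")
            | "Print" >> beam.Map(print)
        )
```

Calling build_and_run(["beam", "java", "python", "beam"]) would build and
run the pipeline. Everything except the "Sql" step is ordinary Python SDK
code; only that one step is handed off to Java.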

Thanks +Dave Wrede <[email protected]> for working on this.

- Cham

On Thu, Nov 12, 2020 at 4:56 AM Ismaël Mejía <[email protected]> wrote:

> I was not aware of these examples, Brian. Thanks for sharing. Maybe we
> should make these examples more discoverable on the website or as part of
> Beam's programming guide.
>
> It would be nice to have an example of the opposite too, calling a Python
> transform from Java.
>
> Additionally, Java users who want to integrate Python might be lost because
> External is NOT part of Beam's Java SDK (the transform is hidden inside
> a different module, core-construction-java), so it does not even appear in
> the website SDK javadoc.
> https://issues.apache.org/jira/browse/BEAM-8546
>
>
> On Wed, Nov 11, 2020 at 8:41 PM Brian Hulette <[email protected]> wrote:
> >
> > Hi Ke,
> >
> > A cross-language pipeline looks a lot like a pipeline written natively
> > in one of the Beam SDKs. The difference is that some of the transforms
> > in the pipeline may be "external transforms" that actually have
> > implementations in a different language. There are a few examples in
> > the Beam repo that use Java transforms from Python pipelines:
> > - kafkataxi [1]: Uses Java's KafkaIO from Python
> > - wordcount_xlang_sql [2] and sql_taxi [3]: Use Java's SqlTransform from
> > Python
> >
> > To create your own cross-language pipeline, you'll need to decide which
> > SDK you want to use primarily, and then create an expansion service to
> > expose the transforms you want to use from the other SDK (if one doesn't
> > exist already).
> >
> > [1] https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/kafkataxi
> > [2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang_sql.py
> > [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/sql_taxi.py
> >
> > On Wed, Nov 11, 2020 at 11:07 AM Ke Wu <[email protected]> wrote:
> >>
> >> Hello,
> >>
> >> Is there an example demonstrating what a cross-language pipeline looks
> >> like? E.g., a pipeline that is composed of Java and Python
> >> code/transforms.
> >>
> >> Best,
> >> Ke
>
