This was mentioned in a separate thread, but I thought it would be good to highlight it here in case more folks wish to take a look before the PR is merged.
PR is https://github.com/apache/beam/pull/13317

Thanks,
Cham

On Thu, Nov 12, 2020 at 1:17 PM Chamikara Jayalath <[email protected]> wrote:

> Seems like a good place to promote this PR that adds documentation for
> cross-language transforms :)
> https://github.com/apache/beam/pull/13317
>
> This covers the following for both Java and Python SDKs.
> * Creating new cross-language transforms - primary audience will be
>   transform authors who wish to make existing Java/Python transforms
>   available to other SDKs.
> * Using cross-language transforms - primary audience will be pipeline
>   authors that wish to use existing cross-language transforms with or
>   without language specific wrappers.
>
> Also this introduces the term "Multi-Language Pipelines" to denote
> pipelines that use cross-language transforms (and hence utilize more than
> one SDK language).
>
> Thanks +Dave Wrede <[email protected]> for working on this.
>
> - Cham
>
> On Thu, Nov 12, 2020 at 4:56 AM Ismaël Mejía <[email protected]> wrote:
>
>> I was not aware of these examples Brian, thanks for sharing. Maybe we
>> should make these examples more discoverable on the website or as part
>> of Beam's programming guide.
>>
>> It would be nice to have an example of the opposite too, calling a
>> Python transform from Java.
>>
>> Additionally Java users who want to integrate python might be lost
>> because External is NOT part of Beam's Java SDK (the transform is hidden
>> inside of a different module core-construction-java), so it does not
>> even appear in the website SDK javadoc.
>> https://issues.apache.org/jira/browse/BEAM-8546
>>
>>
>> On Wed, Nov 11, 2020 at 8:41 PM Brian Hulette <[email protected]> wrote:
>> >
>> > Hi Ke,
>> >
>> > A cross-language pipeline looks a lot like a pipeline written natively
>> > in one of the Beam SDKs; the difference is that some of the transforms
>> > in the pipeline may be "external transforms" that actually have
>> > implementations in a different language. There are a few examples in
>> > the beam repo that use Java transforms from Python pipelines:
>> > - kafkataxi [1]: Uses Java's KafkaIO from Python
>> > - wordcount_xlang_sql [2] and sql_taxi [3]: Use Java's SqlTransform
>> >   from Python
>> >
>> > To create your own cross-language pipeline, you'll need to decide
>> > which SDK you want to use primarily, and then create an expansion
>> > service to expose the transforms you want to use from the other SDK
>> > (if one doesn't exist already).
>> >
>> > [1] https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/kafkataxi
>> > [2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang_sql.py
>> > [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/sql_taxi.py
>> >
>> > On Wed, Nov 11, 2020 at 11:07 AM Ke Wu <[email protected]> wrote:
>> >>
>> >> Hello,
>> >>
>> >> Is there an example demonstrating what a cross-language pipeline
>> >> looks like? e.g. a pipeline that is composed of Java and Python
>> >> code/transforms.
>> >>
>> >> Best,
>> >> Ke
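
For anyone skimming the thread, here is a rough sketch of the kind of pipeline Brian describes: a Python pipeline in which one step is Java's SqlTransform, used as an external transform. It is loosely modeled on the wordcount_xlang_sql example linked in the thread and is not taken from it verbatim. The names WordRow, QUERY and run are made up for illustration, and the sketch assumes the Python SDK's apache_beam.transforms.sql.SqlTransform wrapper, which relies on a Java expansion service behind the scenes.

```python
import typing

# SQL run by Java's SqlTransform; PCOLLECTION refers to the single input.
QUERY = """
    SELECT word AS key, COUNT(*) AS `count`
    FROM PCOLLECTION
    GROUP BY word
"""


class WordRow(typing.NamedTuple):
    """Hypothetical row type with a schema, so the Java side can read it."""
    word: str


def run(words, pipeline_args=None):
    # Imported here so this file can be loaded where Beam is not installed.
    import apache_beam as beam
    from apache_beam import coders
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms.sql import SqlTransform

    # RowCoder gives WordRow a language-neutral encoding.
    coders.registry.register_coder(WordRow, coders.RowCoder)

    with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
        _ = (
            p
            | "Create" >> beam.Create(words)
            | "ToRow" >> beam.Map(lambda w: WordRow(w)).with_output_types(WordRow)
            # The external transform: applied from Python, implemented in Java.
            | "CountWithSql" >> SqlTransform(QUERY)
            | "Format" >> beam.Map(lambda row: "{}: {}".format(row.key, row.count))
            | "Print" >> beam.Map(print)
        )
```

Apart from the SqlTransform step, this reads like any native Python pipeline, which is the point Brian makes. Actually calling run(["a", "b", "a"]) needs Beam installed, a Java runtime available, and a runner that supports cross-language transforms, so treat it as a starting point rather than a tested recipe.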
