We are using external transforms to get access to PubSubIO within python. It works well, but there is one major issue remaining to fix: we have to build a custom beam with a hack to add the PubSubIO java deps and fix up the coders. This affects KafkaIO as well. There's an issue here: https://issues.apache.org/jira/browse/BEAM-7870
I consider this to be the most pressing problem with external transforms right now. -chad On Wed, Feb 12, 2020 at 9:28 AM Chamikara Jayalath <[email protected]> wrote: > > > On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko <[email protected]> > wrote: > >> >> AFAIK, there's no official guide for cross-language pipelines. But there >>> are examples and test cases you can use as reference such as: >>> >>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py >>> >>> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIOExternalTest.java >>> >>> https://github.com/apache/beam/blob/master/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/ValidateRunnerXlangTest.java >>> >>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service_test.py >>> >> >> I'm trying to work with tech writers to add more documentation related to >> cross-language (in a few months). But any help related to documenting what >> we have now is greatly appreciated. >> >> >> That would be great since now the information is a bit scattered over >> different places. I’d be happy to help with any examples and their testing >> that I hope I’ll have after a while. >> > > Great. > > >> >> The runner and SDK supports are in working state I could say but not many >>> IOs expose their cross-language interface yet (you can easily write >>> cross-language configuration for any Python transforms by yourself though). >>> >> >> Should mention here the test suites for portable Flink and Spark Heejong >> added recently :) >> >> >> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_XVR_Flink/ >> >> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_XVR_Spark/ >> >> >> Nice! Looks like my question above about cross-language support in Spark >> runner was redundant. >> >> >> >>> >>> >>>> - Is the information here >>>> https://beam.apache.org/roadmap/connectors-multi-sdk/ up-to-date? Are >>>> there any other entry points you can recommend? >>>> >>> >>> I think it's up-to-date. >>> >> >> Mostly up to date. Testing status is more complete now and we are >> actively working on getting the dependences story correct and adding >> support for DataflowRunner. >> >> >> Are there any “umbrella" Jiras regarding cross-language support that I >> can track? >> > > I don't think we have an umbrella JIRA currently. I can create one and > mention it in the roadmap. >
