Ah, yes, registering a RowCoder seems like a fine solution here.
(Either that or have a wrapping PTransform that explicitly converts to
Rows or similar.)

On Wed, Feb 19, 2020 at 10:03 PM Chad Dombrova <[email protected]> wrote:
>
> The Java deps are only half of the problem. The other half is that PubsubIO 
> and KafkaIO are using classes that do not have a python equivalent and thus 
> no universal coder.  The solution discussed in the issue I linked above was 
> to use row coder registries in Java, to convert from these types to rows / 
> schemas.
>
> Any thoughts on that?
>
> -chad
>
>
> On Wed, Feb 19, 2020 at 6:00 PM Robert Bradshaw <[email protected]> wrote:
>>
>> Hopefully this should be resovled by
>> https://issues.apache.org/jira/browse/BEAM-9229
>>
>> On Wed, Feb 19, 2020 at 5:52 PM Chad Dombrova <[email protected]> wrote:
>> >
>> > We are using external transforms to get access to PubSubIO within python.  
>> > It works well, but there is one major issue remaining to fix:  we have to 
>> > build a custom beam with a hack to add the PubSubIO java deps and fix up 
>> > the coders.  This affects KafkaIO as well.  There's an issue here: 
>> > https://issues.apache.org/jira/browse/BEAM-7870
>> >
>> > I consider this to be the most pressing problem with external transforms 
>> > right now.
>> >
>> > -chad
>> >
>> >
>> >
>> > On Wed, Feb 12, 2020 at 9:28 AM Chamikara Jayalath <[email protected]> 
>> > wrote:
>> >>
>> >>
>> >>
>> >> On Wed, Feb 12, 2020 at 8:10 AM Alexey Romanenko 
>> >> <[email protected]> wrote:
>> >>>
>> >>>
>> >>>> AFAIK, there's no official guide for cross-language pipelines. But 
>> >>>> there are examples and test cases you can use as reference such as:
>> >>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_xlang.py
>> >>>> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIOExternalTest.java
>> >>>> https://github.com/apache/beam/blob/master/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/ValidateRunnerXlangTest.java
>> >>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/expansion_service_test.py
>> >>>
>> >>>
>> >>> I'm trying to work with tech writers to add more documentation related 
>> >>> to cross-language (in a few months). But any help related to documenting 
>> >>> what we have now is greatly appreciated.
>> >>>
>> >>>
>> >>> That would be great since now the information is a bit scattered over 
>> >>> different places. I’d be happy to help with any examples and their 
>> >>> testing that I hope I’ll have after a while.
>> >>
>> >>
>> >> Great.
>> >>
>> >>>
>> >>>
>> >>>> The runner and SDK supports are in working state I could say but not 
>> >>>> many IOs expose their cross-language interface yet (you can easily 
>> >>>> write cross-language configuration for any Python transforms by 
>> >>>> yourself though).
>> >>>
>> >>>
>> >>> Should mention here the test suites for portable Flink and Spark Heejong 
>> >>> added recently :)
>> >>>
>> >>> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_XVR_Flink/
>> >>> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_XVR_Spark/
>> >>>
>> >>>
>> >>> Nice! Looks like my question above about cross-language support in Spark 
>> >>> runner was redundant.
>> >>>
>> >>>
>> >>>>
>> >>>>
>> >>>>>
>> >>>>> - Is the information here 
>> >>>>> https://beam.apache.org/roadmap/connectors-multi-sdk/ up-to-date? Are 
>> >>>>> there any other entry points you can recommend?
>> >>>>
>> >>>>
>> >>>> I think it's up-to-date.
>> >>>
>> >>>
>> >>> Mostly up to date.  Testing status is more complete now and we are 
>> >>> actively working on getting the dependences story correct and adding 
>> >>> support for DataflowRunner.
>> >>>
>> >>>
>> >>> Are there any “umbrella" Jiras regarding cross-language support that I 
>> >>> can track?
>> >>
>> >>
>> >> I don't think we have an umbrella JIRA currently. I can create one and 
>> >> mention it in the roadmap.

Reply via email to