On Mon, Aug 19, 2019 at 5:48 PM Ahmet Altay <[email protected]> wrote:

> Adding a few folks who might help +Chamikara Jayalath
> <[email protected]> +Lukasz Cwik <[email protected]> +Maximilian Michels
> <[email protected]>
>
> On Sat, Aug 17, 2019 at 11:27 AM Chad Dombrova <[email protected]> wrote:
>
>> Hi all,
>> I've got a PR[1] <https://github.com/apache/beam/pull/9268> for adding
>> external transform support to PubsubIO so that it will work with python and
>> go pipelines on Flink, and I am *so* close, but I've run into questions
>> that the code cannot answer:  I need a human now.
>>
>> The brief summary is that everything is working as expected right up to
>> the point where the Flink portable translator is prepping the pipeline for
>> submission to Flink.  In this process it is replacing
>> the PubsubMessageWithAttributesCoder with a ByteArrayCoder, in an effort to
>> make it wire-compatible.  Naturally, the ByteArrayCoder fails to encode
>> the PubsubMessage that's passed to it.
>>
>> I don't understand why this replacement is necessary, since the next
>> transform in the chain is a java ParDo that seems like it should be fully
>> capable of using PubsubMessageWithAttributesCoder.
>>
>
Not too familiar with Flink, but have you tried using the PubSub source from
a pure Java Flink pipeline? I expect this to work but haven't tested it. If
that works, I agree that the coder replacement sounds strange.


>
>> The pipeline looks like this:
>>
>> --- java ---
>> Read<PBegin, PubsubMessage>
>> ParDo<PubsubMessage, PubsubMessage>
>> ParDo<PubsubMessage, byte[]>
>> --- python ---
>> ParDo<byte[], PubsubMessage>
>>
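For anyone following along, the byte[] handoff at the language boundary in
the diagram above can be sketched roughly as below. This is *not* Beam's
actual coder wire format, just a minimal illustration of the idea: the Java
side's ParDo<PubsubMessage, byte[]> serializes the message (payload plus
attributes) to plain bytes, and the Python side's ParDo<byte[],
PubsubMessage> reverses it. The encoding scheme (JSON with a base64 payload)
and the function names are hypothetical.

```python
import base64
import json


def encode_message(payload: bytes, attributes: dict) -> bytes:
    """Serialize a PubsubMessage-like value (payload + attributes) to bytes.

    Hypothetical scheme: JSON envelope with a base64-encoded payload, so the
    result is a single byte[] that any runner can pass across the language
    boundary without knowing about PubsubMessage.
    """
    return json.dumps({
        "payload": base64.b64encode(payload).decode("ascii"),
        "attributes": attributes,
    }).encode("utf-8")


def decode_message(data: bytes):
    """Inverse of encode_message: recover (payload, attributes) on the
    Python side of the boundary."""
    obj = json.loads(data.decode("utf-8"))
    return base64.b64decode(obj["payload"]), obj["attributes"]


# Round trip across the simulated language boundary:
payload, attrs = decode_message(encode_message(b"hello", {"key": "value"}))
```

The point being: if the portable runner insists on a byte-oriented coder at
the boundary, the explicit to-bytes/from-bytes ParDos make that safe, but it
is unclear why the coder on the purely-Java segment needs replacing at all.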
>> I'm really not sure what the right solution is.  My next course of action
>> is going to be to compare this to the KafkaIO behavior, since this is the
>> only other external Read that's supported.
>>
>
>> Any help is greatly appreciated!
>>
>> -chad
>>
>> [1] https://github.com/apache/beam/pull/9268
>>
>>