Adding a few folks who might help:
+Chamikara Jayalath <[email protected]>
+Lukasz Cwik <[email protected]>
+Maximilian Michels <[email protected]>

On Sat, Aug 17, 2019 at 11:27 AM Chad Dombrova <[email protected]> wrote:

> Hi all,
> I've got a PR[1] <https://github.com/apache/beam/pull/9268> for adding
> external transform support to PubsubIO so that it will work with python and
> go pipelines on Flink, and I am *so* close, but I've run into questions
> that the code cannot answer:  I need a human now.
>
> The brief summary is that everything is working as expected right up to
> the point where the Flink portable translator is prepping the pipeline for
> submission to Flink.  In this process it is replacing
> the PubsubMessageWithAttributesCoder with a ByteArrayCoder, in an effort to
> make it wire-compatible.  Naturally, the ByteArrayCoder fails to encode
> the PubsubMessage that's passed to it.
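> For illustration, here is a stripped-down simulation (plain Java stand-ins, not
> the real org.apache.beam.sdk classes) of how erased coder types let a
> PubsubMessage reach ByteArrayCoder.encode, which then blows up at runtime:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CoderMismatch {
    // Minimal, hypothetical stand-ins for the Beam types involved.
    interface Coder<T> {
        void encode(T value, OutputStream out) throws IOException;
    }

    static class ByteArrayCoder implements Coder<byte[]> {
        @Override
        public void encode(byte[] value, OutputStream out) throws IOException {
            out.write(value); // only knows how to write raw bytes
        }
    }

    static class PubsubMessage {
        final byte[] payload;
        PubsubMessage(byte[] payload) { this.payload = payload; }
    }

    static String tryEncode() throws IOException {
        // The translator manipulates coders with their element types erased,
        // so nothing stops a PubsubMessage from being handed to a ByteArrayCoder.
        @SuppressWarnings("unchecked")
        Coder<Object> coder = (Coder<Object>) (Coder<?>) new ByteArrayCoder();
        try {
            coder.encode(new PubsubMessage(new byte[] {1, 2, 3}), new ByteArrayOutputStream());
            return "encoded";
        } catch (ClassCastException e) {
            return "encode failed"; // PubsubMessage cannot be cast to byte[]
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryEncode()); // prints "encode failed"
    }
}
```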
>
> I don't understand why this replacement is necessary, since the next
> transform in the chain is a Java ParDo that seems like it should be fully
> capable of using PubsubMessageWithAttributesCoder.
>
> The pipeline looks like this:
>
> --- java ---
> Read<PBegin, PubsubMessage>
> ParDo<PubsubMessage, PubsubMessage>
> ParDo<PubsubMessage, byte[]>
> --- python ---
> ParDo<byte[], PubsubMessage>
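> One way to picture the hand-off: the final Java ParDo<PubsubMessage, byte[]>
> could flatten each message into a length-prefixed byte[], so that only byte[],
> a type every SDK's coder registry understands, crosses the language boundary.
> A hypothetical sketch (the toBytes helper and its framing are mine, not the
> PR's actual encoding):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

public class MessageToBytes {
    // Flatten payload + attributes into a single length-prefixed byte[].
    static byte[] toBytes(byte[] payload, Map<String, String> attributes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(payload.length);   // 4-byte payload length
        out.write(payload);             // raw payload bytes
        out.writeInt(attributes.size()); // 4-byte attribute count
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            out.writeUTF(e.getKey());   // 2-byte length + modified UTF-8
            out.writeUTF(e.getValue());
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> attrs = new TreeMap<>();
        attrs.put("origin", "test");
        byte[] wire = toBytes("hello".getBytes(StandardCharsets.UTF_8), attrs);
        // 4 + 5 (payload) + 4 (count) + 8 ("origin") + 6 ("test") = 27 bytes
        System.out.println(wire.length);
    }
}
```

> The Python ParDo<byte[], PubsubMessage> on the other side would then reverse
> the same framing.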
>
> I'm really not sure what the right solution is.  My next course of action
> is going to be to compare this to the KafkaIO behavior, since this is the
> only other external Read that's supported.
>
> Any help is greatly appreciated!
>
> -chad
>
> [1] https://github.com/apache/beam/pull/9268
>
>
