On Mon, Aug 19, 2019 at 5:48 PM Ahmet Altay <[email protected]> wrote:
> Adding a few folks who might help +Chamikara Jayalath
> <[email protected]> +Lukasz Cwik <[email protected]> +Maximilian Michels
> <[email protected]>
>
> On Sat, Aug 17, 2019 at 11:27 AM Chad Dombrova <[email protected]> wrote:
>
>> Hi all,
>> I've got a PR [1] for adding external transform support to PubsubIO so
>> that it will work with python and go pipelines on Flink, and I am *so*
>> close, but I've run into questions that the code cannot answer: I need a
>> human now.
>>
>> The brief summary is that everything is working as expected right up to
>> the point where the Flink portable translator is prepping the pipeline
>> for submission to Flink. In this process it is replacing
>> the PubsubMessageWithAttributesCoder with a ByteArrayCoder, in an effort
>> to make it wire-compatible. Naturally, the ByteArrayCoder fails to
>> encode the PubsubMessage that's passed to it.
>>
>> I don't understand why this replacement is necessary, since the next
>> transform in the chain is a java ParDo that seems like it should be
>> fully capable of using PubsubMessageWithAttributesCoder.
>>
> Not too familiar with Flink, but have you tried using the PubSub source
> from a pure Java Flink pipeline? I expect this to work but haven't tested
> it. If that works, I agree that the coder replacement sounds strange.
>
>> The pipeline looks like this:
>>
>> --- java ---
>> Read<PBegin, PubsubMessage>
>> ParDo<PubsubMessage, PubsubMessage>
>> ParDo<PubsubMessage, byte[]>
>> --- python ---
>> ParDo<byte[], PubsubMessage>
>>
>> I'm really not sure what the right solution is. My next course of action
>> is going to be to compare this to the KafkaIO behavior, since that is
>> the only other external Read that's supported.
>>
>> Any help is greatly appreciated!
>>
>> -chad
>>
>> [1] https://github.com/apache/beam/pull/9268
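
The byte[] hand-off in the pipeline above can be sketched in plain Python: the last Java ParDo flattens the message (payload plus attributes) to raw bytes so that only a byte-array coder is needed at the language boundary, and the first Python ParDo rebuilds it. Everything here is illustrative, not Beam's actual implementation: PubsubMessageStub is a hypothetical stand-in for the real PubsubMessage class, and the JSON-with-base64 encoding is just one possible wire format, not the coder Beam uses.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PubsubMessageStub:
    """Hypothetical stand-in for a Pub/Sub message: payload + string attributes."""
    data: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


def to_wire_bytes(msg: PubsubMessageStub) -> bytes:
    """Conceptually what the Java ParDo<PubsubMessage, byte[]> does:
    flatten the message into bytes so a plain byte-array coder suffices."""
    return json.dumps({
        "data": base64.b64encode(msg.data).decode("ascii"),
        "attributes": msg.attributes,
    }).encode("utf-8")


def from_wire_bytes(raw: bytes) -> PubsubMessageStub:
    """Conceptually what the Python ParDo<byte[], PubsubMessage> does:
    rebuild the message on the other side of the language boundary."""
    obj = json.loads(raw.decode("utf-8"))
    return PubsubMessageStub(
        data=base64.b64decode(obj["data"]),
        attributes=obj["attributes"],
    )


# Round-trip: the message survives the byte[] boundary intact.
original = PubsubMessageStub(b"hello", {"source": "test"})
restored = from_wire_bytes(to_wire_bytes(original))
assert restored == original
```

The point of the sketch is that once both sides agree on a byte-level encoding, the runner only ever has to move opaque bytes across the boundary, which is presumably why the portable translator wants a ByteArrayCoder there in the first place.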
