[
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296813
]
ASF GitHub Bot logged work on BEAM-7738:
----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Aug/19 17:58
Start Date: 17/Aug/19 17:58
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9268: [BEAM-7738] Add
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522258539
Ok, I now know how the `PubsubMessageWithAttributesCoder` is getting swapped
with a `BytesCoder`, but it turns out it's intentional and I don't know why.
In `FlinkStreamingPortablePipelineTranslator.translateRead` when the coder
is initiated it is passed through
`LengthPrefixUnknownCoders.addLengthPrefixedCoder` which silently replaces all
non-model coders with `LengthPrefixCoder(BytesCoder)`. This _seems_ like the
sort of thing that should print a warning, since I assume that a broken
pipeline is a likely outcome.
I'm unclear why this coder swap is necessary. The java part of this
pipeline is a `Read -> ParDo -> ParDo`, shouldn't this segment be able to
utilize java-only coders (i.e. non-model coder)?
What's the proper solution here? The last java ParDo is the one that's
ensuring we have a byte array for sending to python, but evidently this needs
to be happening in the Read itself?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 296813)
Time Spent: 2h 10m (was: 2h)
> Support PubSubIO to be configured externally for use with other SDKs
> --------------------------------------------------------------------
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
> Issue Type: New Feature
> Components: io-java-gcp, runner-flink, sdk-py-core
> Reporter: Chad Dombrova
> Assignee: Chad Dombrova
> Priority: Major
> Labels: portability
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we
> should add support for PubSub.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)