[
https://issues.apache.org/jira/browse/BEAM-12297?focusedWorklogId=609212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-609212
]
ASF GitHub Bot logged work on BEAM-12297:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Jun/21 15:47
Start Date: 09/Jun/21 15:47
Worklog Time Spent: 10m
Work Description: TheNeuralBit edited a comment on pull request #14971:
URL: https://github.com/apache/beam/pull/14971#issuecomment-857811062
Cool, thanks @zhoufek!
> CC @TheNeuralBit since we recently discussed the issue of PubsubIO
requiring a Java class and how we would improve that requirement for
portability.
Note we already have
[ProtoDynamicMessageSchema](https://github.com/apache/beam/blob/master/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoDynamicMessageSchema.java)
which can make a `SchemaCoder<DynamicMessage>` given a proto descriptor. That
combined with this PR could be a good avenue for making the pubsub
TableProvider configurable via a proto descriptor rather than a proto class
name. (CC: @apilloud @robinyqiu)
EDIT: Oh I just looked at the code and this is already using
ProtoDynamicMessageSchema, so it will just work with Beam schemas :) I think
what we'd need to do to use this approach in SQL and the PubSub
TableProvider is implement the logic in the
[ProtoPayloadSerializerProvider](https://github.com/apache/beam/blob/master/sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoPayloadSerializerProvider.java).
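To illustrate what "configurable via a proto descriptor rather than a proto class name" means in general terms, here is a minimal, self-contained sketch of decoding serialized proto bytes using only field metadata supplied at runtime. It is not Beam's or protobuf-java's implementation: `DescriptorDrivenDecoder`, the field names `id` and `name`, and the plain `Map` standing in for a descriptor are all hypothetical, and only the varint and length-delimited wire types are handled (length-delimited values are assumed to be UTF-8 strings). Real code would more likely go through `DynamicMessage` and `ProtoDynamicMessageSchema`.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Beam or protobuf-java. */
final class DescriptorDrivenDecoder {

  /** Reads a base-128 varint starting at pos[0] and advances pos[0] past it. */
  static long readVarint(byte[] buf, int[] pos) {
    long value = 0;
    int shift = 0;
    while (true) {
      if (pos[0] >= buf.length || shift > 63) {
        throw new IllegalArgumentException("malformed varint");
      }
      byte b = buf[pos[0]++];
      value |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return value; // high bit clear: this was the last byte
      }
      shift += 7;
    }
  }

  /**
   * Decodes a payload using a runtime mapping of field number to field name
   * (a stand-in for a real proto descriptor). Unknown fields are skipped.
   */
  static Map<String, Object> decode(byte[] payload, Map<Integer, String> fieldNames) {
    Map<String, Object> row = new LinkedHashMap<>();
    int[] pos = {0};
    while (pos[0] < payload.length) {
      long tag = readVarint(payload, pos);
      int fieldNumber = (int) (tag >>> 3); // upper bits: field number
      int wireType = (int) (tag & 0x7);    // lower three bits: wire type
      Object value;
      if (wireType == 0) {
        value = readVarint(payload, pos);  // varint, boxed as Long
      } else if (wireType == 2) {
        int length = (int) readVarint(payload, pos);
        if (length < 0 || length > payload.length - pos[0]) {
          throw new IllegalArgumentException("truncated field " + fieldNumber);
        }
        // Assumption: every length-delimited field is a UTF-8 string.
        value = new String(payload, pos[0], length, StandardCharsets.UTF_8);
        pos[0] += length;
      } else {
        throw new IllegalArgumentException("unsupported wire type " + wireType);
      }
      String name = fieldNames.get(fieldNumber);
      if (name != null) {
        row.put(name, value);
      }
    }
    return row;
  }

  public static void main(String[] args) {
    // Field 1 (varint) = 150, field 2 (length-delimited) = "testing".
    byte[] payload = {0x08, (byte) 0x96, 0x01, 0x12, 0x07, 't', 'e', 's', 't', 'i', 'n', 'g'};
    Map<Integer, String> fieldNames = new HashMap<>();
    fieldNames.put(1, "id");
    fieldNames.put(2, "name");
    System.out.println(decode(payload, fieldNames));
  }
}
```

The point of the sketch is that nothing here depends on a generated message class: the same `decode` method handles any message whose field layout is passed in at runtime, which is the property a descriptor-based TableProvider configuration would rely on.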
Issue Time Tracking
-------------------
Worklog Id: (was: 609212)
Time Spent: 1h 10m (was: 1h)
> PubsubIO support for reading protos from schema
> -----------------------------------------------
>
> Key: BEAM-12297
> URL: https://issues.apache.org/jira/browse/BEAM-12297
> Project: Beam
> Issue Type: New Feature
> Components: io-java-gcp
> Reporter: Zachary Houfek
> Assignee: Zachary Houfek
> Priority: P2
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> We're trying to add a Dataflow template for Pubsub -> BigQuery where the
> Pubsub topic contains serialized proto data. This is similar to our existing
> template for Pubsub -> BigQuery using Avro.
> However, it seems PubsubIO currently only supports protos whose message
> class is known ahead of time.[1] We'll need something similar to the generic Avro reader[2].
>
> [1]
> [https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java#L478]
> [2]
> [https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java#L515]
>