TheNeuralBit commented on code in PR #22216:
URL: https://github.com/apache/beam/pull/22216#discussion_r951821329
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubCoderProviderRegistrar.java:
##########
@@ -37,6 +37,7 @@ public List<CoderProvider> getCoderProviders() {
TypeDescriptor.of(PubsubMessage.class),
PubsubMessageWithMessageIdCoder.of()),
CoderProviders.forCoder(
TypeDescriptor.of(PubsubMessage.class),
- PubsubMessageWithAttributesAndMessageIdCoder.of()));
+ PubsubMessageWithAttributesAndMessageIdCoder.of()),
+ CoderProviders.forCoder(TypeDescriptor.of(PubsubMessage.class),
PubsubMessageCoder.of()));
Review Comment:
Thanks! I dug into the infrastructure here, and it looks like we will always
just use the _first_ coder registered for a given type, so I think registering
these other three will be a no-op. In theory a coder could raise
`CannotProvideCoderException` and we'd move on to the next one:
https://github.com/apache/beam/blob/4ffeae4d2b800f2df36d2ea2eab549f2204d5691/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L678
but that's not going to happen here, `PubsubMessageWithAttributesCoder` will
never do that.
I think the actually important change is just adding
`readMessagesWithAttributesAndMessageIdAndOrderingKey` with the `setCoder` call.
It's worth noting that what we specify here will be used as the default in
any `PCollection<PubsubMessage>` that's not created by one of the `PubsubIO`
helper methods. So we should consider making it the one that will _always
work_. There are implications for breaking pipeline update though if we change
the default. My suggestion:
- Leave this as-is for now (don't add the new coder)
- We can file an issue to follow-up on changing this
(I gathered some context by looking at
https://github.com/apache/beam/pull/8370, and also spoke to @lcwik)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]