Hi folks,

I'd like to solicit feedback on the notion of using PubsubMessageWithAttributesAndMessageIdAndOrderingKeyCoder[1] as the default coder for Pubsub messages instead of the current default of PubsubMessageWithAttributesCoder.

Not long ago, support for reading and writing Pubsub messages with an OrderingKey was added to Beam[2]. Part of that change involved adding a new Coder for PubsubMessage in order to capture and propagate the orderingKey[1]. The change revealed that, in cases where the coder type for PubsubMessage is inferred, it is possible to accidentally and silently nullify fields like MessageId and OrderingKey in a way that is not at all obvious to users[3].

So far, two potential drawbacks of this proposal have been identified:

1. Update compatibility: pipelines using PubsubIO might require users to explicitly specify the current default coder (PubsubMessageWithAttributesCoder).
2. Messages would require more bytes to store than with the current default (which could again be overcome by users specifying the current default coder).

What other potential drawbacks might there be? I look forward to hearing others' input!

Thanks,
Evan

[1] https://github.com/apache/beam/pull/22216/files#diff-28243ab1f9eef144e45a9f6cb2e07fa1cf53c021ceaf733d92351254f38712fd
[2] https://github.com/apache/beam/pull/22216
[3] https://github.com/apache/beam/issues/23525
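
P.S. For anyone who wants a concrete picture of the silent field loss and the size trade-off described above, here is a minimal, self-contained Java sketch. It deliberately does not use Beam: Message, Coder, PayloadOnlyCoder, and FullCoder are hypothetical stand-ins invented for illustration, and the wire format is an assumption, not what the real Pubsub coders write. It only shows the general mechanism: a coder that does not encode a field hands back null for it after a round trip, while a coder that does encode it produces more bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class CoderRoundTripSketch {

  /** Hypothetical stand-in for a Pubsub message; NOT Beam's PubsubMessage. */
  static final class Message {
    final String payload;
    final String messageId;   // may be null
    final String orderingKey; // may be null

    Message(String payload, String messageId, String orderingKey) {
      this.payload = payload;
      this.messageId = messageId;
      this.orderingKey = orderingKey;
    }
  }

  /** Hypothetical, simplified coder interface; NOT Beam's Coder. */
  interface Coder {
    void encode(Message m, DataOutputStream out) throws IOException;
    Message decode(DataInputStream in) throws IOException;
  }

  /** Encodes only the payload, so messageId and orderingKey cannot survive. */
  static final class PayloadOnlyCoder implements Coder {
    public void encode(Message m, DataOutputStream out) throws IOException {
      out.writeUTF(m.payload);
    }
    public Message decode(DataInputStream in) throws IOException {
      return new Message(in.readUTF(), null, null);
    }
  }

  /** Encodes every field, at the cost of extra bytes per message. */
  static final class FullCoder implements Coder {
    public void encode(Message m, DataOutputStream out) throws IOException {
      out.writeUTF(m.payload);
      writeNullable(m.messageId, out);
      writeNullable(m.orderingKey, out);
    }
    public Message decode(DataInputStream in) throws IOException {
      String payload = in.readUTF();
      String messageId = readNullable(in);
      String orderingKey = readNullable(in);
      return new Message(payload, messageId, orderingKey);
    }
  }

  // A presence flag followed by the value, so null can be represented.
  static void writeNullable(String s, DataOutputStream out) throws IOException {
    out.writeBoolean(s != null);
    if (s != null) {
      out.writeUTF(s);
    }
  }

  static String readNullable(DataInputStream in) throws IOException {
    return in.readBoolean() ? in.readUTF() : null;
  }

  static byte[] encode(Coder coder, Message m) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      coder.encode(m, out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  static Message decode(Coder coder, byte[] bytes) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      return coder.decode(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Simulates a message crossing a stage boundary: encode, then decode. */
  static Message roundTrip(Coder coder, Message m) {
    return decode(coder, encode(coder, m));
  }

  public static void main(String[] args) {
    Message original = new Message("hello", "id-1", "key-1");
    Message lossy = roundTrip(new PayloadOnlyCoder(), original);
    Message full = roundTrip(new FullCoder(), original);

    System.out.println("payload-only coder orderingKey: " + lossy.orderingKey);
    System.out.println("full coder orderingKey:         " + full.orderingKey);
    System.out.println("payload-only coder bytes: " + encode(new PayloadOnlyCoder(), original).length);
    System.out.println("full coder bytes:         " + encode(new FullCoder(), original).length);
  }
}
```

Nothing in the lossy round trip throws or warns, which is the part that makes the real-world behavior in [3] easy to miss.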