Sam Le created BEAM-10958:
-----------------------------
Summary: WriteToPubsub with Protobuf message missing field
Key: BEAM-10958
URL: https://issues.apache.org/jira/browse/BEAM-10958
Project: Beam
Issue Type: Improvement
Components: io-py-gcp
Affects Versions: 2.23.0
Reporter: Sam Le
Hi, I am trying to test ordering_key (new beta feature of Google Pubsub).
However, I seem to hit the wall here. Google Protobuff standard message has
been described here
[https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#google.pubsub.v1.PubsubMessage]
. So, it seems that Apache Beam should have no issue to passing the new field
to Pubsub if the format of the message is correct. It does not seem to be a
case here. As far as, I have tried to go through the code, I believe that I
would only need to change
[https://github.com/apache/beam/blob/1a09a7576a4a0e22d32583c0c1c52b67970691d6/sdks/python/apache_beam/io/gcp/pubsub.py#L115]
{code:java}
//
def _to_proto_str(self):
msg = pubsub.types.pubsub_pb2.PubsubMessage()
msg.data = self.data
for key, value in iteritems(self.attributes):
msg.attributes[key] = value
msg.ordering_key = self.ordering_key
return msg.SerializeToString()
{code}
And then set with_attributes=true. So, WriteToPubSub would allow me to send out
a protobuff message to pubsub. However, at some point, the information
(ordering_key) is tripped, although, I could set custom attributes. I guess it
may be the dataflow_runner code, but it really difficult to navigate around the
code in that area. I have gone as far as create a new version of write to
WriteToPubSub, but it does not help. The field is still missing on the message
when it hits Google Pubsub Topic.
Could anyone point me to the right direction please?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)