[
https://issues.apache.org/jira/browse/BEAM-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372042#comment-15372042
]
Daniel Halperin commented on BEAM-404:
--------------------------------------
It's actually a bit deeper than that.
PubsubIO right now, under the hood, is a function from a Cloud Pub/Sub message
to some {{T}} using a {{Coder<T>}}.
Cloud Pub/Sub messages have several components:
* a payload (byte[])
* message attributes (strings)
Right now, we use two special attributes (timestamp and record ID) to read
per-element metadata and set element-wise properties in {{UnboundedSource}}.
However, many users want access to the other message attributes, which might
contain user-specific information. This feature request is to provide some
interface on PubsubIO to produce a
{{PCollection<KV<T, Map<String, String>>>}} from {{PubsubIO.Read}}, where the
second half of the type is the attribute map.
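
To make the requested shape concrete, here is a minimal plain-Java sketch of the difference between the two modes. It does not use the Beam or Pub/Sub APIs: {{Message}}, {{decodePayloadOnly}}, and {{decodeWithAttributes}} are hypothetical names, a {{Function<byte[], T>}} stands in for {{Coder<T>}}, and {{Map.Entry}} stands in for {{KV}}. How this would actually be exposed on PubsubIO is still open.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class AttributeAwareDecode {

    /** Hypothetical stand-in for a Cloud Pub/Sub message: a payload plus string attributes. */
    static final class Message {
        final byte[] payload;
        final Map<String, String> attributes;

        Message(byte[] payload, Map<String, String> attributes) {
            // Defensive copies so the message is effectively immutable.
            this.payload = payload.clone();
            this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        }
    }

    /** Current mode as described above: decode the payload and drop the attributes. */
    static <T> T decodePayloadOnly(Message message, Function<byte[], T> decoder) {
        return decoder.apply(message.payload);
    }

    /**
     * Requested mode: pair the decoded payload with the full attribute map.
     * Map.Entry is only an assumption standing in for KV<T, Map<String, String>>.
     */
    static <T> Map.Entry<T, Map<String, String>> decodeWithAttributes(
            Message message, Function<byte[], T> decoder) {
        return new SimpleImmutableEntry<>(decoder.apply(message.payload), message.attributes);
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("tenant", "acme"); // example user-specific attribute
        Message message = new Message("hello".getBytes(StandardCharsets.UTF_8), attributes);

        // Stand-in for a Coder<String> that decodes UTF-8 bytes.
        Function<byte[], String> utf8 = bytes -> new String(bytes, StandardCharsets.UTF_8);

        Map.Entry<String, Map<String, String>> element = decodeWithAttributes(message, utf8);
        System.out.println(element.getKey() + " " + element.getValue());
    }
}
```

The payload-only path stays unchanged in this sketch, so existing users would not be affected; the attribute map is simply carried alongside the decoded value for those who opt in.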
> PubsubIO should have a mode that supports maintaining message attributes.
> -------------------------------------------------------------------------
>
> Key: BEAM-404
> URL: https://issues.apache.org/jira/browse/BEAM-404
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-gcp
> Reporter: Daniel Halperin
>
> Right now, PubsubIO only lets users access the message payload, decoded with
> the user-provided coder.
> We should add a mode in which the source can return a message with the
> metadata (attributes) as well.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)