[ https://issues.apache.org/jira/browse/BEAM-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132225#comment-17132225 ]

Beam JIRA Bot commented on BEAM-4275:
-------------------------------------
This issue is assigned but has not received an update in 30 days so it has been
labeled "stale-assigned". If you are still working on the issue, please give an
update and remove the label. If you are no longer working on the issue, please
unassign so someone else may work on it. In 7 days the issue will be
automatically unassigned.
> Pubsub: add DirectRunner support for id_label and timestamp_attribute in
> Python SDK
> -----------------------------------------------------------------------------------
>
> Key: BEAM-4275
> URL: https://issues.apache.org/jira/browse/BEAM-4275
> Project: Beam
> Issue Type: Bug
> Components: runner-direct, sdk-py-core
> Reporter: Udi Meiri
> Assignee: Udi Meiri
> Priority: P2
> Labels: stale-assigned
> Time Spent: 1h
> Remaining Estimate: 0h
>
> At least for publishing (and maybe pulling) messages, non-Dataflow-based
> sources and sinks for Pub/Sub use the [public
> API|https://cloud.google.com/pubsub/docs/publisher] for Pub/Sub, which
> doesn't support id_label and timestamp_attribute settings.
> Publishing:
> * id_label: add an attribute to each message with a unique value
> * timestamp_attribute: add an attribute to each message with the publishing
>   time as its value
> Pulling:
> * id_label: use the value of this message attribute to deduplicate messages
> * timestamp_attribute: use the value of this message attribute as the
>   element's timestamp
>
> Implementation details: this could probably live in a pubsubio.py module that
> other runners can reuse (that is, implement Pub/Sub IO as PTransforms rather
> than as NativeSinks and Sources).
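
The publishing and pulling behaviour described in the issue can be sketched in plain Python. This is only an illustration under stated assumptions, not the Beam implementation: `Message`, `stamp_for_publish`, and `read_pulled` are hypothetical names, the timestamp is assumed to be stored as milliseconds since the Unix epoch in string form, and deduplication uses an unbounded in-memory set, which a real streaming runner would have to bound.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass
class Message:
    """Hypothetical stand-in for a Pub/Sub message: payload plus string attributes."""
    data: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


def stamp_for_publish(
    msg: Message,
    id_label: Optional[str] = None,
    timestamp_attribute: Optional[str] = None,
    now_ms: Optional[int] = None,
) -> Message:
    """Return a copy of msg with the id and timestamp attributes added."""
    attrs = dict(msg.attributes)
    if id_label is not None:
        # Publishing, id_label: attach a unique value to each message.
        attrs[id_label] = uuid.uuid4().hex
    if timestamp_attribute is not None:
        # Publishing, timestamp_attribute: attach the publishing time.
        # Assumption: encoded as milliseconds since the Unix epoch.
        ms = int(time.time() * 1000) if now_ms is None else now_ms
        attrs[timestamp_attribute] = str(ms)
    return Message(msg.data, attrs)


def read_pulled(
    messages: Iterable[Message],
    id_label: Optional[str] = None,
    timestamp_attribute: Optional[str] = None,
) -> Iterator[Tuple[bytes, Optional[float]]]:
    """Yield (payload, timestamp_in_seconds) pairs, skipping duplicate ids."""
    seen = set()  # Simplification: grows without bound.
    for msg in messages:
        if id_label is not None:
            # Pulling, id_label: deduplicate on the attribute value.
            msg_id = msg.attributes[id_label]
            if msg_id in seen:
                continue
            seen.add(msg_id)
        timestamp = None
        if timestamp_attribute is not None:
            # Pulling, timestamp_attribute: use the value as the element's timestamp.
            timestamp = int(msg.attributes[timestamp_attribute]) / 1000.0
        yield msg.data, timestamp
```

If logic like this were wrapped in PTransforms, as the issue suggests, the same stamping and deduplication steps could be shared across runners instead of depending on runner-specific sources and sinks.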
--
This message was sent by Atlassian Jira
(v8.3.4#803005)