[
https://issues.apache.org/jira/browse/BEAM-3851?focusedWorklogId=80443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80443
]
ASF GitHub Bot logged work on BEAM-3851:
----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Mar/18 18:24
Start Date: 14/Mar/18 18:24
Worklog Time Spent: 10m
Work Description: rangadi opened a new pull request #4868: [BEAM-3851]
Option to preserve element timestamp while publishing to Kafka.
URL: https://github.com/apache/beam/pull/4868
KafkaIO sink support for setting Kafka message timestamps based on element
timestamp. Otherwise there is no way for user to influence the timestamp of the
messages in Kafka sink.
The implementation for for normal sink (`KafkaWriter.java`) is trivial: Just
need to read the timestamp from the context. But EOS sink
(`KafkaExactlyOnceWriter.java`) changes are a bit more involved.
In the case of the latter, the elements go through couple of shuffles and we
need to include timestamp along with the the actual value. The implementation
wraps timestamp and input KVs in `TimestampedValue<>`. This changes
serialization of the elements shuffles. As a result EOS changes are not
backward compatible (with upgrade or while using save points). I think the use
of EOS sink is pretty minimal and this will have very little impact. I am not
sure what the best practice is for handling such incompatibility in Beam.
Ideally we want to error out early if a pipeline with exactly-once sink is
being updated from version 2.3 to 2.4. PLMK. I can move timestamp support in
EOS to Beam 3.0.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 80443)
Time Spent: 10m
Remaining Estimate: 0h
> Support element timestamps while publishing to Kafka.
> -----------------------------------------------------
>
> Key: BEAM-3851
> URL: https://issues.apache.org/jira/browse/BEAM-3851
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kafka
> Affects Versions: 2.3.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Major
> Fix For: 2.4.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> KafkaIO sink should support using input element timestamp for the message
> published to Kafka. Otherwise there is no way for user to influence the
> timestamp of the messages in Kafka sink.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)