[
https://issues.apache.org/jira/browse/BEAM-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935782#comment-15935782
]
Xu Mingmin commented on BEAM-1775:
----------------------------------
Hi [[email protected]],
1). for KafkaCheckpointMark and Beam state, I totally agree that's the right
way to go;
2). my question mostly goes when user don't enable checkpoint for any reason
with KafkaIO.
In Kafka, 'earliest' means the beginning of a topic, 'latest' means the end of
a topic. I think what users want is to 'consume from the offset of last run'.
As Kafka 0.9+ can manage consumer offset, it's possible to support this feature
in KafkaIO.
> fix issue of start_from_previous_offset in KafkaIO
> --------------------------------------------------
>
> Key: BEAM-1775
> URL: https://issues.apache.org/jira/browse/BEAM-1775
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-extensions
> Reporter: Xu Mingmin
> Assignee: Xu Mingmin
>
> Jins George [email protected] via aermail.onmicrosoft.com
>
> 5:50 PM (15 hours ago)
>
> to user
> Hello,
> I am writing a Beam pipeline(streaming) with Flink runner to consume data
> from Kafka and apply some transformations and persist to Hbase.
> If I restart the application ( due to failure/manual restart), consumer does
> not resume from the offset where it was prior to restart. It always resume
> from the latest offset.
> If I enable Flink checkpionting with hdfs state back-end, system appears to
> be resuming from the earliest offset
> Is there a recommended way to resume from the offset where it was stopped ?
> Thanks,
> Jins George
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)