[ 
https://issues.apache.org/jira/browse/BEAM-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935782#comment-15935782
 ] 

Xu Mingmin commented on BEAM-1775:
----------------------------------

Hi [[email protected]], 
1). For KafkaCheckpointMark and Beam state, I totally agree that's the right 
way to go;
2). My question is mainly about the case where users don't enable checkpointing 
with KafkaIO, for whatever reason. 
In Kafka, 'earliest' means the beginning of a topic and 'latest' means the end 
of a topic. I think what users want is to 'consume from the offset of the last 
run'. Since Kafka 0.9+ can manage consumer offsets, it's possible to support 
this feature in KafkaIO. 
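
To make the intended behavior concrete, here is a minimal sketch of how a 
start position could be chosen when no checkpoint is available. It is a plain 
Java model, not KafkaIO or Kafka client code: the class StartOffsetSketch, the 
enum ResetPolicy and the method resolveStartOffset are hypothetical names used 
only for illustration. The assumption is that an offset committed by the 
previous run (under the same consumer group) takes priority, and that the 
'earliest'/'latest' policy applies only when no usable committed offset exists.

```java
import java.util.OptionalLong;

public class StartOffsetSketch {

    // Mirrors the two reset policies discussed above ('earliest' / 'latest').
    enum ResetPolicy { EARLIEST, LATEST }

    /**
     * Hypothetical helper: picks the offset a reader should start from.
     *
     * @param committed       offset committed by the previous run, if any
     * @param policy          fallback policy when there is no usable committed offset
     * @param beginningOffset first offset currently available in the partition
     * @param endOffset       offset of the next record to be written to the partition
     */
    static long resolveStartOffset(OptionalLong committed,
                                   ResetPolicy policy,
                                   long beginningOffset,
                                   long endOffset) {
        if (committed.isPresent()) {
            long offset = committed.getAsLong();
            // Use the committed offset only while it is still inside the log range.
            if (offset >= beginningOffset && offset <= endOffset) {
                return offset;
            }
        }
        // No committed offset, or it is out of range: fall back to the policy.
        return policy == ResetPolicy.EARLIEST ? beginningOffset : endOffset;
    }

    public static void main(String[] args) {
        // Previous run committed offset 42: resume there regardless of policy.
        System.out.println(resolveStartOffset(OptionalLong.of(42), ResetPolicy.LATEST, 0, 100));
        // First run, nothing committed: the policy decides.
        System.out.println(resolveStartOffset(OptionalLong.empty(), ResetPolicy.LATEST, 0, 100));
        System.out.println(resolveStartOffset(OptionalLong.empty(), ResetPolicy.EARLIEST, 0, 100));
    }
}
```

With this ordering, a restart without checkpointing would continue from the 
last committed offset, and 'earliest'/'latest' would only matter on the very 
first run or after the committed offset has expired.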

> fix issue of start_from_previous_offset in KafkaIO
> --------------------------------------------------
>
>                 Key: BEAM-1775
>                 URL: https://issues.apache.org/jira/browse/BEAM-1775
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Xu Mingmin
>            Assignee: Xu Mingmin
>
> Jins George [email protected] via aermail.onmicrosoft.com 
>       
> 5:50 PM (15 hours ago)
>       
> to user
> Hello,
> I am writing a streaming Beam pipeline with the Flink runner to consume data 
> from Kafka, apply some transformations, and persist to HBase.
> If I restart the application (due to failure or manual restart), the consumer 
> does not resume from the offset where it was prior to the restart. It always 
> resumes from the latest offset.
> If I enable Flink checkpointing with the HDFS state back-end, the system 
> appears to resume from the earliest offset.
> Is there a recommended way to resume from the offset where it was stopped?
> Thanks,
> Jins George



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
