[ 
https://issues.apache.org/jira/browse/BEAM-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935429#comment-15935429
 ] 

Xu Mingmin commented on BEAM-1775:
----------------------------------

[~amitsela], I think both Flink and Spark can work with checkpoint. As 
checkpoint is not mandatory in Flink, it should still be able to recover from 
last point. Not familiar with Spark, you may help to check if it can handle 
with default TmpCheckpointDirFactory. 

More specifically, when KafkaCheckpointMark is null in below function call
{code}
public UnboundedKafkaReader<K, V> createReader(PipelineOptions options,
                                                   KafkaCheckpointMark 
checkpointMark)
{code}

> fix issue of start_from_previous_offset in KafkaIO
> --------------------------------------------------
>
>                 Key: BEAM-1775
>                 URL: https://issues.apache.org/jira/browse/BEAM-1775
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Xu Mingmin
>            Assignee: Xu Mingmin
>
> Jins George [email protected] via aermail.onmicrosoft.com 
>       
> 5:50 PM (15 hours ago)
>       
> to user
> Hello,
> I am writing a Beam pipeline(streaming) with Flink runner to consume data 
> from Kafka and apply some transformations and persist to Hbase.
> If I restart the application ( due to failure/manual restart), consumer does 
> not resume from the offset where it was prior to restart. It always resume 
> from the latest offset.
> If I enable Flink checkpionting with hdfs state back-end, system appears to 
> be resuming from the earliest offset
> Is there a recommended way to resume from the offset where it was stopped ?
> Thanks,
> Jins George



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to