Re: Kafka Offset handling for Restart/failure scenarios.

Amit Sela Tue, 21 Mar 2017 13:37:17 -0700

On Tue, Mar 21, 2017 at 7:26 PM Mingmin Xu <mingm...@gmail.com> wrote:


> Move discuss to dev-list
>
> Savepoint in Flink, also checkpoint in Spark, should be good enough to
> handle this case.
>
> When people don't enable these features, for example only need at-most-once
>
The Spark runner forces checkpointing on any streaming (Beam) application,
mostly because it uses mapWithState for reading from UnboundedSource and
updateStateByKey form GroupByKey - so by design, Spark runner is
at-least-once. Generally, I always thought that applications that require
at-most-once are more focused on processing time only, as they only care
about whatever get's ingested into the pipeline at a specific time and
don't care (up to the point of losing data) about correctness.
I would be happy to hear more about your use case.

> semantic, each unbounded IO should try its best to restore from last
> offset, although CheckpointMark is null. Any ideas?
>
> Mingmin
>
> On Tue, Mar 21, 2017 at 9:39 AM, Dan Halperin <dhalp...@apache.org> wrote:
>
> > hey,
> >
> > The native Beam UnboundedSource API supports resuming from checkpoint --
> > that specifically happens here
> > <
> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L674>
> when
> > the KafkaCheckpointMark is non-null.
> >
> > The FlinkRunner should be providing the KafkaCheckpointMark from the most
> > recent savepoint upon restore.
> >
> > There shouldn't be any "special" Flink runner support needed, nor is the
> > State API involved.
> >
> > Dan
> >
> > On Tue, Mar 21, 2017 at 9:01 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> >> Would not it be Flink runner specific ?
> >>
> >> Maybe the State API could do the same in a runner agnostic way (just
> >> thinking loud) ?
> >>
> >> Regards
> >> JB
> >>
> >> On 03/21/2017 04:56 PM, Mingmin Xu wrote:
> >>
> >>> From KafkaIO itself, looks like it either start_from_beginning or
> >>> start_from_latest. It's designed to leverage
> >>> `UnboundedSource.CheckpointMark`
> >>> during initialization, but so far I don't see it's provided by runners.
> >>> At the
> >>> moment Flink savepoints is a good option, created a JIRA(BEAM-1775
> >>> <https://issues.apache.org/jira/browse/BEAM-1775>)  to handle it in
> >>> KafkaIO.
> >>>
> >>> Mingmin
> >>>
> >>> On Tue, Mar 21, 2017 at 3:40 AM, Aljoscha Krettek <aljos...@apache.org
> >>> <mailto:aljos...@apache.org>> wrote:
> >>>
> >>>     Hi,
> >>>     Are you using Flink savepoints [1] when restoring your application?
> >>> If you
> >>>     use this the Kafka offset should be stored in state and it should
> >>> restart
> >>>     from the correct position.
> >>>
> >>>     Best,
> >>>     Aljoscha
> >>>
> >>>     [1]
> >>>     https://ci.apache.org/projects/flink/flink-docs-release-1.3/
> >>> setup/savepoints.html
> >>>     <https://ci.apache.org/projects/flink/flink-docs-release-1.3
> >>> /setup/savepoints.html>
> >>>     > On 21 Mar 2017, at 01:50, Jins George <jins.geo...@aeris.net
> >>>     <mailto:jins.geo...@aeris.net>> wrote:
> >>>     >
> >>>     > Hello,
> >>>     >
> >>>     > I am writing a Beam pipeline(streaming) with Flink runner to
> >>> consume data
> >>>     from Kafka and apply some transformations and persist to Hbase.
> >>>     >
> >>>     > If I restart the application ( due to failure/manual restart),
> >>> consumer
> >>>     does not resume from the offset where it was prior to restart. It
> >>> always
> >>>     resume from the latest offset.
> >>>     >
> >>>     > If I enable Flink checkpionting with hdfs state back-end, system
> >>> appears
> >>>     to be resuming from the earliest offset
> >>>     >
> >>>     > Is there a recommended way to resume from the offset where it was
> >>> stopped ?
> >>>     >
> >>>     > Thanks,
> >>>     > Jins George
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> ----
> >>> Mingmin
> >>>
> >>
> >> --
> >> Jean-Baptiste Onofré
> >> jbono...@apache.org
> >> http://blog.nanthrax.net
> >> Talend - http://www.talend.com
> >>
> >
> >
>
>
> --
> ----
> Mingmin
>

Re: Kafka Offset handling for Restart/failure scenarios.

Reply via email to