Does this mean that we have to deal with duplicate messages over the back edge? Or will that not happen, since duplicates mean that we rolled back a checkpoint.
On Tue, Oct 13, 2020 at 2:59 AM Maximilian Michels <[email protected]> wrote: > There would be ways around the lack of checkpointing in cycles, e.g. > buffer and backloop only after checkpointing is complete, similarly how > we implement @RequiresStableInput in the Flink Runner. > > -Max > > On 07.10.20 04:05, Reuven Lax wrote: > > It appears that there's a proposal > > ( > https://cwiki.apache.org/confluence/display/FLINK/FLIP-16%3A+Loop+Fault+Tolerance > > < > https://cwiki.apache.org/confluence/display/FLINK/FLIP-16%3A+Loop+Fault+Tolerance>) > > > and an abandoned PR to fix this, but AFAICT this remains a limitation of > > Flink. If Flink can't guarantee processing of records on back edges, I > > don't think we can use cycles, as we might otherwise lose the residuals. > > > > On Tue, Oct 6, 2020 at 6:16 PM Reuven Lax <[email protected] > > <mailto:[email protected]>> wrote: > > > > This is what I was thinking of > > > > "Flink currently only provides processing guarantees for jobs > > without iterations. Enabling checkpointing on an iterative job > > causes an exception. In order to force checkpointing on an iterative > > program the user needs to set a special flag when enabling > > checkpointing:|env.enableCheckpointing(interval, > > CheckpointingMode.EXACTLY_ONCE, force = true)|. > > > > Please note that records in flight in the loop edges (and the state > > changes associated with them) will be lost during failure." > > > > > > > > > > > > > > On Tue, Oct 6, 2020 at 5:44 PM Boyuan Zhang <[email protected] > > <mailto:[email protected]>> wrote: > > > > Hi Reuven, > > > > As Luke mentioned, at least there are some limitations around > > tracking watermark with flink cycles. I'm going to use State + > > Timer without flink cycle to support self-checkpoint. For > > dynamic split, we can either explore flink cycle approach or > > limit depth approach. > > > > On Tue, Oct 6, 2020 at 5:33 PM Reuven Lax <[email protected] > > <mailto:[email protected]>> wrote: > > > > Aren't there some limitations associated with flink cycles? > > I seem to remember various features that could not be used. > > I'm assuming that watermarks are not supported across > > cycles, but is there anything else? > > > > On Tue, Oct 6, 2020 at 7:12 AM Maximilian Michels > > <[email protected] <mailto:[email protected]>> wrote: > > > > Thanks for starting the conversation. The two approaches > > both look good > > to me. Probably we want to start with approach #1 for > > all Runners to be > > able to support delaying bundles. Flink supports cycles > > and thus > > approach #2 would also be applicable and could be used > > to implement > > dynamic splitting. > > > > -Max > > > > On 05.10.20 23:13, Luke Cwik wrote: > > > Thanks Boyuan, I left a few comments. > > > > > > On Mon, Oct 5, 2020 at 11:12 AM Boyuan Zhang > > <[email protected] <mailto:[email protected]> > > > <mailto:[email protected] > > <mailto:[email protected]>>> wrote: > > > > > > Hi team, > > > > > > I'm looking at adding self-checkpoint support to > > portable Flink > > > runner(BEAM-10940 > > > <https://issues.apache.org/jira/browse/BEAM-10940 > > <https://issues.apache.org/jira/browse/BEAM-10940>>) for > > both batch > > > and streaming. I summarized the problem that we > > want to solve and > > > proposed 2 potential approaches in this doc > > > > > < > https://docs.google.com/document/d/1372B7HYxtcUYjZOnOM7OBTfSJ4CyFg_gaPD_NUxWClo/edit?usp=sharing > < > https://docs.google.com/document/d/1372B7HYxtcUYjZOnOM7OBTfSJ4CyFg_gaPD_NUxWClo/edit?usp=sharing > >>. > > > > > > I want to collect feedback on which approach is > > preferred and > > > anything that I have not taken into consideration > > yet but I should. > > > Many thanks to all your help! > > > > > > Boyuan > > > > > >
