[ https://issues.apache.org/jira/browse/BEAM-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352322#comment-16352322 ]
Jean-Baptiste Onofré commented on BEAM-3186: -------------------------------------------- [~aljoscha] thanks, I cherry-pick on release-2.3.0 branch. > In-flight data loss when restoring from savepoint > ------------------------------------------------- > > Key: BEAM-3186 > URL: https://issues.apache.org/jira/browse/BEAM-3186 > Project: Beam > Issue Type: Bug > Components: runner-flink > Affects Versions: 2.0.0, 2.1.0 > Reporter: Pawel Bartoszek > Assignee: Dawid Wysakowicz > Priority: Blocker > Fix For: 2.3.0 > > Attachments: restore_no_trigger.png, restore_with_trigger.png, > restore_with_trigger_b.png > > > *The context:* > I want to count how many events of given type(A,B, etc) I receive every > minute using 1 minute windows and AfterWatermark trigger with allowed > lateness 1 min. > *Data loss case* > In the case below if there is at least one A element with the event time > belonging to the window 14:00-14:01 read from Kinesis stream after job is > restored from savepoint the data loss will not be observed for this key and > this window. > !restore_no_trigger.png! > *Not data loss case* > However, if no new A element element is read from Kinesis stream than data > loss is observable. > !restore_with_trigger.png! > *Workaround* > As a workaround we could configure early firings every X seconds which gives > up to X seconds data loss per key on restore. > *My guess where the issue might be* > I believe this is Beam-Flink integration layer bug. From my investigation I > don't think it's KinesisReader and possibility that it couldn't advance > watermark. To prove that after I restore from savepoint I sent some records > for different key (B) for the same window as shown in the > pictures(14:00-14:01) without seeing trigger going off for restored window > and key A. > My guess is that Beam after job is restored doesn't register flink event time > timer for restored window unless there is a new element (key) coming for the > restored window. > Please refer to [this > gist|https://gist.github.com/pbartoszek/7ab88c8b6538039db1b383358d1d1b5a] for > test job that shows this behaviour. -- This message was sent by Atlassian JIRA (v7.6.3#76005)