[ https://issues.apache.org/jira/browse/FLINK-17918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120903#comment-17120903 ]
Kurt Young commented on FLINK-17918:
------------------------------------

From the above comment, it seems that the problem is caused by *step 5* (the checkpoint & state is inconsistent with stream operators)?

> Blink Jobs are loosing data on recovery
> ---------------------------------------
>
> Key: FLINK-17918
> URL: https://issues.apache.org/jira/browse/FLINK-17918
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing, Runtime / Network
> Affects Versions: 1.11.0
> Reporter: Piotr Nowojski
> Assignee: Arvid Heise
> Priority: Blocker
> Fix For: 1.11.0
>
>
> After trying to enable unaligned checkpoints by default, a lot of Blink streaming SQL/Table API tests containing joins or set operations are throwing errors indicating that we are losing some data (full records, without deserialisation errors). Example errors:
> {noformat}
> [ERROR] Failures:
> [ERROR] JoinITCase.testFullJoinWithEqualPk:775 expected:<List(1,1, 2,2, 3,3, null,4, null,5)> but was:<List(2,2, 3,3, null,1, null,4, null,5)>
> [ERROR] JoinITCase.testStreamJoinWithSameRecord:391 expected:<List(1,1,1,1, 1,1,1,1, 2,2,2,2, 2,2,2,2, 3,3,3,3, 3,3,3,3, 4,4,4,4, 4,4,4,4, 5,5,5,5, 5,5,5,5)> but was:<List()>
> [ERROR] SemiAntiJoinStreamITCase.testAntiJoin:352 expected:<0> but was:<1>
> [ERROR] SetOperatorsITCase.testIntersect:55 expected:<MutableList(1,1,Hi, 2,2,Hello, 3,2,Hello world)> but was:<List()>
> [ERROR] JoinITCase.testJoinPushThroughJoin:1272 expected:<List(1,0,Hi, 2,1,Hello, 2,1,Hello world)> but was:<List(2,1,Hello, 2,1,Hello world)>
> {noformat}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)