[ https://issues.apache.org/jira/browse/FLINK-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364765#comment-17364765 ]
Yuan Mei commented on FLINK-22420:
----------------------------------

I did see in FLINK-21996 that a second option/solution is mentioned as well, which can prevent failover due to RPC loss. Given that this happens rarely, I'd suggest we just keep it as is until we have a conclusion on FLINK-21996?

> UnalignedCheckpointITCase failed
> --------------------------------
>
> Key: FLINK-22420
> URL: https://issues.apache.org/jira/browse/FLINK-22420
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.14.0
> Reporter: Guowei Ma
> Priority: Minor
> Labels: auto-deprioritized-major, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17052&view=logs&j=34f41360-6c0d-54d3-11a1-0292a2def1d9&t=2d56e022-1ace-542f-bf1a-b37dd63243f2&l=9442
> {code:java}
> Apr 22 14:28:21 at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> Apr 22 14:28:21 at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
> Apr 22 14:28:21 at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> Apr 22 14:28:21 at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> Apr 22 14:28:21 at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
> Apr 22 14:28:21 at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> Apr 22 14:28:21 at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> Apr 22 14:28:21 at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> Apr 22 14:28:21 at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> Apr 22 14:28:21 at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> Apr 22 14:28:21 at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> Apr 22 14:28:21 at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> Apr 22 14:28:21 at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> Apr 22 14:28:21 at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> Apr 22 14:28:21 at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Apr 22 14:28:21 Caused by: org.apache.flink.util.FlinkException: An OperatorEvent from an OperatorCoordinator to a task was lost. Triggering task failover to ensure consistency. Event: '[NoMoreSplitEvent]', targetTask: Source: source (1/1) - execution #5
> Apr 22 14:28:21 ... 26 more
> Apr 22 14:28:21
> {code}
> As described in the comment
> https://issues.apache.org/jira/browse/FLINK-21996?focusedCommentId=17326449&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17326449
> we might need to adjust the tests to allow failover.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)