[ 
https://issues.apache.org/jira/browse/FLINK-22692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347375#comment-17347375
 ] 

Roman Khachatryan commented on FLINK-22692:
-------------------------------------------

I couldn't reproduce the issue with the default scheduler.

However, it is easily reproducible with the adaptive scheduler (e.g. add 
.set(JobManagerOptions.SCHEDULER, JobManagerOptions.SchedulerType.Adaptive) to 
CheckpointStoreITCase on line 67).

This shows that the adaptive scheduler behaves differently from the default 
one: it doesn't restart the job if a failure occurs during recovery.

Is this difference intentional, [~rmetzger]?

> CheckpointStoreITCase.testRestartOnRecoveryFailure fails with RuntimeException
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-22692
>                 URL: https://issues.apache.org/jira/browse/FLINK-22692
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>            Reporter: Robert Metzger
>            Assignee: Roman Khachatryan
>            Priority: Critical
>              Labels: test-stability
>
> Not sure if it is related to the adaptive scheduler: 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18052&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=a0a633b8-47ef-5c5a-2806-3c13b9e48228
> {code}
> May 17 22:29:11 [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.351 s <<< FAILURE! - in org.apache.flink.test.checkpointing.CheckpointStoreITCase
> May 17 22:29:11 [ERROR] testRestartOnRecoveryFailure(org.apache.flink.test.checkpointing.CheckpointStoreITCase)  Time elapsed: 1.138 s  <<< ERROR!
> May 17 22:29:11 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> May 17 22:29:11       at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> May 17 22:29:11       at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> May 17 22:29:11       at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> May 17 22:29:11       at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> May 17 22:29:11       at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1081)
> May 17 22:29:11       at akka.dispatch.OnComplete.internal(Future.scala:264)
> May 17 22:29:11       at akka.dispatch.OnComplete.internal(Future.scala:261)
> May 17 22:29:11       at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
> May 17 22:29:11       at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
> May 17 22:29:11       at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
> May 17 22:29:11       at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
> May 17 22:29:11       at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
> May 17 22:29:11       at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
> May 17 22:29:11       at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
> May 17 22:29:11       at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
