[
https://issues.apache.org/jira/browse/FLINK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440224#comment-17440224
]
Zhu Zhu commented on FLINK-24793:
---------------------------------
This IT case fails when it is run with AdaptiveScheduler. The cause of the
exception is that execution history is missing in AdaptiveScheduler. When a
restart happens, a new execution graph is generated: the attempt number is
retained, but the prior executions are not inherited from the previous
execution graph.
Even if the problem above is solved, the case will still fail because local
recovery is not supported by AdaptiveScheduler yet (see FLINK-21450).
So for now I will annotate the tests with {{FailsWithAdaptiveScheduler}} so
that they can be skipped when testing AdaptiveScheduler.
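
To illustrate the missing-history problem described above, here is a minimal,
self-contained sketch. It is a simplified model and not Flink's actual code:
{{VertexSketch}} and its methods are hypothetical stand-ins, and only the
exception message is taken from the stack trace in this issue. It assumes that
a restart within the same graph archives the old attempt, while a rebuilt graph
carries over the attempt number alone.

```java
import java.util.ArrayList;
import java.util.List;

final class PriorAttemptSketch {

    /** Hypothetical stand-in for an execution vertex; not Flink's real class. */
    static final class VertexSketch {
        private final List<String> priorExecutions = new ArrayList<>();
        private int currentAttemptNumber;

        VertexSketch(int initialAttemptNumber) {
            this.currentAttemptNumber = initialAttemptNumber;
        }

        /** Restart within the same graph: archive the current attempt, then bump the number. */
        void resetForNewExecution() {
            priorExecutions.add("attempt-" + currentAttemptNumber);
            currentAttemptNumber++;
        }

        int getCurrentAttemptNumber() {
            return currentAttemptNumber;
        }

        /** Looks up an archived attempt; fails if the history does not contain it. */
        String getPriorExecutionAttempt(int attemptNumber) {
            if (attemptNumber >= 0 && attemptNumber < priorExecutions.size()) {
                return priorExecutions.get(attemptNumber);
            }
            throw new IllegalArgumentException("attempt does not exist");
        }
    }

    /** Returns true if the given prior attempt can be looked up on the vertex. */
    static boolean hasPriorAttempt(VertexSketch vertex, int attemptNumber) {
        try {
            vertex.getPriorExecutionAttempt(attemptNumber);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed "same graph" restart: the vertex is reset in place, history is kept.
        VertexSketch sameGraph = new VertexSketch(0);
        sameGraph.resetForNewExecution();
        System.out.println("same graph, prior attempt 0 present: "
                + hasPriorAttempt(sameGraph, 0));

        // Assumed "rebuilt graph" restart: a fresh vertex is created with the
        // retained attempt number, but no prior executions are copied over.
        VertexSketch rebuiltGraph = new VertexSketch(sameGraph.getCurrentAttemptNumber());
        System.out.println("rebuilt graph, prior attempt 0 present: "
                + hasPriorAttempt(rebuiltGraph, 0));
    }
}
```

In this model, a test that reads attempt 0 from the archived graph after a
restart works in the first case and hits the {{IllegalArgumentException}} in
the second, even though both vertices report the same current attempt number.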
> DefaultSchedulerLocalRecoveryITCase fails on AZP
> ------------------------------------------------
>
> Key: FLINK-24793
> URL: https://issues.apache.org/jira/browse/FLINK-24793
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Till Rohrmann
> Assignee: Zhu Zhu
> Priority: Critical
> Labels: pull-request-available, test-stability
> Fix For: 1.15.0
>
>
> {{DefaultSchedulerLocalRecoveryITCase.testLocalRecoveryFull}} and
> {{DefaultSchedulerLocalRecoveryITCase.testLocalRecoveryRegion}} fail on AZP
> with:
> {code}
> Nov 04 23:01:32 java.lang.IllegalArgumentException: attempt does not exist
> Nov 04 23:01:32 at
> org.apache.flink.runtime.executiongraph.ArchivedExecutionVertex.getPriorExecutionAttempt(ArchivedExecutionVertex.java:109)
> Nov 04 23:01:32 at
> org.apache.flink.test.runtime.DefaultSchedulerLocalRecoveryITCase.assertNonLocalRecoveredTasksEquals(DefaultSchedulerLocalRecoveryITCase.java:92)
> Nov 04 23:01:32 at
> org.apache.flink.test.runtime.DefaultSchedulerLocalRecoveryITCase.testLocalRecoveryInternal(DefaultSchedulerLocalRecoveryITCase.java:80)
> Nov 04 23:01:32 at
> org.apache.flink.test.runtime.DefaultSchedulerLocalRecoveryITCase.testLocalRecoveryFull(DefaultSchedulerLocalRecoveryITCase.java:65)
> Nov 04 23:01:32 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> Nov 04 23:01:32 at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Nov 04 23:01:32 at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Nov 04 23:01:32 at java.lang.reflect.Method.invoke(Method.java:498)
> Nov 04 23:01:32 at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Nov 04 23:01:32 at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Nov 04 23:01:32 at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Nov 04 23:01:32 at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Nov 04 23:01:32 at
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Nov 04 23:01:32 at
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Nov 04 23:01:32 at
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Nov 04 23:01:32 at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Nov 04 23:01:32 at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Nov 04 23:01:32 at
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Nov 04 23:01:32 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Nov 04 23:01:32 at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Nov 04 23:01:32 at
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Nov 04 23:01:32 at
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Nov 04 23:01:32 at
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Nov 04 23:01:32 at
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Nov 04 23:01:32 at
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Nov 04 23:01:32 at
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Nov 04 23:01:32 at
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> Nov 04 23:01:32 at
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> Nov 04 23:01:32 at
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> Nov 04 23:01:32 at
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> Nov 04 23:01:32 at
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=25983&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=4451