[ https://issues.apache.org/jira/browse/FLINK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494482#comment-17494482 ]
Roman Khachatryan commented on FLINK-26154:
-------------------------------------------
From the log, I see that a savepoint was created and the job recovered from it
successfully.
But then cluster finalization was triggered and the job was cancelled:
{code}
01:17:29,185 [jobmanager-io-thread-1] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job c42be0d0def476a30c4294f4bf589a43 from Checkpoint 2 @ 0 for c42be0d0def476a30c4294f4bf589a43 located at file:/tmp/junit2973919816853164300.
...
01:17:29,579 [ Map (1/4)#0] INFO org.apache.flink.runtime.state.restore.FullSnapshotRestoreOperation [] - Finished restoring from state handle: KeyGroupsSavepointStateHandle{groupRangeOffsets=KeyGroupRangeOffsets{keyGroupRange=KeyGroupRange{startKeyGroup=0, endKeyGroup=31}, offsets=[369, 503, 845, 1057, 1321, 1637, 1823, 1983, 2325, 2537, 2697, 2961, 3251, 3437, 3571, 3939, 4229, 4363, 4601, 4761, 4999, 5107, 5241, 5453, 5587, 5825, 6011, 6275, 6617, 6907, 7145, 7409]}, stateHandle=RelativeFileStateHandle State: file:/tmp/junit2973919816853164300/d1989182-8776-4fde-aca0-79f579c8ae8b, d1989182-8776-4fde-aca0-79f579c8ae8b [7543 bytes]}.
...
01:17:29,699 [ main] INFO org.apache.flink.test.util.MiniClusterWithClientResource [] - Finalization triggered: Cluster shutdown is going to be initiated.
01:17:29,745 [flink-akka.actor.default-dispatcher-10] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job Flink Streaming Job (c42be0d0def476a30c4294f4bf589a43) switched from state RUNNING to CANCELLING.
01:17:29,745 [flink-akka.actor.default-dispatcher-10] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Sequence Source (1/4) (7dd2dcdb42fc95060c7280a1eea3105b) switched from RUNNING to CANCELING.
{code}
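
For context, the TimeoutException in the quoted report below is raised by a polling wait (waitUntilCondition, reached via waitForAllTaskRunning). The following is only a minimal sketch of how such a poll-with-deadline behaves when the awaited state is never reached. WaitSketch and waitUntil are hypothetical names introduced for illustration; this is not Flink's CommonTestUtils code, and the always-false condition is just a stand-in for "all tasks are RUNNING".

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitSketch {

    // Hypothetical helper (not Flink's CommonTestUtils): polls the condition
    // until it holds, or throws once the deadline has passed.
    static void waitUntil(BooleanSupplier condition, Duration timeout, Duration pause)
            throws TimeoutException, InterruptedException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Condition was not met in given timeout.");
            }
            Thread.sleep(pause.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in condition that never becomes true, e.g. tasks that never
        // (or no longer) report RUNNING while the test keeps polling.
        try {
            waitUntil(() -> false, Duration.ofMillis(200), Duration.ofMillis(20));
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

With a condition that can never hold, the wait can only end by hitting the deadline, which matches the shape of the failure in the stack trace; it says nothing about why the condition was not met in the actual test run.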
> SavepointFormatITCase times out on azure
> ----------------------------------------
>
> Key: FLINK-26154
> URL: https://issues.apache.org/jira/browse/FLINK-26154
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends, Tests
> Affects Versions: 1.15.0
> Reporter: Roman Khachatryan
> Priority: Blocker
> Fix For: 1.15.0
>
>
> Originally reported in FLINK-26144.
>
> [https://dev.azure.com/mapohl/flink/_build/results?buildId=738&view=logs&j=0a15d512-44ac-5ba5-97ab-13a5d066c22c&t=9a028d19-6c4b-5a4e-d378-03fca149d0b1&l=6340]
>
> {code}
> Feb 15 01:26:50 [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 591.027 s <<< FAILURE! - in org.apache.flink.test.checkpointing.SavepointFormatITCase
> Feb 15 01:26:50 [ERROR] org.apache.flink.test.checkpointing.SavepointFormatITCase.testTriggerSavepointAndResumeWithFileBasedCheckpointsAndRelocateBasePath(SavepointFormatType, StateBackendConfig)[2] Time elapsed: 261.901 s <<< ERROR!
> Feb 15 01:26:50 java.util.concurrent.TimeoutException: Condition was not met in given timeout.
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:166)
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:144)
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:136)
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitForAllTaskRunning(CommonTestUtils.java:210)
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitForAllTaskRunning(CommonTestUtils.java:184)
> Feb 15 01:26:50 at org.apache.flink.runtime.testutils.CommonTestUtils.waitForAllTaskRunning(CommonTestUtils.java:172)
> Feb 15 01:26:50 at org.apache.flink.test.checkpointing.SavepointFormatITCase.relocateAndVerify(SavepointFormatITCase.java:306)
> Feb 15 01:26:50 at org.apache.flink.test.checkpointing.SavepointFormatITCase.testTriggerSavepointAndResumeWithFileBasedCheckpointsAndRelocateBasePath(SavepointFormatITCase.java:260)
> Feb 15 01:26:50 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Feb 15 01:26:50 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Feb 15 01:26:50 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Feb 15 01:26:50 at java.lang.reflect.Method.invoke(Method.java:498)
> Feb 15 01:26:50 at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> Feb 15 01:26:50 at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
> Feb 15 01:26:50 at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
> Feb 15 01:26:50 at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestTemplateMethod(TimeoutExtension.java:92)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
> Feb 15 01:26:50 at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
> {code}
>