[
https://issues.apache.org/jira/browse/FLINK-20816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312527#comment-17312527
]
Matthias commented on FLINK-20816:
----------------------------------
I was able to reproduce the timeout on AzureCI, but diffing the log files of
the successful and the failed run did not reveal the cause. In this specific
case, it looks like the wait for the abort notification of checkpoint #2 in
the DeclineSink is never triggered. [~yunta], are you able to get anything
more out of the logs?
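
To illustrate the suspected failure mode: the stack trace below shows the test
blocked in OneShotLatch.await, which suggests the latch was never triggered.
The following is only a minimal sketch of that pattern, not the code of the
test or of Flink's OneShotLatch. SimpleLatch and AbortLatchSketch are
hypothetical names introduced here, and the bounded wait is an assumption
added so the sketch terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;

public class AbortLatchSketch {

    /** Hypothetical stand-in for a one-shot latch; NOT Flink's OneShotLatch. */
    static final class SimpleLatch {
        private final Object lock = new Object();
        private boolean triggered;

        /** Marks the latch as triggered and wakes up all waiting threads. */
        void trigger() {
            synchronized (lock) {
                triggered = true;
                lock.notifyAll();
            }
        }

        /** Waits until triggered or until the timeout elapses; returns true if triggered. */
        boolean await(long timeout, TimeUnit unit) throws InterruptedException {
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            synchronized (lock) {
                while (!triggered) {
                    long remainingNanos = deadline - System.nanoTime();
                    if (remainingNanos <= 0) {
                        return false; // nobody called trigger() in time
                    }
                    // remainingNanos > 0 here, so this never degenerates to wait(0, 0),
                    // which would block without a time limit.
                    lock.wait(remainingNanos / 1_000_000, (int) (remainingNanos % 1_000_000));
                }
                return true;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        SimpleLatch notified = new SimpleLatch();
        SimpleLatch neverNotified = new SimpleLatch();

        // Simulates an operator that does receive the abort notification.
        Thread notifier = new Thread(notified::trigger);
        notifier.start();

        System.out.println(notified.await(5, TimeUnit.SECONDS));
        // Simulates the suspected failure case: the notification never arrives.
        System.out.println(neverNotified.await(200, TimeUnit.MILLISECONDS));
        notifier.join();
    }
}
```

With an unbounded wait instead of the timed one, the second call would block
until the JUnit timeout fires, which matches what the trace shows.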
> NotifyCheckpointAbortedITCase failed due to timeout
> ---------------------------------------------------
>
> Key: FLINK-20816
> URL: https://issues.apache.org/jira/browse/FLINK-20816
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.12.2, 1.13.0
> Reporter: Matthias
> Assignee: Matthias
> Priority: Critical
> Labels: test-stability
> Fix For: 1.13.0
>
> Attachments: flink-20816-failure.log, flink-20816-success.log
>
>
> [This
> build|https://dev.azure.com/mapohl/flink/_build/results?buildId=152&view=logs&j=0a15d512-44ac-5ba5-97ab-13a5d066c22c&t=634cd701-c189-5dff-24cb-606ed884db87&l=4245]
> failed because {{NotifyCheckpointAbortedITCase}} timed out.
> {code}
> 2020-12-29T21:48:40.9430511Z [INFO] Running
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase
> 2020-12-29T21:50:28.0087043Z [ERROR] Tests run: 2, Failures: 0, Errors: 1,
> Skipped: 0, Time elapsed: 107.062 s <<< FAILURE! - in
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase
> 2020-12-29T21:50:28.0087961Z [ERROR]
> testNotifyCheckpointAborted[unalignedCheckpointEnabled
> =true](org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase)
> Time elapsed: 104.044 s <<< ERROR!
> 2020-12-29T21:50:28.0088619Z org.junit.runners.model.TestTimedOutException:
> test timed out after 100000 milliseconds
> 2020-12-29T21:50:28.0088972Z at java.lang.Object.wait(Native Method)
> 2020-12-29T21:50:28.0089267Z at java.lang.Object.wait(Object.java:502)
> 2020-12-29T21:50:28.0089633Z at
> org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:61)
> 2020-12-29T21:50:28.0090458Z at
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase.verifyAllOperatorsNotifyAborted(NotifyCheckpointAbortedITCase.java:200)
> 2020-12-29T21:50:28.0091313Z at
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase.testNotifyCheckpointAborted(NotifyCheckpointAbortedITCase.java:183)
> 2020-12-29T21:50:28.0091819Z at
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-12-29T21:50:28.0092199Z at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-12-29T21:50:28.0092675Z at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-12-29T21:50:28.0093095Z at
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-12-29T21:50:28.0093495Z at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-12-29T21:50:28.0093980Z at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-12-29T21:50:28.0094444Z at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-12-29T21:50:28.0094917Z at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-12-29T21:50:28.0095663Z at
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 2020-12-29T21:50:28.0096221Z at
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 2020-12-29T21:50:28.0096675Z at
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-12-29T21:50:28.0097022Z at java.lang.Thread.run(Thread.java:748)
> {code}
> The branch contained changes from FLINK-20594 and FLINK-20595. These issues
> remove code that is no longer used and should only have had effects on unit
> tests. [The previous
> build|https://dev.azure.com/mapohl/flink/_build/results?buildId=151&view=results]
> containing all the changes except for
> [9c57c37|https://github.com/XComp/flink/commit/9c57c37c50733a1f592a4fc5e492b22be80d8279]
> passed.