[
https://issues.apache.org/jira/browse/FLINK-23492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389948#comment-17389948
]
Nicolaus Weidner commented on FLINK-23492:
------------------------------------------
Looking into this a bit, I suspect a race condition: the test sets a cleanup
interval of 10ms. That is probably low enough that, sporadically, the result is
evicted between the check that it is available and the call that actually
fetches it. According to the docs of the cache being used, elements are removed
automatically once the interval has elapsed. The code also contains explicit
cleanup calls, though I am not sure what purpose they serve (the tests pass
without them).
I experimented locally and could only reproduce this failure with the cleanup
interval set to 1ms, but then it fails fairly consistently.
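To illustrate the suspected check-then-fetch race, here is a minimal sketch. It
does not use the actual Flink classes: ExpiringCache, checkThenFetch, fetchOnce
and the injected clock are all hypothetical names made up for this example, and
the cache is a simplified stand-in that expires entries a fixed time after they
are written. The clock is advanced by hand so that the interleaving is
deterministic rather than timing-dependent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical illustration only; not the code under test in Flink. */
public class ExpiringCacheRace {

    /** Simplified cache: an entry expires ttlMillis after it was written. */
    static final class ExpiringCache<K, V> {
        private static final class Entry<V> {
            final V value;
            final long writtenAtMillis;

            Entry(V value, long writtenAtMillis) {
                this.value = value;
                this.writtenAtMillis = writtenAtMillis;
            }
        }

        private final Map<K, Entry<V>> entries = new HashMap<>();
        private final long ttlMillis;
        private final LongSupplier clock; // injected so the example controls time

        ExpiringCache(long ttlMillis, LongSupplier clock) {
            this.ttlMillis = ttlMillis;
            this.clock = clock;
        }

        void put(K key, V value) {
            entries.put(key, new Entry<>(value, clock.getAsLong()));
        }

        /** Returns the value, or null if it is absent or has expired. */
        V getIfPresent(K key) {
            Entry<V> entry = entries.get(key);
            if (entry == null) {
                return null;
            }
            if (clock.getAsLong() - entry.writtenAtMillis >= ttlMillis) {
                entries.remove(key);
                return null;
            }
            return entry.value;
        }

        boolean contains(K key) {
            return getIfPresent(key) != null;
        }
    }

    /** Racy pattern: the entry can expire between the check and the fetch. */
    static String checkThenFetch(
            ExpiringCache<String, String> cache, String key, Runnable delay) {
        if (cache.contains(key)) {
            delay.run(); // stands in for a scheduling delay between the two calls
            return cache.getIfPresent(key); // may already be null here
        }
        return null;
    }

    /** Single lookup: the caller handles null instead of trusting a prior check. */
    static String fetchOnce(ExpiringCache<String, String> cache, String key) {
        return cache.getIfPresent(key);
    }

    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0);
        ExpiringCache<String, String> cache = new ExpiringCache<>(10, now::get);

        cache.put("vertex", "stats");
        // The simulated delay is as long as the 10ms interval, so the entry is gone.
        String racy = checkThenFetch(cache, "vertex", () -> now.addAndGet(10));
        System.out.println("check-then-fetch returned: " + racy);

        cache.put("vertex", "stats");
        String single = fetchOnce(cache, "vertex");
        System.out.println("single fetch returned: " + single);
    }
}
```

In this sketch the availability check succeeds but the subsequent fetch comes
back empty, which matches the kind of sporadic assertion failure described
above. A longer cleanup interval in the test, or a single lookup whose result
is checked directly, would narrow or avoid that window.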
> JobVertexThreadInfoTrackerTest.testCachedStatsCleanedAfterCleanupInterval
> fails on Azure
> ----------------------------------------------------------------------------------------
>
> Key: FLINK-23492
> URL: https://issues.apache.org/jira/browse/FLINK-23492
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.13.1
> Reporter: Dawid Wysakowicz
> Assignee: Nicolaus Weidner
> Priority: Major
> Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20898&view=logs&j=34f41360-6c0d-54d3-11a1-0292a2def1d9&t=2d56e022-1ace-542f-bf1a-b37dd63243f2&l=7017
> {code}
> Jul 23 22:19:29 [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0,
> Time elapsed: 1.967 s <<< FAILURE! - in
> org.apache.flink.runtime.webmonitor.threadinfo.JobVertexThreadInfoTrackerTest
> Jul 23 22:19:29 [ERROR]
> testCachedStatsCleanedAfterCleanupInterval(org.apache.flink.runtime.webmonitor.threadinfo.JobVertexThreadInfoTrackerTest)
> Time elapsed: 0.024 s <<< FAILURE!
> Jul 23 22:19:29 java.lang.AssertionError
> Jul 23 22:19:29 at org.junit.Assert.fail(Assert.java:86)
> Jul 23 22:19:29 at org.junit.Assert.assertTrue(Assert.java:41)
> Jul 23 22:19:29 at org.junit.Assert.assertTrue(Assert.java:52)
> Jul 23 22:19:29 at
> org.apache.flink.runtime.webmonitor.threadinfo.JobVertexThreadInfoTrackerTest.assertExpectedEqualsReceived(JobVertexThreadInfoTrackerTest.java:231)
> Jul 23 22:19:29 at
> org.apache.flink.runtime.webmonitor.threadinfo.JobVertexThreadInfoTrackerTest.doInitialRequestAndVerifyResult(JobVertexThreadInfoTrackerTest.java:224)
> Jul 23 22:19:29 at
> org.apache.flink.runtime.webmonitor.threadinfo.JobVertexThreadInfoTrackerTest.testCachedStatsCleanedAfterCleanupInterval(JobVertexThreadInfoTrackerTest.java:178)
> Jul 23 22:19:29 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> Jul 23 22:19:29 at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 23 22:19:29 at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 23 22:19:29 at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 23 22:19:29 at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> Jul 23 22:19:29 at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Jul 23 22:19:29 at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> Jul 23 22:19:29 at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Jul 23 22:19:29 at
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> Jul 23 22:19:29 at
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> Jul 23 22:19:29 at
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Jul 23 22:19:29 at java.lang.Thread.run(Thread.java:748)
> {code}