[ https://issues.apache.org/jira/browse/FLINK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235861#comment-17235861 ]
jiawen xiao commented on FLINK-19775:
-------------------------------------

Hi [~dian.fu], [~trohrmann],

After some recent research, I believe this issue is similar to https://issues.apache.org/jira/browse/FLINK-6571. Below I summarize everyone's descriptions together with my own thoughts.

First, this is a whole class of problem, and it will keep causing instability. It may be time to find out why the main thread is interrupted by another thread.

Second, assume that while the main thread is blocked in lock.wait(), the child thread that calls latch.trigger() has not yet been scheduled by the CPU, so triggered is still false. If the main thread is interrupted by another thread at that point, the await() method ends up in a hot loop and throws InterruptedException continuously.

Finally, I think that simply catching the exception cannot solve the instability. We should also set the loop flag to false when we catch the exception.

Do you have any suggestions?

> SystemProcessingTimeServiceTest.testImmediateShutdown is instable
> -----------------------------------------------------------------
>
>                 Key: FLINK-19775
>                 URL: https://issues.apache.org/jira/browse/FLINK-19775
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.11.0
>            Reporter: Dian Fu
>            Assignee: jiawen xiao
>            Priority: Major
>              Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8131&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=66b5c59a-0094-561d-0e44-b149dfdd586d
> {code}
> 2020-10-22T21:12:54.9462382Z [ERROR] testImmediateShutdown(org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeServiceTest)  Time elapsed: 0.009 s  <<< ERROR!
> 2020-10-22T21:12:54.9463024Z java.lang.InterruptedException
> 2020-10-22T21:12:54.9463331Z 	at java.lang.Object.wait(Native Method)
> 2020-10-22T21:12:54.9463766Z 	at java.lang.Object.wait(Object.java:502)
> 2020-10-22T21:12:54.9464140Z 	at org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:63)
> 2020-10-22T21:12:54.9466014Z 	at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeServiceTest.testImmediateShutdown(SystemProcessingTimeServiceTest.java:154)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
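
The last point of the comment (set the loop flag to false when the exception is caught) could look roughly like the sketch below. It assumes a latch shaped the way the comment describes: a lock object, a triggered flag, and a wait loop. SketchLatch, awaitUnlessInterrupted and keepWaiting are hypothetical names chosen for illustration; this is not the source of Flink's OneShotLatch and not a proposed patch.

```java
// Hypothetical stand-in for a one-shot latch; NOT Flink's OneShotLatch source.
final class SketchLatch {
    private final Object lock = new Object();
    private boolean triggered;

    /** Marks the latch as triggered and wakes up every waiting thread. */
    void trigger() {
        synchronized (lock) {
            triggered = true;
            lock.notifyAll();
        }
    }

    /**
     * Waits until trigger() has been called or the calling thread is interrupted.
     * Returns true if the latch was triggered, false if the wait ended early
     * because of an interrupt.
     */
    boolean awaitUnlessInterrupted() {
        synchronized (lock) {
            boolean keepWaiting = true; // the "loop flag" from the comment (assumed name)
            while (keepWaiting && !triggered) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    // Leave the loop instead of retrying: once the interrupt
                    // status is restored, another wait() would throw at once.
                    keepWaiting = false;
                    Thread.currentThread().interrupt();
                }
            }
            return triggered;
        }
    }
}
```

With this shape an interrupted waiter returns false once and keeps its interrupt status, so the caller decides what to do next. Whether the test should tolerate the interrupt at all is a separate question that the sketch does not answer.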