[ 
https://issues.apache.org/jira/browse/FLINK-38223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18041245#comment-18041245
 ] 

Rui Fan commented on FLINK-38223:
---------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=71115&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=1ffc5ec2-7913-50ff-0177-3fca16f1b8f0&l=72092

> ExecutionGraphRestartTest and ExecutionGraphCoLocationRestartTest are flaky 
> on master
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-38223
>                 URL: https://issues.apache.org/jira/browse/FLINK-38223
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.1.0
>            Reporter: Gustavo de Morais
>            Assignee: Fabian Paul
>            Priority: Major
>             Fix For: 2.3.0
>
>
> Both of these suites are very flaky on master. Tests such as 
> testConstraintsAfterRestart and testCancelWhileFailing constantly fail 
> in CI pipelines with errors like the ones below.
> You can reproduce the failures by running the suites locally.
> {code:java}
> Aug 11 00:04:37 00:04:37.047 [ERROR] Errors: 
> Aug 11 00:04:37 00:04:37.047 [ERROR]   ExecutionGraphCoLocationRestartTest.testConstraintsAfterRestart:113 » Timeout Not all executions fulfilled the predicate in time.
> {code}
> {code:java}
> org.opentest4j.AssertionFailedError: expected: RUNNING but was: FAILING
> Expected :RUNNING
> Actual   :FAILING
>     at org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testCancelWhileFailing(ExecutionGraphRestartTest.java:217)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>     at java.base/java.util.concurrent.ForkJoinTask.doExec$$$capture(ForkJoinTask.java:373)
>     at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java)
>     at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
>     at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
>     at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
>     at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
>     Suppressed: java.lang.IllegalStateException: Free slot must not be used.
>         at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
>         at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.releaseSlots(DefaultDeclarativeSlotPool.java:564)
>         at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.freeAndReleaseSlots(DefaultDeclarativeSlotPool.java:507)
>         at org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool.releaseSlots(DefaultDeclarativeSlotPool.java:477)
>         at org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolService.internalReleaseTaskManager(DeclarativeSlotPoolService.java:281)
>         at org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolService.releaseAllTaskManagers(DeclarativeSlotPoolService.java:271)
>         at org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolService.close(DeclarativeSlotPoolService.java:160)
>         at org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testCancelWhileFailing(ExecutionGraphRestartTest.java:200)
>         ... 7 more
>  {code}
> {code:java}
> java.util.concurrent.TimeoutException: Not all executions fulfilled the predicate in time.
>     at org.apache.flink.runtime.executiongraph.ExecutionGraphTestUtils.waitForAllExecutionsPredicate(ExecutionGraphTestUtils.java:203)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraphCoLocationRestartTest.testConstraintsAfterRestart(ExecutionGraphCoLocationRestartTest.java:113)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>     at java.base/java.util.concurrent.ForkJoinTask.doExec$$$capture(ForkJoinTask.java:373)
>     at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java)
>     at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
>     at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
>     at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
>     at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
>  {code}
> Example CI link:
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=69283&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=1ffc5ec2-7913-50ff-0177-3fca16f1b8f0]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
