[ 
https://issues.apache.org/jira/browse/FLINK-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907096#comment-16907096
 ] 

Till Rohrmann commented on FLINK-11835:
---------------------------------------

Florian's update is as follows: he created a Git branch that makes the 
problem reproducible: 
https://github.com/florianschmidt1994/flink/tree/detect-zookeeper-it-case-bug. 
In particular, if one lets the thread sleep in {{JobManagerRunner::closeAsync}} 
(line 192 ff.), the problem occurs.

The problem occurs in the second iteration when the test is run repeatedly in 
a loop. The {{Dispatcher}} seems to request the {{JobStatus}} of a job that no 
longer exists (for a reason that is not yet clear). The {{JobStatus}} future is 
then completed exceptionally at {{Dispatcher.java:817}} because the 
{{JobManagerFuture}} for the given {{JobID}} is no longer in the 
{{JobManagerFutures}} collection.
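
To illustrate the failure mode described above, here is a minimal, 
self-contained sketch. It is not the actual {{Dispatcher}} code: the class 
{{JobStatusLookupSketch}}, the exception {{JobNotFoundException}}, the 
String-based job IDs and statuses, and all method names are hypothetical 
stand-ins. It only shows how a status request ends up with an exceptionally 
completed future once the entry for the job has been removed from the 
collection.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;

// Hypothetical sketch, NOT Flink's Dispatcher: all names here are assumptions.
class JobStatusLookupSketch {

    // Hypothetical stand-in for the exception seen in the test log.
    static class JobNotFoundException extends Exception {
        JobNotFoundException(String jobId) {
            super("Could not find Flink job (" + jobId + ")");
        }
    }

    // Stand-in for the per-job futures collection, keyed by job ID.
    private final Map<String, CompletableFuture<String>> jobManagerFutures =
            new ConcurrentHashMap<>();

    // Registers a job whose status is already known.
    void register(String jobId, String status) {
        jobManagerFutures.put(jobId, CompletableFuture.completedFuture(status));
    }

    // Simulates the job's entry being removed, e.g. while the runner closes.
    void remove(String jobId) {
        jobManagerFutures.remove(jobId);
    }

    // Returns the job's status future, or a future that is completed
    // exceptionally if no entry exists for the given job ID.
    CompletableFuture<String> requestJobStatus(String jobId) {
        CompletableFuture<String> statusFuture = jobManagerFutures.get(jobId);
        if (statusFuture == null) {
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(new JobNotFoundException(jobId));
            return failed;
        }
        return statusFuture;
    }

    public static void main(String[] args) throws Exception {
        JobStatusLookupSketch dispatcher = new JobStatusLookupSketch();
        dispatcher.register("job-1", "RUNNING");
        System.out.println(dispatcher.requestJobStatus("job-1").get());

        // After the entry is gone, the same request fails.
        dispatcher.remove("job-1");
        try {
            dispatcher.requestJobStatus("job-1").get();
        } catch (ExecutionException e) {
            // get() wraps the cause in an ExecutionException.
            System.out.println(e.getCause().getMessage());
        }
    }
}
```

In this sketch the caller sees an {{ExecutionException}} whose cause is the 
"not found" exception, which has the same shape as the stack trace in the 
issue description.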

> ZooKeeperLeaderElectionITCase#testJobExecutionOnClusterWithLeaderChange failed
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-11835
>                 URL: https://issues.apache.org/jira/browse/FLINK-11835
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.8.0
>            Reporter: Gary Yao
>            Assignee: Chesnay Schepler
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.10.0
>
>         Attachments: scratch_22.txt
>
>
> {noformat}
> 20:44:07.264 [ERROR] 
> testJobExecutionOnClusterWithLeaderChange(org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase)
>   Time elapsed: 4.625 s  <<< ERROR!
> java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find 
> Flink job (2e957dc4f49feaed042eb8b4a7932610)
>       at 
> org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange(ZooKeeperLeaderElectionITCase.java:152)
> Caused by: org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could 
> not find Flink job (2e957dc4f49feaed042eb8b4a7932610)
>       at 
> org.apache.flink.test.runtime.leaderelection.ZooKeeperLeaderElectionITCase.testJobExecutionOnClusterWithLeaderChange(ZooKeeperLeaderElectionITCase.java:149)
> {noformat}
> https://api.travis-ci.org/v3/job/502210892/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
