[
https://issues.apache.org/jira/browse/FLINK-12048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804019#comment-16804019
]
TisonKun commented on FLINK-12048:
----------------------------------
I think this happens because, after FLINK-11718, when we start a dispatcher the
{{SubmittedJobGraphStore}} is only started once {{#onStart}} is called in the
main thread.
For this test we need to find a way to explicitly wait for the dispatcher to
be started.
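A minimal sketch of such an explicit wait, assuming the component can expose a future that completes once its start logic has run on its main thread. {{SketchDispatcher}}, {{getStartedFuture}} and {{isStoreStarted}} are hypothetical names for illustration, not Flink's API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a dispatcher whose start logic runs asynchronously. */
final class SketchDispatcher {
    // Single thread standing in for the endpoint's main thread.
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
    private final CompletableFuture<Void> startedFuture = new CompletableFuture<>();
    private volatile boolean storeStarted;

    /** Returns immediately; the actual start logic runs later on the main thread. */
    void start() {
        mainThread.execute(() -> {
            storeStarted = true;          // stands in for starting the job graph store
            startedFuture.complete(null); // signal that start-up has finished
        });
    }

    /** Completes once the start logic has run. */
    CompletableFuture<Void> getStartedFuture() {
        return startedFuture;
    }

    boolean isStoreStarted() {
        return storeStarted;
    }

    void shutdown() {
        mainThread.shutdownNow();
    }
}
```

A test would call {{start()}}, then block on {{getStartedFuture()}} (ideally with a timeout) before triggering anything that depends on the store being started.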
For production code we need to move {{pathCache.getListenable().addListener(new
SubmittedJobGraphsPathCacheListener());}} to
{{ZooKeeperSubmittedJobGraphStore#start}} so that {{#onAddedJobGraph}} isn't
called before the {{ZooKeeperSubmittedJobGraphStore}} has started.
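The effect of that move can be sketched as follows. This is only an illustration under simplifying assumptions: {{FakePathCache}} and {{SketchJobGraphStore}} are hypothetical stand-ins, not the Curator or Flink classes. The point is that no listener is registered in the constructor, so events that arrive before {{start()}} never reach {{onAddedJobGraph}}:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the ZooKeeper path cache; not Curator's API. */
final class FakePathCache {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Simulates a "child added" event for the given job id. */
    void fireChildAdded(String jobId) {
        for (Consumer<String> listener : listeners) {
            listener.accept(jobId);
        }
    }
}

/** Hypothetical, heavily simplified job graph store. */
final class SketchJobGraphStore {
    private final FakePathCache pathCache;
    private final List<String> addedJobs = new ArrayList<>();
    private boolean running;

    SketchJobGraphStore(FakePathCache pathCache) {
        this.pathCache = pathCache;
        // Deliberately no listener registration here.
    }

    void start() {
        if (!running) {
            running = true;
            // Registering only now means callbacks cannot fire before start().
            pathCache.addListener(this::onAddedJobGraph);
        }
    }

    private void onAddedJobGraph(String jobId) {
        if (!running) {
            throw new IllegalStateException("Not running. Forgot to call start()?");
        }
        addedJobs.add(jobId);
    }

    List<String> addedJobs() {
        return addedJobs;
    }
}
```

In this sketch an event fired before {{start()}} is simply not observed by the store, instead of failing with the {{IllegalStateException}} seen in the test. Whether events from before the start need to be recovered separately is outside the scope of the sketch.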
Beyond that, I personally think we should push FLINK-10333 forward to remove
{{SubmittedJobGraphListener}}.
cc [~till.rohrmann]
> ZooKeeperHADispatcherTest failed on Travis
> ------------------------------------------
>
> Key: FLINK-12048
> URL: https://issues.apache.org/jira/browse/FLINK-12048
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination, Tests
> Affects Versions: 1.9.0
> Reporter: Chesnay Schepler
> Priority: Critical
> Labels: test-stability
>
> https://travis-ci.org/apache/flink/builds/512077301
> {code}
> 01:14:56.351 [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time
> elapsed: 9.671 s <<< FAILURE! - in
> org.apache.flink.runtime.dispatcher.ZooKeeperHADispatcherTest
> 01:14:56.364 [ERROR]
> testStandbyDispatcherJobExecution(org.apache.flink.runtime.dispatcher.ZooKeeperHADispatcherTest)
> Time elapsed: 1.209 s <<< ERROR!
> org.apache.flink.runtime.util.TestingFatalErrorHandler$TestingException:
> org.apache.flink.runtime.dispatcher.DispatcherException: Could not start the
> added job d51eeb908f360e44c0a2004e00a6afd2
> at
> org.apache.flink.runtime.dispatcher.ZooKeeperHADispatcherTest.teardown(ZooKeeperHADispatcherTest.java:117)
> Caused by: org.apache.flink.runtime.dispatcher.DispatcherException: Could not
> start the added job d51eeb908f360e44c0a2004e00a6afd2
> Caused by: java.lang.IllegalStateException: Not running. Forgot to call
> start()?
> {code}