[ https://issues.apache.org/jira/browse/FLINK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717222#comment-14717222 ]
Stephan Ewen commented on FLINK-2586:
-------------------------------------
Okay, I see. Is there a way to check the status of a topology, that is, whether
it is still running or has failed and, if it failed, with what exception?
If that is possible, you can loop-sleep-wait until the job has failed, check
that the exception is the "SuccessException", and then validate the output in
the test driver. That should make the tests robust.
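
A minimal sketch of that loop-sleep-wait idea, in plain Java. Everything named
here is an assumption made for illustration: the SuccessException class, the
JobFailureWaiter helper, and the failureProbe supplier stand in for whatever
status check the topology client actually offers (if it offers one at all).
The timeout is only a safety net against hangs, not the success criterion.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical marker exception a job throws to signal "done, all is well". */
class SuccessException extends RuntimeException {
    SuccessException() { super("success"); }
}

/** Hypothetical test helper; not part of Flink or Storm. */
final class JobFailureWaiter {

    /**
     * Polls failureProbe until it reports a failure cause.
     * failureProbe is an assumed hook: empty while the job runs,
     * the failure cause once the job has failed.
     */
    static Throwable awaitFailure(Supplier<Optional<Throwable>> failureProbe,
                                  long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Optional<Throwable> failure = failureProbe.get();
            if (failure.isPresent()) {
                return failure.get();
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "job did not terminate within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    /** Walks the cause chain, since runtimes often wrap user exceptions. */
    static boolean isSuccess(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof SuccessException) {
                return true;
            }
        }
        return false;
    }
}
```

A test driver would call awaitFailure, fail the test unless isSuccess returns
true for the result, and only then validate the output.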
> Unstable Storm Compatibility Tests
> ----------------------------------
>
> Key: FLINK-2586
> URL: https://issues.apache.org/jira/browse/FLINK-2586
> Project: Flink
> Issue Type: Bug
> Components: Storm Compatibility
> Affects Versions: 0.10
> Reporter: Stephan Ewen
> Priority: Critical
> Fix For: 0.10
>
>
> The Storm Compatibility tests frequently fail.
> The reason is that they kill the topologies after a fixed time interval.
> That can fail on CI infrastructure when certain steps take longer than
> usual. Trying to guarantee progress by time is inherently problematic:
> - Waiting too briefly makes tests unstable
> - Waiting too long makes tests slow
> The right way to go is letting the program decide when to terminate, for
> example by throwing a special {{SuccessException}}.
> Have a look at the Kafka connector tests; they do this a lot and hence run
> exactly as long as they need to.
> Here is an example of a failed run:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/77499577/log.txt
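
The program-decides-when-to-terminate approach from the issue description could
look roughly like the following. This is a sketch under assumptions, not the
code used in the Kafka connector tests: TerminatingSink and its expectedCount
parameter are hypothetical names, and a plain java.util.function.Consumer
stands in for a real sink or bolt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical marker exception; the name follows the issue text. */
class SuccessException extends RuntimeException {
    SuccessException() { super("expected output received"); }
}

/** Hypothetical sink that ends the run as soon as it has seen enough records. */
final class TerminatingSink implements Consumer<String> {
    private final int expectedCount;
    private final List<String> received = new ArrayList<>();

    TerminatingSink(int expectedCount) {
        this.expectedCount = expectedCount;
    }

    @Override
    public void accept(String record) {
        received.add(record);
        // No timer involved: the job stops exactly when its work is done.
        if (received.size() >= expectedCount) {
            throw new SuccessException();
        }
    }

    /** Records collected so far, for validation in the test driver. */
    List<String> received() {
        return received;
    }
}
```

The run then takes as long as the data needs, whether the CI machine is fast
or slow, and the driver treats the SuccessException as the signal to start
validating what the sink received.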