[ https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322884#comment-14322884 ]
ASF GitHub Bot commented on FLINK-1556:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/406

    [FLINK-1556] Corrects faulty JobClient behaviour in case of a submission failure

    If an error occurs during job submission, a ```SubmissionFailure``` is sent to the ```JobClient```. Previously, the ```JobClient``` reacted by terminating itself and sending the failure to the ```Client```. However, a submission failure does not necessarily mean that the job has reached a terminal state, because the failing procedure is executed asynchronously. The ```JobClient``` now waits until it receives a ```JobResult``` message indicating that the job has completed and all resources are properly returned.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink minorFixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/406.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #406

----
commit 2f32e9c6b87b8e295f792c04306d78fbb858f80d
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2015-02-16T09:17:21Z

    [FLINK-1556] [runtime] Corrects faulty JobClient behaviour in case of a submission failure

----

> JobClient does not wait until a job failed completely if submission exception
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-1556
>                 URL: https://issues.apache.org/jira/browse/FLINK-1556
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
>
> If an exception occurs during job submission, the {{JobClient}} receives a
> {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}}
> terminates itself and returns the error to the {{Client}}. This indicates to
> the user that the job has failed completely, which is not necessarily
> true.
> If the user submits another job directly after such a failure, some slots of
> the previously failed job may not have been returned yet. This can lead to a
> {{NoRessourceAvailableException}}.
> We can solve this problem by waiting for the completion of the job failure in
> the {{JobClient}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
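
For readers following the thread, the message handling described in the pull request can be sketched roughly as below. This is only an illustrative Python model, not the actual Flink code (the real JobClient is an actor in the Flink runtime). The class `JobClientSketch`, its fields, and the fields of the two message types are hypothetical names chosen for the example; only the message names `SubmissionFailure` and `JobResult` come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmissionFailure:
    """Hypothetical message: submission failed, job cleanup may still be running."""
    cause: str


@dataclass
class JobResult:
    """Hypothetical message: the job reached a terminal state, resources are returned."""
    success: bool


class JobClientSketch:
    """Illustrative stand-in for the JobClient; not Flink's real class."""

    def __init__(self) -> None:
        self.pending_failure: Optional[str] = None  # remembered submission error
        self.terminated = False
        self.reply: Optional[str] = None  # what is eventually reported to the Client

    def receive(self, message) -> None:
        if self.terminated:
            return  # ignore anything that arrives after termination
        if isinstance(message, SubmissionFailure):
            # Old behaviour would terminate and reply here.
            # New behaviour: remember the error and keep waiting.
            self.pending_failure = message.cause
        elif isinstance(message, JobResult):
            # Only now is the job in a terminal state, so it is safe to reply.
            if self.pending_failure is not None:
                self.reply = f"failed: {self.pending_failure}"
            elif message.success:
                self.reply = "finished"
            else:
                self.reply = "failed"
            self.terminated = True
```

The point of the sketch is the ordering: the reply to the Client is deferred until the terminal `JobResult` arrives, so a job submitted immediately afterwards should no longer race with the slot cleanup of the failed one.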