[jira] [Commented] (FLINK-14048) Flink client hangs after trying to kill Yarn Job during deployment

TisonKun (Jira) Wed, 11 Sep 2019 00:14:30 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927308#comment-16927308
 ]


TisonKun commented on FLINK-14048:
----------------------------------

[~gyfora] did you notice this problem when deploying a per-job cluster? I 
found the relevant code snippet in {{CliFrontend#runProgram}}, and it seems 
that when an exception is thrown (in this case, one caused by a signal) we 
don't close the {{ClusterClient}} properly. But that should only happen in 
per-job mode.
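
To illustrate the kind of fix I have in mind, here is a minimal sketch. It does 
not use Flink's actual classes: {{FakeClusterClient}} and 
{{submitAndReportClosed}} are hypothetical names made up for this example, and 
the real change in {{CliFrontend#runProgram}} may look different. The idea is 
only that the client gets closed even when the submission throws.

```java
public class ClientCloseSketch {

    /** Hypothetical stand-in for a cluster client; NOT Flink's ClusterClient. */
    static class FakeClusterClient implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            // A real client would release its connections and stop retrying here.
            closed = true;
        }
    }

    /**
     * Runs the given submission and reports whether the client was closed
     * afterwards. try-with-resources closes the client on both the normal
     * and the exceptional path.
     */
    static boolean submitAndReportClosed(Runnable submission) {
        FakeClusterClient client = new FakeClusterClient();
        try (FakeClusterClient c = client) {
            submission.run();
        } catch (RuntimeException e) {
            // The client has already been closed when we get here.
            System.err.println("Submission failed: " + e.getMessage());
        }
        return client.closed;
    }

    public static void main(String[] args) {
        boolean closedOnSuccess = submitAndReportClosed(() -> { });
        boolean closedOnFailure = submitAndReportClosed(() -> {
            throw new RuntimeException("interrupted during deployment");
        });
        System.out.println(closedOnSuccess + " " + closedOnFailure);
    }
}
```

With a plain try block and no finally or try-with-resources, the failing case 
would leave the client open, which matches the behaviour described in this 
issue.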

> Flink client hangs after trying to kill Yarn Job during deployment
> ------------------------------------------------------------------
>
>                 Key: FLINK-14048
>                 URL: https://issues.apache.org/jira/browse/FLINK-14048
>             Project: Flink
>          Issue Type: Improvement
>          Components: Client / Job Submission, Deployment / YARN
>            Reporter: Gyula Fora
>            Priority: Major
>
> If we kill the Flink client run command from the terminal while deploying to 
> YARN (say, because we realize we used the wrong parameters), the YARN 
> application is killed immediately but the client does not shut down.
> Instead, we get the following message over and over:
> 19/09/10 23:35:55 INFO retry.RetryInvocationHandler: java.io.IOException: The 
> client is stopped, while invoking 
> ApplicationClientProtocolPBClientImpl.forceKillApplication over null after 14 
> failover attempts. Trying to failover after sleeping for 16296ms.
>  



