[
https://issues.apache.org/jira/browse/FLINK-13895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923213#comment-16923213
]
Andrey Zagrebin commented on FLINK-13895:
-----------------------------------------
From the logs, it looks like the application kill hangs because the client
cannot connect to the YARN cluster RM, i.e. a networking issue rather than a Flink issue per se.
The ConfiguredRMFailoverProxyProvider could probably be reconfigured to do a
limited number of reconnection retries and prevent the Flink CLI from hanging.
From the source code of ConfiguredRMFailoverProxyProvider.init, it looks like
yarn.client.failover-retries is the option to tweak (if the default value of zero
indeed means infinite retries). Not sure whether it makes sense to tweak this
option in Flink for YARN deployments by default.
[[email protected]] could you try to set [yarn.client.failover-retries or
yarn.client.failover-retries-on-socket-timeouts|https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml]
to some small value to see whether the reconnection attempts stop and the CLI exits?
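For reference, a minimal client-side yarn-site.xml sketch of what I mean. The property names are the ones from the Hadoop docs linked above; the retry value of 3 is just an illustrative assumption, not a recommended default. The Flink CLI should pick this up from the configuration directory pointed to by HADOOP_CONF_DIR on the machine running bin/yarn-session.sh:
{code:xml}
<!-- yarn-site.xml on the client machine running bin/yarn-session.sh -->
<configuration>
  <!-- Cap RM failover/reconnection retries so the client fails fast
       instead of retrying forever (3 is illustrative, not a default). -->
  <property>
    <name>yarn.client.failover-retries</name>
    <value>3</value>
  </property>
  <!-- Same cap for retries triggered by socket timeouts. -->
  <property>
    <name>yarn.client.failover-retries-on-socket-timeouts</name>
    <value>3</value>
  </property>
</configuration>
{code}
If the CLI then exits instead of hanging, that would confirm the unbounded RM reconnection loop is what blocks the kill path.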
> Client does not exit when bin/yarn-session.sh come fail
> -------------------------------------------------------
>
> Key: FLINK-13895
> URL: https://issues.apache.org/jira/browse/FLINK-13895
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.9.0
> Reporter: Yu Wang
> Priority: Minor
> Labels: pull-request-available
> Attachments: client_exit.txt
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The Hadoop cluster environment runs Java 1.7, while Flink is compiled with
> JDK 1.8. I used bin/yarn-session.sh to submit; the client then hit an error and
> did not exit. I found that the YARN application had failed, so we should not
> kill the YARN application; we can just stop the YARN client. The attachment is
> the operation log.