[
https://issues.apache.org/jira/browse/HADOOP-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399707#comment-16399707
]
John Zhuge commented on HADOOP-12125:
-------------------------------------
[~shahrs87] and [~jlowe], any progress? We hit the same issue when the non-HA
NN went down and AWS spun up another NN instance with a different IP address.
Both Job History Server and Spark History Server were stuck because
NameNodeProxy held on to the old IP address.
> Retrying UnknownHostException on a proxy does not actually retry hostname
> resolution
> ------------------------------------------------------------------------------------
>
> Key: HADOOP-12125
> URL: https://issues.apache.org/jira/browse/HADOOP-12125
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Reporter: Jason Lowe
> Assignee: Rushabh S Shah
> Priority: Major
>
> When RetryInvocationHandler retries after an UnknownHostException, the
> hostname is not resolved again. The InetSocketAddress in the ConnectionId
> has cached the fact that the hostname is unresolvable, and when the proxy
> tries to set up a new Connection object with that ConnectionId, it checks
> whether the (cached) resolution result is unresolved and immediately throws.
> The end result is that we sleep and retry for no benefit: hostname
> resolution is never attempted again.
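
The caching behaviour described above can be illustrated with plain JDK
classes. This is a minimal sketch, not the Hadoop fix: the class
ReResolveSketch and the helper reResolveIfNeeded are hypothetical names
made up for illustration, and "localhost" and port 8020 are placeholder
values. It relies only on the documented behaviour of
java.net.InetSocketAddress, which resolves the hostname once in its
constructor and never updates that result afterwards.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical illustration class; not part of Hadoop.
class ReResolveSketch {

    /**
     * Returns the address unchanged if it is already resolved. Otherwise
     * builds a new InetSocketAddress from the same host string and port,
     * which makes the JDK attempt hostname resolution again. The result
     * may still be unresolved if the lookup fails a second time.
     */
    static InetSocketAddress reResolveIfNeeded(InetSocketAddress addr) {
        if (!addr.isUnresolved()) {
            return addr;
        }
        // getHostString() returns the original hostname without a reverse lookup.
        return new InetSocketAddress(addr.getHostString(), addr.getPort());
    }

    public static void main(String[] args) {
        // Simulate an address whose lookup result was cached as "unresolved".
        InetSocketAddress stale = InetSocketAddress.createUnresolved("localhost", 8020);
        System.out.println("stale unresolved: " + stale.isUnresolved()); // true

        // The stale object never changes; only a new instance triggers a new lookup.
        InetSocketAddress fresh = reResolveIfNeeded(stale);
        System.out.println("fresh unresolved: " + fresh.isUnresolved());

        // An address built from an InetAddress is already resolved and is returned as-is.
        InetSocketAddress resolved =
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 8020);
        System.out.println("same instance: " + (reResolveIfNeeded(resolved) == resolved));
    }
}
```

A retry loop that reuses the same InetSocketAddress instance sees the same
cached result on every attempt, which matches the "sleep and retry for no
benefit" symptom. Constructing a new instance before each attempt is one
possible way to force a fresh lookup.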