[ 
https://issues.apache.org/jira/browse/HADOOP-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399707#comment-16399707
 ] 

John Zhuge commented on HADOOP-12125:
-------------------------------------

[~shahrs87] and [~jlowe], any progress? We hit the same issue when our non-HA 
NameNode (NN) went down and AWS spun up a replacement NN instance with a 
different IP address. Both the Job History Server and the Spark History Server 
were stuck because NameNodeProxy held on to the old IP address.

> Retrying UnknownHostException on a proxy does not actually retry hostname 
> resolution
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12125
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>            Reporter: Jason Lowe
>            Assignee: Rushabh S Shah
>            Priority: Major
>
> When RetryInvocationHandler attempts to retry an UnknownHostException, the 
> hostname is not resolved again.  The InetSocketAddress in the 
> ConnectionId has cached the fact that the hostname is unresolvable, and when 
> the proxy tries to set up a new Connection object with that ConnectionId it 
> checks whether the (cached) resolution result is unresolved and immediately 
> throws.  The end result is that we sleep and retry for no benefit: hostname 
> resolution is never attempted again.
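
The caching described above follows from java.net.InetSocketAddress being
immutable: whether the host resolved is decided once, at construction, and
isUnresolved() keeps returning that answer afterwards. The sketch below only
illustrates that behavior and one possible way around it, namely building a
new InetSocketAddress from the same host string and port so that a fresh
lookup is attempted. It is not the patch attached to this issue. The class
name ReResolveSketch, the helper refreshIfUnresolved, and port 8020 are
hypothetical names and values chosen for the example.

```java
import java.net.InetSocketAddress;

public class ReResolveSketch {

    /**
     * Hypothetical helper: if the address is marked unresolved, build a new
     * InetSocketAddress from the same host string and port. The constructor
     * attempts resolution again, whereas the old object never changes state.
     */
    public static InetSocketAddress refreshIfUnresolved(InetSocketAddress addr) {
        if (!addr.isUnresolved()) {
            return addr; // already resolved, nothing to do
        }
        return new InetSocketAddress(addr.getHostString(), addr.getPort());
    }

    public static void main(String[] args) {
        // Simulate an address whose lookup "failed" when it was created.
        InetSocketAddress stale =
            InetSocketAddress.createUnresolved("localhost", 8020);
        System.out.println("stale unresolved: " + stale.isUnresolved());

        // Retrying with the same object can never succeed; a new object can.
        InetSocketAddress fresh = refreshIfUnresolved(stale);
        System.out.println("fresh unresolved: " + fresh.isUnresolved());
    }
}
```

Note that a fresh lookup can still fail, or return a stale answer, while the
JVM's own DNS cache holds an earlier result (see the networkaddress.cache.ttl
and networkaddress.cache.negative.ttl security properties), so a retry loop
built on this idea would still need its sleep between attempts.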



