Dhiraj Hegde created HADOOP-17052:
-------------------------------------

             Summary: NetUtils.connect() throws an exception that prevents any 
retries when hostname resolution fails
                 Key: HADOOP-17052
                 URL: https://issues.apache.org/jira/browse/HADOOP-17052
             Project: Hadoop Common
          Issue Type: Bug
          Components: hdfs-client
    Affects Versions: 3.1.3, 3.2.1, 2.9.2, 2.10.0
            Reporter: Dhiraj Hegde


Hadoop components are increasingly being deployed on VMs and containers. One 
aspect of this environment is that DNS is dynamic. Hostname records get 
modified (or deleted and recreated) as a container in Kubernetes (or even a VM) 
is created or recreated. In such dynamic environments, the initial DNS 
resolution request might briefly fail because the DNS client does not always 
have the latest records. This has been observed in Kubernetes in 
particular. In such cases NetUtils.connect() appears to throw 
java.nio.channels.UnresolvedAddressException. Much of the Hadoop code (such as 
DFSInputStream and DFSOutputStream) is designed to retry on IOException. 
However, since UnresolvedAddressException is not a subclass of IOException, no 
retry happens and the code aborts immediately. It would be much better if 
NetUtils.connect() threw java.net.UnknownHostException instead, since that is 
derived from IOException and callers would treat it as a retryable error.
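
The proposed behavior can be sketched roughly as follows. This is only an 
illustration, not the actual NetUtils code or a patch: the class 
ResolveAwareConnect and its connect() helper are hypothetical names, and the 
sketch assumes the unchecked exception originates from SocketChannel.connect() 
being handed an unresolved InetSocketAddress.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Hypothetical illustration; not part of Hadoop.
class ResolveAwareConnect {

    // Connects the channel, translating the unchecked
    // UnresolvedAddressException into the checked UnknownHostException
    // so that callers which retry on IOException also retry here.
    static void connect(SocketChannel channel, InetSocketAddress addr)
            throws IOException {
        try {
            channel.connect(addr);
        } catch (UnresolvedAddressException e) {
            UnknownHostException uhe =
                new UnknownHostException(addr.getHostString());
            uhe.initCause(e); // keep the original exception for diagnostics
            throw uhe;
        }
    }

    public static void main(String[] args) {
        // createUnresolved() performs no DNS lookup, which simulates a
        // hostname that failed to resolve. Host and port are made up.
        InetSocketAddress addr =
            InetSocketAddress.createUnresolved("no-such-host.invalid", 8020);
        try (SocketChannel channel = SocketChannel.open()) {
            connect(channel, addr);
        } catch (IOException e) {
            // An IOException-based retry loop would reach this handler.
            System.out.println("retryable failure: " + e);
        }
    }
}
```

Without the translation, the catch (IOException e) block in main() would be 
skipped and the UnresolvedAddressException would propagate to the caller 
uncaught, which matches the abort behavior described above.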



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

