[jira] [Updated] (HADOOP-17052) NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
[ https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhiraj Hegde updated HADOOP-17052:
----------------------------------
    Attachment: read_failure.log

> NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
> -----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17052
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17052
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: hdfs-client
>   Affects Versions: 2.10.0, 2.9.2, 3.2.1, 3.1.3
>            Reporter: Dhiraj Hegde
>            Assignee: Dhiraj Hegde
>            Priority: Major
>         Attachments: read_failure.log, write_failure1.log, write_failure2.log
>
>
> Hadoop components are increasingly being deployed on VMs and containers. One
> aspect of this environment is that DNS is dynamic. Hostname records get
> modified (or deleted and recreated) as a container in Kubernetes (or even a VM)
> is created or recreated. In such dynamic environments, the initial DNS
> resolution request might briefly return a resolution failure, because the DNS
> client doesn't always get the latest records. This has been observed in
> Kubernetes in particular. In such cases NetUtils.connect() appears to throw
> java.nio.channels.UnresolvedAddressException. Much of the Hadoop code (like
> DFSInputStream and DFSOutputStream) is designed to retry on IOException.
> However, since UnresolvedAddressException is not a subclass of IOException,
> no retry happens and the code aborts immediately. It would be much better if
> NetUtils.connect() threw java.net.UnknownHostException, as that is derived
> from IOException and the code will treat it as a retryable error.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
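The exception translation proposed in the description can be illustrated with a minimal Java sketch. This is not the actual NetUtils.connect() code or the eventual patch: the class name ResolveRetrySketch and its connect() helper are hypothetical, and only standard JDK classes are used. It assumes the failure surfaces from SocketChannel.connect(), which is documented to throw UnresolvedAddressException when given an unresolved address.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Hypothetical illustration only; not Hadoop's NetUtils.
class ResolveRetrySketch {

    // Connects the channel, translating the unchecked
    // UnresolvedAddressException (an IllegalArgumentException) into
    // UnknownHostException, which is an IOException.
    static void connect(SocketChannel channel, InetSocketAddress endpoint)
            throws IOException {
        try {
            channel.connect(endpoint);
        } catch (UnresolvedAddressException e) {
            UnknownHostException translated =
                    new UnknownHostException(endpoint.getHostString());
            translated.initCause(e); // keep the original exception for diagnostics
            throw translated;
        }
    }

    public static void main(String[] args) throws IOException {
        // createUnresolved() performs no DNS lookup, so this simulates a
        // failed resolution without touching the network.
        InetSocketAddress endpoint =
                InetSocketAddress.createUnresolved("datanode.invalid", 8020);
        try (SocketChannel channel = SocketChannel.open()) {
            connect(channel, endpoint);
        } catch (IOException e) {
            // A caller that retries on IOException now reaches this branch
            // instead of aborting on an unchecked exception.
            System.out.println("retryable failure: " + e);
        }
    }
}
```

With a translation of this kind, callers that already catch IOException would see a failed resolution as an ordinary I/O error. Whether and how often they retry remains up to the caller.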
[jira] [Updated] (HADOOP-17052) NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
[ https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhiraj Hegde updated HADOOP-17052:
----------------------------------
    Attachment: write_failure2.log

>         Attachments: read_failure.log, write_failure1.log, write_failure2.log
[jira] [Updated] (HADOOP-17052) NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
[ https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhiraj Hegde updated HADOOP-17052:
----------------------------------
    Attachment: write_failure1.log

>         Attachments: read_failure.log, write_failure1.log, write_failure2.log
[jira] [Updated] (HADOOP-17052) NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
[ https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhiraj Hegde updated HADOOP-17052:
----------------------------------
    Attachment:     (was: stack_trace2)
[jira] [Updated] (HADOOP-17052) NetUtils.connect() throws an exception the prevents any retries when hostname resolution fails
[ https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dhiraj Hegde updated HADOOP-17052:
----------------------------------
    Attachment: stack_trace2

>         Attachments: stack_trace2