[ https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288606#comment-14288606 ]

Tsz Wo Nicholas Sze commented on HDFS-7392:
-------------------------------------------

{code}
+          if(server != null) {
+            Short s = timeoutFailuresByAddress.get(server);
+            if(s != null)
+              timeoutFailures = s.shortValue();
+            timeoutFailures++;
+            timeoutFailuresByAddress.put(server, timeoutFailures);
           }
-          handleConnectionTimeout(timeoutFailures++,
+          updateAddress();
+          handleConnectionTimeout(timeoutFailures,
{code}
- Could server be null?  It seems impossible, so we should omit the null 
check.
- Before the patch, timeoutFailures was incremented after being passed to 
handleConnectionTimeout(..), i.e. the handler saw the pre-increment value.  
We need to keep that order.
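
To illustrate the second point, here is a minimal sketch of a per-address counter that keeps the pre-patch order: the caller gets the value from before the increment, and the incremented value is stored afterwards. The class TimeoutCounterSketch and the method nextTimeoutFailures are hypothetical names used only for this illustration; only timeoutFailuresByAddress comes from the patch, and the null check on server is omitted as suggested above.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch or of org.apache.hadoop.ipc.Client.
class TimeoutCounterSketch {
  // Same idea as timeoutFailuresByAddress in the patch: one counter per server address.
  private final Map<InetSocketAddress, Short> timeoutFailuresByAddress = new HashMap<>();

  /**
   * Returns the number of timeout failures seen so far for this server
   * (the pre-increment value), then stores the incremented count.
   * This mirrors the semantics of passing timeoutFailures++ as an argument.
   */
  short nextTimeoutFailures(InetSocketAddress server) {
    Short s = timeoutFailuresByAddress.get(server);
    short current = (s == null) ? 0 : s.shortValue();
    timeoutFailuresByAddress.put(server, (short) (current + 1));
    return current;
  }
}
```

With something like this, the call site could stay close to the original form, e.g. handleConnectionTimeout(nextTimeoutFailures(server), ...), assuming server is never null at that point.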

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> ---------------------------------------------------------------------
>
>                 Key: HDFS-7392
>                 URL: https://issues.apache.org/jira/browse/HDFS-7392
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Frantisek Vacek
>            Assignee: Daniel Pesch
>         Attachments: 1.png, 2.png, HDFS-7392.diff
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
> and hangs forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.example.com:8020/someDir/someFile.txt) must point 
> to a valid IP address that has no name node service running on it.
> 2) The host name in the URI must resolve to at least 2 IP addresses. See the 
> output below:
> {quote}
> [~/proj/quickbox]$ nslookup share.example.com
> Server:         127.0.1.1
> Address:        127.0.1.1#53
> share.example.com canonical name = internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com.
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.223
> Name:   internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even if the address didn't actually change, see img. 1) 
> and the timeoutFailures counter is reset to 0 (see img. 2). As a result, 
> maxRetriesOnSocketTimeouts (45) is never reached and the connection attempt is 
> repeated forever.
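
The behaviour described in the report can be reproduced with a small, simplified model of the retry loop. This is only a sketch under stated assumptions, not the actual Hadoop code: every attempt is assumed to time out, name resolution is assumed to alternate between the given addresses on each attempt, and the class and method names are hypothetical. Only the limit of 45 and the "reset to 0 when the address changes" rule are taken from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation; it does not use or reproduce org.apache.hadoop.ipc.Client.
class RetryResetSketch {
  static final int MAX_RETRIES = 45; // maxRetriesOnSocketTimeouts from the report

  /**
   * One shared counter that is reset whenever the resolved address changes.
   * Returns the attempt on which the limit is hit, or -1 if it is not hit within cap attempts.
   */
  static int attemptsWithSharedCounter(String[] addresses, int cap) {
    int timeoutFailures = 0;
    String current = addresses[0];
    for (int attempt = 1; attempt <= cap; attempt++) {
      // Assumption: resolution rotates through the addresses on every attempt.
      String resolved = addresses[attempt % addresses.length];
      if (!resolved.equals(current)) {
        current = resolved;
        timeoutFailures = 0; // the reset described in the report
      }
      // Compare the pre-increment value against the limit, then increment.
      if (timeoutFailures++ >= MAX_RETRIES) {
        return attempt;
      }
    }
    return -1;
  }

  /** Same loop, but with one counter per address and no reset. */
  static int attemptsWithPerAddressCounter(String[] addresses, int cap) {
    Map<String, Integer> failuresByAddress = new HashMap<>();
    for (int attempt = 1; attempt <= cap; attempt++) {
      String resolved = addresses[attempt % addresses.length];
      int failures = failuresByAddress.getOrDefault(resolved, 0);
      failuresByAddress.put(resolved, failures + 1);
      if (failures >= MAX_RETRIES) {
        return attempt;
      }
    }
    return -1;
  }
}
```

In this model, calling attemptsWithSharedCounter with the two addresses from the nslookup output never reaches the limit within the cap, because the counter is reset on every attempt, while attemptsWithPerAddressCounter stops after a bounded number of attempts.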



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
