[ 
https://issues.apache.org/jira/browse/HDFS-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268463#comment-17268463
 ] 

Benoit Sigoure commented on HDFS-15250:
---------------------------------------

This issue is not fixed. Merely catching and logging the exception just to 
re-raise it doesn't solve the problem. The exception ends up preventing HDFS 
reads from succeeding when one of the replicas is unavailable due to 
UnresolvedAddressException, even though there could be other replicas available.

> Setting `dfs.client.use.datanode.hostname` to true can crash the system 
> because of unhandled UnresolvedAddressException
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15250
>                 URL: https://issues.apache.org/jira/browse/HDFS-15250
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ctest
>            Assignee: Ctest
>            Priority: Major
>             Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
>         Attachments: HDFS-15250-001.patch, HDFS-15250-002.patch
>
>
> *Problem:*
> `dfs.client.use.datanode.hostname` by default is set to false, which means 
> the client will use the IP address of the datanode to connect to the 
> datanode, rather than the hostname of the datanode.
> In `org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer`:
>  
> {code:java}
>  try {
>    Peer peer = remotePeerFactory.newConnectedPeer(inetSocketAddress, token,
>        datanode);
>    LOG.trace("nextTcpPeer: created newConnectedPeer {}", peer);
>    return new BlockReaderPeer(peer, false);
>  } catch (IOException e) {
>    LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
>        + "{}", datanode);
>    throw e;
>  }
> {code}
>  
> If `dfs.client.use.datanode.hostname` is false, then it will try to connect 
> via IP address. If the IP address is illegal and the connection fails, 
> IOException will be thrown from `newConnectedPeer` and be handled.
> If `dfs.client.use.datanode.hostname` is true, then it will try to connect 
> via hostname. If the hostname cannot be resolved, UnresolvedAddressException 
> will be thrown from `newConnectedPeer`. However, UnresolvedAddressException 
> is not a subclass of IOException so `nextTcpPeer` doesn’t handle this 
> exception at all. This unhandled exception could crash the system.
>  
> *Solution:*
> Since the method is handling the illegal IP address, then the illegal 
> hostname should be also handled as well. One solution is to add the handling 
> logic in `nextTcpPeer`:
> {code:java}
>  } catch (IOException e) {
>    LOG.trace("nextTcpPeer: failed to create newConnectedPeer connected to"
>        + "{}", datanode);
>    throw e;
>  } catch (UnresolvedAddressException e) {
>    ... // handling logic 
>  }{code}
> I am very happy to provide a patch to do this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to