[ 
https://issues.apache.org/jira/browse/HDFS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049420#comment-14049420
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6616:
-------------------------------------------

I think it is not a good idea to pick a DataNode at random since it will lose 
data locality, especially when the client and the data are on the same host.

We should probably support excluding nodes in WebHDFS.
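To illustrate the idea, here is a minimal sketch of node selection that keeps locality but honors a client-supplied exclude list. This is a hypothetical standalone example, not the actual NamenodeWebHdfsMethods code; the class name, method signature, and hostname strings are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: choose a DataNode for a WebHDFS redirect,
// preferring the client's own host for locality and skipping any
// node the client has reported as unreachable.
public class BestNodeSketch {

    static String bestNode(List<String> replicas, Set<String> excluded,
                           String clientHost) {
        // Locality first: if the client co-resides with a replica, use it.
        if (replicas.contains(clientHost) && !excluded.contains(clientHost)) {
            return clientHost;
        }
        // Otherwise return the first replica not on the exclude list,
        // instead of unconditionally returning replicas.get(0).
        for (String node : replicas) {
            if (!excluded.contains(node)) {
                return node;
            }
        }
        return null; // no usable replica
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");
        // dn1 is unreachable from outside; the client excludes it on retry.
        System.out.println(bestNode(replicas,
                new HashSet<>(Collections.singleton("dn1")), "client-host"));
        // Locality: the client runs on dn2, so dn2 is chosen.
        System.out.println(bestNode(replicas,
                Collections.<String>emptySet(), "dn2"));
    }
}
```

With an exclude list, a retrying client such as distcp could skip the externally unreachable node on its second attempt instead of being redirected to the same first DataNode every time.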

> bestNode shouldn't always return the first DataNode
> ---------------------------------------------------
>
>                 Key: HDFS-6616
>                 URL: https://issues.apache.org/jira/browse/HDFS-6616
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>            Priority: Minor
>         Attachments: HDFS-6616.patch
>
>
> When we were doing distcp between clusters, the job failed:
> 2014-06-30 20:56:28,430 INFO org.apache.hadoop.tools.DistCp: FAIL part-r-00101.avro : java.net.NoRouteToHostException: No route to host
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
>       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
>       at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
>       at org.apache.hadoop.hdfs.HftpFileSystem.open(HftpFileSystem.java:322)
>       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
>       at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:419)
>       at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
>       at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> The root cause is that one of the DataNodes cannot be reached from outside 
> the cluster, even though it is healthy inside the cluster.
> In NamenodeWebHdfsMethods.java, bestNode always returns the first DataNode, 
> so even after distcp retries, the job still fails.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
