I am trying to run distcp between two HDFS clusters and I am getting a
ConnectException.
The command I am trying to run is of the form (considering two clusters,
hadoop-A and hadoop-B):
./hadoop distcp -update hdfs://hadoop-A:8020/dira/part-r-00000
hdfs://hadoop-B:8020/dirb/
I am running it on the destination cluster (hadoop-B).
The stack trace for the exception is:
Copy failed: java.net.ConnectException: Call to hadoop-A/10.0.173.11:8020 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)
        at org.apache.hadoop.ipc.Client.call(Client.java:743)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:214)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
I have tried ping and ssh between these clusters and both work fine. On both
clusters the NameNode is running as the same user.
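Note that ping and ssh do not exercise the NameNode RPC port itself, so a
"Connection refused" on 8020 is still possible even when both succeed. A
minimal sketch of probing the port directly (hostnames are placeholders, and
/dev/tcp is a bash feature, so run it under bash):

```shell
# Probe a host:port without needing nc/telnet installed.
# /dev/tcp/<host>/<port> is a bash built-in redirection target; the
# subshell just opens and immediately closes the connection.
probe() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 refused or unreachable"
  fi
}

probe hadoop-A 8020   # run this from hadoop-B
```

If this reports the port as refused while ping works, the NameNode is either
not listening on 8020 or is bound to a different interface.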
The strange part is that the command fails with the same exception even on a
single cluster (source and destination on the same DFS, so it is a simple
copy). That is, the command below also fails when run on cluster A:
./hadoop distcp -update hdfs://hadoop-A:8020/dir1/part-r-00000
hdfs://hadoop-A:8020/dir2/
Is there anything else I need to set up to get distcp working? Any hints on
what I could check would be helpful.
Thanks,
Deepika