Hi all,

We’re in the process of migrating from EC2-Classic to VPC and needed to 
transfer our HDFS data. We set up a new cluster inside the VPC and assigned 
the name node and data nodes temporary public IPs. Initially, we had a lot 
of trouble getting the name node to redirect to public hostnames instead of 
private IPs. After some fiddling, we finally got webhdfs and dfs -cp to work 
using public hostnames. However, distcp simply refuses to use the public 
hostnames when connecting to the data nodes.

We’re running distcp on the old cluster, copying data into the new cluster.

The old Hadoop cluster is running 1.0.4 and the new one is running 1.2.1.
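
For concreteness, the command we run on the old cluster looks roughly like 
this (the hostnames and paths below are placeholders, and 8020 is just the 
default name node RPC port, not necessarily what we actually use):

    hadoop distcp \
        hdfs://old-namenode.internal:8020/path/to/data \
        hdfs://new-namenode-public.example.com:8020/path/to/data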

So far, on the new cluster, we’ve tried:
- Using public DNS hostnames in the master and slaves files (on both the name 
node and data nodes)
- Setting the hostname of all the boxes to their public DNS name
- Setting “fs.default.name” to the public DNS name of the new name node (see 
the core-site.xml sketch after this list)
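
For the last item, the core-site.xml entry on the new name node looks 
roughly like this (the hostname is a placeholder and the port is whatever 
the name node RPC port is configured to be):

    <property>
      <name>fs.default.name</name>
      <value>hdfs://new-namenode-public.example.com:8020</value>
    </property>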

And on both clusters:
- Setting the “dfs.datanode.use.datanode.hostname” and 
“dfs.client.use.datanode.hostname” to “true" on both the old and new cluster.
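
Concretely, that means both clusters’ hdfs-site.xml now contains roughly 
this:

    <property>
      <name>dfs.datanode.use.datanode.hostname</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>
    </property>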

Even though webhdfs is finally redirecting to data nodes using the public 
hostname, we keep seeing errors when running distcp. The errors are all similar 
to: http://pastebin.com/ZYR07Fvm

What do we need to do to get distcp to use the public hostnames of the new 
machines? I haven’t tried running distcp in the other direction (I’m about 
to), but I suspect I’ll run into the same problem.

Thanks!
Jameel
