I have an Amazon cluster that uses HDFS (not S3). Is it possible to use distcp to copy files from HDFS running on Amazon to another cluster that is not running on Amazon? It doesn't look possible, because the namenode is configured with a private IP address that is not accessible from outside the cluster. Does anybody know a way around this problem?
Thanks,
Bob