On Apr 6, 2009, at 9:49 PM, Mithila Nagendra wrote:
Hey all,
I'm trying to connect two separate Hadoop clusters. Is it possible
to do so?
I need data to be shuttled back and forth between the two clusters.
Any suggestions?
You should use hadoop distcp. It is a map/reduce program that copies
data, typically from one cluster to another. If you have the hftp
interface enabled, you can use it to copy between HDFS clusters that
run different Hadoop versions.
hadoop distcp hftp://namenode1:1234/foo/bar hdfs://foo/bar
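
For copying in both directions, the command above can be wrapped in a small script. This is only a sketch under assumptions: the host names namenode1 and namenode2, the ports 50070 and 8020, and the helper build_distcp_cmd are illustrative and not taken from any real setup. It spells out the destination NameNode explicitly, since the authority part of an hdfs:// URI names the NameNode. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical values; replace them with your own clusters.
SRC_NN="namenode1:50070"   # assumed HTTP address of the source NameNode (used by hftp)
DST_NN="namenode2:8020"    # assumed RPC address of the destination NameNode
DATA_PATH="/foo/bar"       # path to copy; same path on both sides in this sketch

# build_distcp_cmd SRC_HOSTPORT DST_HOSTPORT PATH
# Prints a distcp command that reads over hftp and writes over hdfs.
build_distcp_cmd() {
    printf 'hadoop distcp hftp://%s%s hdfs://%s%s\n' "$1" "$3" "$2" "$3"
}

# Print the command for review instead of running it.
build_distcp_cmd "$SRC_NN" "$DST_NN" "$DATA_PATH"
```

hftp is read-only, so a copy like this is normally launched on the destination cluster. To move data the other way, swap the roles of the two clusters and run the job on the other side.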
-- Owen