On Aug 10, 2010, at 10:54 AM, Bill Graham wrote:
> Is it correct to say that that would work fine? We have a replication factor
> of 2, so we'd be copying twice as much data as we'd need to so I'm sure
> there's a more efficient approach.

It should work fine.  But yes, highly inefficient.

> What about adding the new nodes in the new colo to the existing cluster,
> rebalancing and then decommissioning the old cluster nodes before finally
> migrating the NN/SNN? I know Hadoop isn't intended to run cross-colo, but
> would this be a more efficient approach than the one above?

If you can keep both grids up at the same time, use distcp to do the copy.
This will make sure the blocks get copied only once, will keep permissions
with -p, keep the replication factor, and redistribute the data across the
new nodes (free balancing!).
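
A rough sketch of what that distcp run might look like. The NameNode
hostnames, port 8020, and the /data path are placeholders I made up for
illustration, not values from this thread. The script only builds and prints
the command so you can check it before running anything.

```shell
#!/bin/sh
# Hypothetical source (old colo) and destination (new colo) NameNodes.
# Replace hosts, port, and path with your own.
SRC="hdfs://old-nn.example.com:8020/data"
DST="hdfs://new-nn.example.com:8020/data"

# -p asks distcp to preserve file status (permissions, replication, etc.).
DISTCP_CMD="hadoop distcp -p $SRC $DST"

# Dry run: print the command instead of executing it.
echo "$DISTCP_CMD"

# Once it looks right, run it from a node that can reach both clusters:
# $DISTCP_CMD
```

If the two grids end up on different Hadoop versions, the usual workaround
is to read the source over hftp:// instead of hdfs://, but check that against
the distcp docs for your release.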



