DistCp is the simplest approach you can use (it runs as a MapReduce job and
copies the data in parallel using map tasks only; there is no reduce phase).
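
As a rough illustration of what such a copy could look like, here is a small
Python sketch that only assembles the `hadoop distcp` command line. The
NameNode hosts, port, path and map count are made-up placeholders, not values
from this thread; `-update` and `-m` are standard DistCp options, but please
check them against the documentation for your Hadoop version before relying
on this.

```python
import shlex


def build_distcp_command(src_uri, dst_uri, num_maps=20, update=True):
    """Return the argv list for a `hadoop distcp` invocation.

    src_uri / dst_uri are full HDFS URIs (hypothetical examples below).
    num_maps caps the number of simultaneous copy (map) tasks.
    """
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies files that are missing at the destination or whose
        # size differs, so the job can be re-run to pick up new data.
        cmd.append("-update")
    # -m sets the maximum number of simultaneous copies.
    cmd += ["-m", str(num_maps)]
    cmd += [src_uri, dst_uri]
    return cmd


if __name__ == "__main__":
    # Placeholder cluster addresses, assumed for illustration only.
    cmd = build_distcp_command(
        "hdfs://nn-a.example.com:8020/data",
        "hdfs://nn-b.example.com:8020/data",
    )
    # Print a shell-safe command line instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the bulk copy once while both clusters stay up, and then repeating it
with `-update` shortly before the cutover, would keep the final window short,
since the second pass only has to move what changed.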


On Thu, Mar 14, 2013 at 12:16 PM, Vinod Kumar Vavilapalli <
[email protected]> wrote:

>
> Copy data into one of the clusters using distcp *without* downtime
> (assuming you have enough capacity) and then merge the clusters?
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Mar 13, 2013, at 9:38 PM, Shashank Agarwal wrote:
>
> Hey Guys,
>
> I have two different Hadoop clusters in production. One cluster backs
> HBase and the other is used for everything else. Both Hadoop clusters
> run the same version, 1.0, and I want to merge them into one. I know one
> possible solution is to copy the data across, but the data on these
> clusters is really huge and it will be hard for me to accept a long
> downtime.
> Is there an optimal way to merge two Hadoop clusters?
>
> ~Shashank
>
>
>


-- 
Thanks and Regards,

VIVEK KOUL
