Re: migrate cluster to different datacenter

Patrick Angeles Tue, 07 Aug 2012 07:37:52 -0700

It would help to know your data ingest and processing patterns (and any
applicable SLAs).

In most cases, you only need to move the raw ingested data; you can then
re-derive the rest on the other cluster. Assuming you have some sort of
date-based partitioning on the ingest, it's easy to define a cut-off
point.
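
As a rough illustration of the cut-off idea, here is a minimal Python sketch.
It assumes a hypothetical /data/raw/YYYY/MM/DD layout and made-up partition
dates; the helper names are illustrative only. Everything strictly before the
cut-off is selected for the bulk copy (e.g. with `hadoop distcp`), and
anything on or after it is left for the catch-up phase.

```python
from datetime import date

def partitions_to_copy(partition_dates, cutoff):
    """Return the date partitions strictly before the cut-off, oldest first."""
    return sorted(d for d in partition_dates if d < cutoff)

def partition_path(root, d):
    # Hypothetical layout: <root>/YYYY/MM/DD (an assumption, not from the thread)
    return f"{root}/{d:%Y/%m/%d}"

# Made-up example partitions and cut-off date
ingested = [date(2012, 8, 1), date(2012, 8, 3), date(2012, 8, 2), date(2012, 8, 4)]
cutoff = date(2012, 8, 3)

paths = [partition_path("/data/raw", d) for d in partitions_to_copy(ingested, cutoff)]
for p in paths:
    print(p)
```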

Depending on your read SLAs, you could tee writes to both clusters for a
period of time, or simply switch over to the new one once the majority
of the data has been moved.
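
A minimal Python sketch of what "teeing" writes could look like on the ingest
side. The `TeeWriter` class is hypothetical, and the two in-memory buffers
merely stand in for output streams on the old and new clusters; a real ingest
path would also need to decide what to do when only one of the two writes
succeeds.

```python
import io

class TeeWriter:
    """Hypothetical helper that duplicates every write to two sinks."""

    def __init__(self, primary, secondary):
        self.primary = primary      # stands in for the old cluster
        self.secondary = secondary  # stands in for the new cluster

    def write(self, data):
        # Write to the primary first; if it raises, the secondary is never written.
        written = self.primary.write(data)
        self.secondary.write(data)
        return written

# In-memory stand-ins for the two clusters (assumption for illustration)
old_cluster = io.BytesIO()
new_cluster = io.BytesIO()

tee = TeeWriter(old_cluster, new_cluster)
tee.write(b"event-1\n")
tee.write(b"event-2\n")
```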

Finally, you would want to do a consistency check to make sure everything
made it to the other side, for example by running a checksum on the derived
data on both clusters and comparing the results.
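
One possible shape for such a check, as a small Python sketch: build a
path-to-checksum manifest on each side and diff the two. The paths and file
contents below are made up, and MD5 is just one choice of checksum. Note that
per-file checksums of derived data only line up if the jobs are deterministic
and produce the same file layout on both clusters.

```python
import hashlib

def manifest(files):
    """Map each path to the MD5 hex digest of its bytes."""
    return {path: hashlib.md5(data).hexdigest() for path, data in files.items()}

def diff_manifests(src, dst):
    """Return (paths missing from dst, paths whose checksums differ)."""
    missing = sorted(p for p in src if p not in dst)
    mismatched = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, mismatched

# Made-up derived output on the old and new clusters
old = manifest({
    "/out/part-00000": b"a\t1\n",
    "/out/part-00001": b"b\t2\n",
    "/out/part-00002": b"c\t3\n",
})
new = manifest({
    "/out/part-00000": b"a\t1\n",
    "/out/part-00001": b"b\t9\n",
})

missing, mismatched = diff_manifests(old, new)
print("missing:", missing)
print("mismatched:", mismatched)
```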

- P

On Fri, Aug 3, 2012 at 5:19 PM, Patai Sangbutsarakum <
silvianhad...@gmail.com> wrote:

> Thanks for the response.
> A physical move is not an option in this case. I'm purely looking at
> copying the data, and at how to catch up with updates to a file while it
> is being migrated.
>
> On Fri, Aug 3, 2012 at 12:40 PM, Chen He <airb...@gmail.com> wrote:
> > sometimes, physically moving hard drives helps.   :)
> > On Aug 3, 2012 1:50 PM, "Patai Sangbutsarakum" <silvianhad...@gmail.com>
> > wrote:
> >
> >> Hi Hadoopers,
> >>
> >> We have a plan to migrate our Hadoop cluster to a different datacenter
> >> where we can triple the size of the cluster.
> >> Currently, our 0.20.2 cluster has around 1PB of data. We use only
> >> Java/Pig.
> >>
> >> I would like to get some input on how we should handle transferring
> >> 1PB of data to a new site, while also keeping up with
> >> the new files that are thrown into the cluster all the time.
> >>
> >> Happy friday !!
> >>
> >> P
> >>
>
