> Date: Tue, 3 Aug 2010 11:02:48 -0400
> Subject: Re: Backing up HDFS
> From: [email protected]
> To: [email protected]
>
> Assuming you are taking the distcp approach you can mirror your
> cluster with some scripting/coding. However your destination systems
> can be more modest, assuming you wish to use it ONLY for data no job
> processing:
>
And that would be a waste. (Why build a cloud just to store data and not do any
processing?)
You're not building your cloud in a vacuum. There are going to be SANs, other
servers, and possibly tape available. The trick is getting the important data
off the cloud to a place where it can be backed up via the corporation's
standard IT practices.
Because of the size of the data, you may see people pulling data off the cloud
onto a SAN, then to either a tape drive or a hot-swap SATA drive for off-site
storage.
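A rough sketch of that first hop (HDFS to a SAN mount), and of the distcp
mirror mentioned above, might look like the Python below. It only builds and
prints the command lines; it does not run them. The paths, the namenode host
names, and the /mnt/san/backup mount point are made up for illustration. The
only real pieces are the stock `hadoop fs -copyToLocal` and
`hadoop distcp -update` commands.

```python
import shlex


def build_export_command(hdfs_path, local_dir):
    """Command to copy one HDFS path onto a locally mounted directory.

    local_dir is assumed to be a SAN volume mounted on a node that has
    the Hadoop client installed (hypothetical setup).
    """
    return ["hadoop", "fs", "-copyToLocal", hdfs_path, local_dir]


def build_mirror_command(src_uri, dst_uri):
    """Command to mirror one cluster path to another cluster with distcp.

    -update copies only files that are missing or differ on the target.
    """
    return ["hadoop", "distcp", "-update", src_uri, dst_uri]


def as_shell(cmd):
    """Render an argument list as a safely quoted shell command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical paths and hosts; substitute your own.
    print(as_shell(build_export_command("/data/important", "/mnt/san/backup")))
    print(as_shell(build_mirror_command(
        "hdfs://nn-prod:8020/data/important",
        "hdfs://nn-backup:8020/data/important",
    )))
```

Once the files are on the SAN mount, the usual corporate backup tooling can
pick them up from there like any other file system.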
It all depends on the value of the data.
Again, YMMV
HTH
-Mike