Dan,
Here's a quick and dirty solution that works.
I'm assuming that your cloud is part of a larger corporate network, and that
alongside the cloud you have 'cloud aware machines': machines that have hadoop
installed and are not part of the cloud, but are where you launch jobs and
applications from. These machines also have file system mounts to SANs or
other network attached (fiber channel attached) storage.
Step 1: Make a copy of the files that you want to back up into a separate
directory on HDFS.
Step 2: From a 'cloud aware machine' that has SAN disk, use
hadoop fs -copyToLocal <file name>(s), with the local destination on the
SAN.
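A minimal sketch of the two steps above, run from a 'cloud aware machine'.
All three paths are hypothetical placeholders, not anything from your setup,
and the script only prints the commands unless you set DRY_RUN=0. It uses
hadoop fs -cp for the copy inside HDFS, which is one way to do Step 1.

```shell
# Sketch only: every path below is a hypothetical placeholder.
HDFS_SRC="/data/critical"          # assumed: data set you want to back up
HDFS_STAGE="/backup/staging"       # assumed: separate HDFS directory (Step 1)
SAN_DEST="/mnt/san/hdfs-backup"    # assumed: SAN mount on this machine (Step 2)
DRY_RUN="${DRY_RUN:-1}"            # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: copy the files into a separate directory on HDFS.
run hadoop fs -cp "$HDFS_SRC" "$HDFS_STAGE"

# Step 2: copy the staged files to local disk that lives on the SAN.
run hadoop fs -copyToLocal "$HDFS_STAGE" "$SAN_DEST"
```

Check the printed commands first, then rerun with DRY_RUN=0 once the paths
match your environment.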
Now let your normal backup policy take over. (Assuming that you have a policy
for backing up data stored on the SAN)
I saw Eric's post about a second cloud. That's not always possible, and not
always a good idea if all you want to do is back up data sets for remote
storage.
Note that performance will vary based on the number and size of the data
sets you want to store.
HTH
-Mike
> Date: Tue, 3 Aug 2010 06:54:41 -0700
> From: [email protected]
> To: [email protected]
> Subject: Backing up HDFS
>
>
> So I am administering a 10+ node hadoop cluster and everything is going
> swimmingly. Unfortunately, some relatively critical data is now being
> stored on the cluster and I am being asked to create a backup solution for
> hadoop in case of catastrophic failure of the data center, the application
> creating data corruption, and ultimately my company wants that warm fuzzy
> feeling that only an offsite backup can provide.
>
> So does anyone else actually back up HDFS? After a quick google and forum
> search I found the following link that creates a full backup and then
> incremental backups, anyone use this or something similar?
>
> http://blog.rapleaf.com/dev/2009/06/05/backing-up-hadoops-hdfs/
>
> Thanks in advance.
> --
> View this message in context:
> http://old.nabble.com/Backing-up-HDFS-tp29335698p29335698.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>