So I am administering a 10+ node Hadoop cluster and everything is going swimmingly. Unfortunately, some relatively critical data is now being stored on the cluster, and I am being asked to create a backup solution for Hadoop in case of catastrophic failure of the data center or application-induced data corruption. Ultimately, my company wants that warm fuzzy feeling that only an offsite backup can provide.
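For context, what I have in mind is something along these lines: a minimal sketch using `hadoop distcp` to mirror data to a second cluster. The hostnames and paths here (`prod-nn`, `backup-nn`, `/data`) are placeholders, not my actual setup, and this obviously assumes a second Hadoop cluster is reachable from the primary.

```shell
#!/bin/sh
# Hypothetical off-site HDFS backup sketch using distcp.
# prod-nn / backup-nn and the paths below are placeholder names.

SRC=hdfs://prod-nn:8020/data
DST=hdfs://backup-nn:8020/backups/data

# First run: full copy of the source tree to the backup cluster.
hadoop distcp "$SRC" "$DST"

# Later runs: -update skips files that already exist at the target
# with the same size/checksum, so only changed data is recopied.
hadoop distcp -update "$SRC" "$DST"
```

This requires a second cluster and doesn't give point-in-time snapshots by itself, which is partly why I'm asking what others do in practice.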
So does anyone else actually back up HDFS? After a quick Google and forum search I found the following link, which describes creating a full backup followed by incremental backups. Does anyone use this or something similar? http://blog.rapleaf.com/dev/2009/06/05/backing-up-hadoops-hdfs/ Thanks in advance. -- Sent from the Hadoop core-user mailing list archive at Nabble.com.
