Replication only protects against single node failure. If there's a fire and we lose the whole cluster, replication doesn't help. Or if there's human error and someone accidentally deletes data, then it's deleted from all the replicas. We want our backups to protect against all these scenarios.

On Feb 9, 2009, at 4:41 PM, Amandeep Khurana wrote:

Why would you want to have another backup beyond HDFS? HDFS itself
replicates your data, so the reliability of the system shouldn't be a
concern (if it is one at all)...

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Mon, Feb 9, 2009 at 4:17 PM, Nathan Marz wrote:

How do people back up the data they keep on HDFS? We have many TB of
data that we need to back up, but we are unclear on how to do this
efficiently and reliably.

