On Aug 3, 2010, at 9:12 AM, Eric Sammer wrote:

<snip/>
> 
> All of that said, what you're protecting against here is permanent loss of a
> data center and human error. Disk, rack, and node level failures are already
> handled by HDFS when properly configured.

You've forgotten a third cause of loss: undiscovered software bugs.

The downside of spinning disks is that one completely fatal bug can destroy
all your data in about a minute. (At my site, I famously deleted about 100TB
in 10 minutes with a scratch-space cleanup script gone awry. That was one
nasty bug.) This is why we keep good backups.

If you're very, very serious about archiving and have a huge budget, you would
invest a few million in tape silos at multiple sites, flip the
write-protection tab on the tapes, eject them, and send them off to secure
facilities.  This isn't for everyone though :)

Brian
