On Sun, Oct 18, 2009 at 9:42 AM, Bruce Bostwick <lihan161...@sbcglobal.net> wrote:
>
> On Oct 18, 2009, at 12:25 AM, Max Battcher wrote:
>
>> On 10/18/2009 0:38, John Williams wrote:
>>>
>>> On Sat, Oct 17, 2009 at 8:36 PM, Julia Thompson<f...@zurg.net> wrote:
>>>>
>>>> Er. In that sort of a situation, I myself would set up a RAID for
>>>> storing the data, *much* less chance for losing it.
>>>
>>> RAID does not protect from rm -rf / , which (some variant of) is my
>>> guess at what happened. Although now they are saying most of the data
>>> is recovered, so maybe it got munged in a reversible way.
>>
>> Any "cloud" service at this point is going to be tens, if not hundreds, of
>> servers. (Major services easily run in the thousands of servers, and if you
>> count "virtual" servers the biggest services are using millions of servers
>> already.) At this point any outage that is going to affect a service as
>> whole is generally going to be a lot subtler (and possibly a lot "nastier",
>> such an accidental viral infection due to an underlying bug/exploit in the
>> service) than a rm -rf /.
>>
>> At least, assuming the system admins are doing their jobs correctly rm -rf
>> / to a single server is extremely unlikely to cause massive outage or
>> damage... (As a service gets large enough hard drives are expected to fail
>> randomly, and surprisingly frequently, and services should be designed
>> around that problem...)
>
> And, as with a RAID except on a much larger scale, there's built in
> redundancy and error correction, so the system tends to self-heal. About
> the only threat is viral mechanisms that propagate through the system.

Never underestimate the power of human error. As this debacle demonstrates.