On 5/11/07, Lennart Sorensen <[EMAIL PROTECTED]> wrote:

> I have no idea what kinds of disks you use but I haven't seen drives
> fail very often.  Well not since I stopped dealing with IBM/Seagate SCSI
> drives.
>
> How about raid6 then?



I use Seagate, Hitachi and Maxtor.  They all have various levels of suckage
depending on the models and production runs.

Don't take my word for it.  Three papers on disk failures came out this
year, two at FAST (one from Carnegie Mellon, one from Google) and maybe a
third from somewhere else.

They are a good read, highly recommended.

http://www.usenix.org/events/fast07/tech/schroeder.html

http://www.usenix.org/events/fast07/tech/pinheiro.html


> How many thousands of machines do you deal with?



1,500 machines online, 4 drives/machine, a mix of PATA and SATA.  4,000 hard
drives in cold storage.  500 hard drives in boxes awaiting possible recovery.


> > 2 boxes with 4x500GB disks should cost close to $3K.  Mirror the data,
> > the services, etc... and sleep easy at night.
>
> And how do you keep machines mirrored constantly?  Having raid5 or 6 at
> least means a single disk failure won't take down the machine and force
> you to start up somewhere else.



Well, you can try to be fancy using drbd, or you can try to be fancier and
do it at the application layer with rsync or your own smarts, e.g. Google's
DFS or the open-source implementation in Hadoop.

Basically an exercise for the reader.

HOWEVER, your point is well taken.  There is a difference between  archival
storage and production storage.  I wouldn't have a problem using RAID5 or
RAID6 on a production machine that had a derivative copy of the golden
data.  It can give you huge performance wins under certain loads.  Backups,
or original copies of the data are not something I would put on RAID,
probably ever.
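For the production-copy case, a sketch of what that might look like with
Linux software RAID (device names, filesystem, and mount point are my
assumptions, not from the thread; this needs root and real disks):

```shell
# Hypothetical sketch: 4-disk software RAID6 for a machine holding only
# a derivative copy of the golden data.  Device names are made up.
mdadm --create /dev/md0 --level=6 --raid-devices=4 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# RAID6 survives any two simultaneous disk failures, and reads stripe
# across all members, which is where the performance win can come from.
mkfs.ext3 /dev/md0
mount /dev/md0 /srv/derived-data
```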

It's not so much that a few servers here or there with RAID controllers and
drives from various manufacturers won't run OK for time X, it's that,
statistically speaking, that "OK"ness isn't good enough.

Tape is lame and dead, so that's right out.  That leaves disk.  If there
were an earthquake and your servers tumbled over and their drives spilled
all over the place, I like the idea of walking into the pile and having some
hope that a single drive has some amount of readable, useful data on it.  If
I were using RAID5/6, it would be a jumble of parity bits and useless
controller/RAID crap.

Again, it's a stronger case for archival storage, or for your only copy.
E.g. at home, my mp3 collection could be sitting on RAID5, but it's not;
it's on RAID1, so each disk is useful on its own.
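As a sketch of why that works (device names are hypothetical): with Linux
md RAID1 and the old 0.90 metadata format, the superblock sits at the end
of the device, so each mirror member carries a complete filesystem starting
at offset 0 and can be mounted on its own:

```shell
# Hypothetical two-disk mirror; device names are made up.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs.ext3 /dev/md0

# Later, a lone surviving member can be read directly, no array needed
# (this relies on 0.90-format metadata living at the end of the device):
mount -o ro /dev/sdb1 /mnt/rescue
```

With RAID5/6 no single member is readable like this, since the data is
striped with parity across the set.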

Sincerely Off Topic with apologies for that,
Joerg
