erik quanstrom wrote:
Various studies seem to indicate failure rates are highly
correlated with drive model, vintage and manufacturer.
Assuming a RAID is built from similar disks, when one fails
the others are more likely to fail.

while it is true that some disk vintages are better than others, when
one drive fails, the probability of the other drives failing has not
changed.  this is the same as flipping a fair coin: if you flip it ten
times and get ten heads, the probability that the next flip of the
same coin comes up heads is still 1/2.
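
a minimal sketch of the independence point in python, assuming fair,
independent flips (the trial count is illustrative): conditioning on
ten heads doesn't move the eleventh flip off 1/2.

    import random

    trials = 1_000_000
    ten_heads = 0    # runs whose first ten flips were all heads
    then_heads = 0   # of those, runs whose eleventh flip was heads

    for _ in range(trials):
        flips = [random.random() < 0.5 for _ in range(11)]
        if all(flips[:10]):
            ten_heads += 1
            then_heads += flips[10]

    print(then_heads / ten_heads)   # hovers around 0.5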

i think this correlation gives people the false impression that drives
fail en masse, but that's really wrong.  the latent errors probably
happened months ago.
Yes, but if there are many latent errors and/or the error rate
is going up, it is time to replace the drive.

maybe.  the google paper you cited didn't find a strong correlation
between smart errors (including block relocation) and failure.

This is a good idea.  We did this in 1983, back when disks
were simpler beasts.  No RAID then, of course.

an even better idea back then.  disks didn't have 1/4 million
lines of firmware relocating blocks and doing other things to^w
i mean for you.

- erik



And - lest we forget - a RAID array actually has a higher statistical chance of *some* drive failing, and a *lower* mean time to first drive failure, than a single drive. Simple math.
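
To make the "simple math" concrete: if the array holds n independent
drives with (idealized) exponential lifetimes and a common MTBF, the
time to the array's first drive failure is the minimum of n lifetimes,
so its expectation drops to MTBF/n.  A minimal sketch in Python, with
made-up numbers:

    import random

    mtbf_single = 100_000   # hours per drive, illustrative only
    n = 8                   # drives in the array
    trials = 100_000

    total = 0.0
    for _ in range(trials):
        # time to first failure = min of n exponential lifetimes
        total += min(random.expovariate(1 / mtbf_single)
                     for _ in range(n))

    print(total / trials)   # approaches mtbf_single / n = 12,500 hours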

What we gain is a reduced risk of *unrecoverable* damage, not fewer failures, per se.

Bill


