Re: PATA/SATA Disk Reliability paper

2007-02-27 Thread Bill Davidsen

Stephen C Woods wrote:

   As he leans on his cane, the old codger says:
Well, disks used to come in open canisters; that is, you took the bottom
cover off, put the whole pack into the drive, and then unscrewed the top
cover and took it out.  Clearly ventilated.  c. 1975.

  Later we got sealed drives, Kennedy 180 MB Winchesters they were
called (they used IBM 3030 technology).  They had a vent pipe with two
filters; you replaced the outer one every 90 days (as part of the PM
process).  The inner one you didn't touch.  Apparently they figured that
it'd be a long time before the inner one got really clogged at 10 min of
exposure every 90 days.  c. 1980.

  Still later we had a mainframe running Un*x; it used IBM 3080 drives.
These had huge HDA boxes that were sealed but had vent filters that had
to be changed every PM (30 days, 2 hours of down time to do them
all).  c. 1985.

  So drives do need to be ventilated, not so much worry about exploding,
but rather subtle distortion of the case as the atmospheric pressure
changed.

   Does anyone remember that you had to let your drives acclimate to your
machine room for a day or so before you used them?

   Ah the good old days...
 HUH???

  scw
I remember the DSU-10, 16 million 36-bit words of storage, which not
only wanted to be acclimatized, but had platters so large, over a meter
in diameter, that there was a short crane mounting point on the box.
Failure rate went WAY down after better air filters were installed.


I think they were made for GE by CDC, but never knew for sure. GE was a
mainframe manufacturer until 1970; their big claim to fame was the
GE-645, the development platform for MULTICS. They sold the computer
business, mainframe and industrial control, in 1970 to put money into
nuclear energy, and haven't built a power plant since. Then they
developed a personal computer in 1978, built a plant to manufacture it
in Waynesboro VA, and decided there was no market for a small computer.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: PATA/SATA Disk Reliability paper

2007-02-27 Thread Bill Davidsen

Mark Hahn wrote:
In contrast, ever since these holes appeared, drive failures became the
norm.


wow, great conspiracy theory!


I think you misunderstand.  I just meant plain old-fashioned 
mis-engineering.


I should have added a smiley.  but I find it dubious that the whole
industry would have made a major bungle if so many failures are due to
the hole...


But remember, the Google report mentions a great number of drives failing for
no apparent reason, not even a SMART warning, so failing within the warranty
period is just pure luck.


are we reading the same report?  I look at it and see:

- lowest failures from medium-utilization drives, 30-35C.
- higher failures from young drives in general, but especially
if cold or used hard.
- higher failures from end-of-life drives, especially >40C.
- scan errors, realloc counts, offline realloc and probation
counts are all significant in drives which fail.

the paper seems unnecessarily gloomy about these results.  to me, they're
quite exciting, and provide good reason to pay a lot of attention to these
factors.  I hate to criticize such a valuable paper, but I think they've
missed a lot by not considering the results in a fully factorial analysis
as most medical/behavioral/social studies do.  for instance, they bemoan
a 56% false negative rate from only SMART signals, and mention that if
>40C is added, the FN rate falls to 36%.  also incorporating the low-young
risk factor would help.  I would guess that a full-on model, especially
if it incorporated utilization, age, and performance, could bring the FN
rate down to comfortable levels.
The big thing I notice is that drives with SMART errors are quite likely 
to fail, but drives which fail aren't all that likely to have SMART 
errors. So while I might proactively move a drive with errors out or to 
non-critical service, seeing no errors doesn't mean the drive won't fail.
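
A back-of-the-envelope illustration of that asymmetry, with counts that are
invented for illustration rather than taken from the paper:

    # Hypothetical fleet: shows how P(fail | SMART error) can be high while
    # P(SMART error | fail) stays low.  All counts are made up.
    failed = 6000                # assumed failures over the period
    failed_with_error = 2600     # assumed: under half of the failures showed an error first
    healthy_with_error = 1500    # assumed false alarms among drives that survived

    errors_total = failed_with_error + healthy_with_error

    p_fail_given_error = failed_with_error / errors_total   # ~63%: worth acting on an error
    p_error_given_fail = failed_with_error / failed          # ~43%: most failures gave no warning

    print(f"P(fail | SMART error) = {p_fail_given_error:.0%}")
    print(f"P(SMART error | fail) = {p_error_given_fail:.0%}")

The same counts give a high hit rate for "error implies likely failure" and a
poor one for "failure implies there was an error", which is exactly the
situation described above.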


I haven't looked at drive temp vs. ambient; I am collecting what data I
can, but I no longer have thousands of drives to monitor (I'm grateful).


Interesting speculation: on drives with cyclic load, does spinning down
off-shift help or hinder? I have two boxes full of WD, Seagate and
Maxtor drives, all cheap commodity drives, which have about 6.8 years of
power-on time, 11-14 power cycles, and 2200-2500 spin-up cycles, due to
spin-down nights and weekends. Does anyone have a large enough
collection of similar-use drives to contribute results?


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: PATA/SATA Disk Reliability paper

2007-02-26 Thread Mario 'BitKoenig' Holbe
Al Boldi [EMAIL PROTECTED] wrote:
 Interesting link.  They seem to point out that SMART does not necessarily
 warn of pending failure.  This is probably worse than not having SMART at
 all, as it gives you the illusion of safety.

If SMART gives you the illusion of safety, you didn't understand SMART.
SMART hints *only* at the potential presence or occurrence of failures in
the future; it does not prove their absence, and nobody ever said it does.
It would be impossible to prove that anyway (easy to demonstrate by just
applying an external damaging tool like a hammer).  Concluding from that
that not having any failure detector at all is better than having at least
an imperfect one is IMHO completely wrong.


regards
   Mario
-- 
File names are infinite in length where infinity is set to 255 characters.
-- Peter Collinson, The Unix File System



Re: PATA/SATA Disk Reliability paper

2007-02-26 Thread Al Boldi
Mario 'BitKoenig' Holbe wrote:
 Al Boldi [EMAIL PROTECTED] wrote:
  Interesting link.  They seem to point out that SMART does not necessarily
  warn of pending failure.  This is probably worse than not having SMART at
  all, as it gives you the illusion of safety.

 If SMART gives you the illusion of safety, you didn't understand SMART.
 SMART hints *only* at the potential presence or occurrence of failures in
 the future; it does not prove their absence, and nobody ever said it does.
 It would be impossible to prove that anyway (easy to demonstrate by just
 applying an external damaging tool like a hammer).  Concluding from that
 that not having any failure detector at all is better than having at least
 an imperfect one is IMHO completely wrong.

Agreed.  But would you then call it SMART?  Sounds rather DUMB.


Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Al Boldi
Mark Hahn wrote:
  In contrast, ever since these holes appeared, drive failures became the
  norm.

 wow, great conspiracy theory!

I think you misunderstand.  I just meant plain old-fashioned mis-engineering.

 maybe the hole is plugged at
 the factory with a substance which evaporates at 1/warranty-period ;)

Actually it's plugged with a thin paper-like filter, which does not seem to 
evaporate easily.

And it's got nothing to do with warranty, although if you get lucky and the 
failure happens within the warranty period, you can probably demand a 
replacement drive to make you feel better.

But remember, the Google report mentions a great number of drives failing for 
no apparent reason, not even a SMART warning, so failing within the warranty 
period is just pure luck.

 seriously, isn't it easy to imagine a bladder-like arrangement that
 permits equilibration without net flow?  disk spec-sheets do limit
 this - I checked the Seagate 7200.10: 10k feet operating, 40k max.
 amusingly, -200 feet is the min either way...

Well, it looks like filtered net flow on WDs.

What does it look like on Seagate?

 Does anyone remember that you had to let your drives acclimate to
  your machine room for a day or so before you used them?
 
  The problem is, that's not enough; the room temperature/humidity has to
  be controlled too.  In a desktop environment, that's not really
  feasible.

 5-90% humidity, operating, 95% non-op, and 30%/hour.  seems pretty easy
 to me.  in fact, I frequently ask people to justify the assumption that
 a good machineroom needs tight control over humidity.  (assuming, like
 most machinerooms, you aren't frequently handling the innards.)

I agree, but reality has a different opinion, and it may take down that 
drive, specs or no specs.

A good way to deal with reality is to find the real reasons for failure.  
Once these reasons are known, engineering quality drives becomes, thank GOD, 
really rather easy.


Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Mark Hahn

In contrast, ever since these holes appeared, drive failures became the
norm.


wow, great conspiracy theory!


I think you misunderstand.  I just meant plain old-fashioned mis-engineering.


I should have added a smiley.  but I find it dubious that the whole 
industry would have made a major bungle if so many failures are due to 
the hole...



But remember, the Google report mentions a great number of drives failing for
no apparent reason, not even a SMART warning, so failing within the warranty
period is just pure luck.


are we reading the same report?  I look at it and see:

- lowest failures from medium-utilization drives, 30-35C.
- higher failures from young drives in general, but especially
if cold or used hard.
- higher failures from end-of-life drives, especially >40C.
- scan errors, realloc counts, offline realloc and probation
counts are all significant in drives which fail.

the paper seems unnecessarily gloomy about these results.  to me, they're
quite exciting, and provide good reason to pay a lot of attention to these
factors.  I hate to criticize such a valuable paper, but I think they've
missed a lot by not considering the results in a fully factorial analysis
as most medical/behavioral/social studies do.  for instance, they bemoan
a 56% false negative rate from only SMART signals, and mention that if
>40C is added, the FN rate falls to 36%.  also incorporating the low-young
risk factor would help.  I would guess that a full-on model, especially
if it incorporated utilization, age, and performance, could bring the FN
rate down to comfortable levels.
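
a toy sketch of the same point: the fleet below is synthetic and the
per-signal rates are invented, so only the shape of the result matters,
not the exact numbers.

    import random

    random.seed(0)

    # Synthetic fleet: each record is (smart_signal, ran_hot, failed).
    # The probabilities are invented so that each signal alone misses many failures.
    def make_drive():
        failed = random.random() < 0.08
        if failed:
            smart = random.random() < 0.45   # assumed: SMART flags under half of failures
            hot = random.random() < 0.40     # assumed: high temp precedes some failures
        else:
            smart = random.random() < 0.03
            hot = random.random() < 0.10
        return smart, hot, failed

    fleet = [make_drive() for _ in range(100000)]
    failures = [d for d in fleet if d[2]]

    def false_negative_rate(predict):
        missed = sum(1 for d in failures if not predict(d))
        return missed / len(failures)

    print("FN rate, SMART only:  ", round(false_negative_rate(lambda d: d[0]), 2))
    print("FN rate, SMART or hot:", round(false_negative_rate(lambda d: d[0] or d[1]), 2))

adding the second (partially independent) signal knocks the miss rate down
from roughly half to roughly a third, the same flavour of improvement as the
56% -> 36% figure above.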


The problem is, that's not enough; the room temperature/humidity has to
be controlled too.  In a desktop environment, that's not really
feasible.


5-90% humidity, operating, 95% non-op, and 30%/hour.  seems pretty easy
to me.  in fact, I frequently ask people to justify the assumption that
a good machineroom needs tight control over humidity.  (assuming, like
most machinerooms, you aren't frequently handling the innards.)


I agree, but reality has a different opinion, and it may take down that
drive, specs or no specs.


why do you say this?  I have my machineroom set for 35% (which appears
to be its natural point, with a wide 20% margin on either side).

I don't really want to waste cooling capacity on dehumidification,
for instance, unless there's a good reason.


A good way to deal with reality is to find the real reasons for failure.
Once these reasons are known, engineering quality drives becomes, thank GOD,
really rather easy.


that would be great, but it depends rather heavily on there being a relatively
small number of variables, all of them manifest rather than hidden.  there are
billions of studies (in medical/behavioral/social fields) which assume large
numbers of more or less hidden variables, and which still manage good
success...

regards, mark hahn.


Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Richard Scobie

Mark Hahn wrote:


this - I checked the Seagate 7200.10: 10k feet operating, 40k max.
amusingly, -200 feet is the min either way...


Which means you could not use this drive on the shores of the Dead Sea, 
which is at about -1300ft.


Regards,

Richard


Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Al Boldi
Mark Hahn wrote:
   - disks are very complicated, so their failure rates are a
   combination of conditional failure rates of many components.
   to take a fully reductionist approach would require knowing
   how each of ~1k parts responds to age, wear, temp, handling, etc.
   and none of those can be assumed to be independent.  those are the
   real reasons, but most can't be measured directly outside a lab
   and the number of combinatorial interactions is huge.

It seems to me that the biggest problem is the 7.2k+ rpm platters 
themselves, especially with those heads flying closely on top of them.  So, 
we can probably forget the rest of the ~1k non-moving parts, as they have 
proven to be pretty reliable, most of the time.

   - factorial analysis of the data.  temperature is a good
   example, because both low and high temperature affect AFR,
   and in ways that interact with age and/or utilization.  this
   is a common issue in medical studies, which are strikingly
   similar in design (outcome is subject or disk dies...)  there
   is a well-established body of practice for factorial analysis.

Agreed.  We definitely need more sensors.

   - recognition that the relative results are actually quite good,
   even if the absolute results are not amazing.  for instance,
   assume we have 1k drives, and a 10% overall failure rate.  using
   all SMART but temp detects 64 of the 100 failures and misses 36.
   essentially, the failure rate is now .036.  I'm guessing that if
   utilization and temperature were included, the rate would be much
   lower.  feedback from active testing (especially scrubbing)
   and performance under the normal workload would also help.

Are you saying you are content with premature disk failure, as long as 
there is a SMART warning sign?

If so, then I don't think that is enough.

I think the sensors should trigger some kind of shutdown mechanism as a 
protective measure, when some threshold is reached.  Just like the 
protective measure you see for CPUs to prevent meltdown.
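
A rough host-side sketch of such a mechanism, polling the drive temperature
with smartmontools and spinning the drive down once it crosses a threshold.
The device path, the 55C limit, the polling interval and the attribute
parsing are assumptions that would need adapting per drive, and it has to
run as root:

    import subprocess
    import time

    DEVICE = "/dev/sda"    # assumed device path
    LIMIT_C = 55           # assumed shutdown threshold, not a vendor figure
    POLL_SECONDS = 60

    def drive_temp_celsius(device):
        """Read attribute 194 (Temperature_Celsius) from `smartctl -A` output."""
        out = subprocess.run(["smartctl", "-A", device],
                             capture_output=True, text=True).stdout
        for line in out.splitlines():
            fields = line.split()
            # RAW_VALUE is the 10th column of the attribute table on most drives.
            if len(fields) >= 10 and fields[1] == "Temperature_Celsius":
                return int(fields[9])
        return None

    while True:
        temp = drive_temp_celsius(DEVICE)
        if temp is not None and temp >= LIMIT_C:
            print(f"{DEVICE} at {temp}C, spinning down as a protective measure")
            subprocess.run(["hdparm", "-Y", DEVICE])   # put the drive to sleep
            break
        time.sleep(POLL_SECONDS)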

Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Mark Hahn

and none of those can be assumed to be independent.  those are the
real reasons, but most can't be measured directly outside a lab
and the number of combinatorial interactions is huge.


It seems to me that the biggest problem is the 7.2k+ rpm platters
themselves, especially with those heads flying closely on top of them.  So,
we can probably forget the rest of the ~1k non-moving parts, as they have
proven to be pretty reliable, most of the time.


dunno.  non-moving parts probably have much higher reliability, but
there are so many of them that they become a concern.  if a discrete
resistor has a 1e9-hour MTBF, 1k of them give about 1e6, and that's
starting to approach the claimed MTBF of a disk.  any lower (or more
components) and they take over as a dominant failure mode...
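
the back-of-envelope rule behind that, assuming independent parts with
constant (exponential) failure rates so that the rates simply add:

    \lambda_{\text{sys}} = \sum_{i=1}^{n} \lambda_i ,
    \qquad
    \mathrm{MTBF}_{\text{sys}} = \frac{1}{\sum_{i=1}^{n} 1/\mathrm{MTBF}_i}
    = \frac{10^{9}\ \mathrm{h}}{1000} = 10^{6}\ \mathrm{h}
    \quad (n = 1000 \text{ identical parts}).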

the Google paper doesn't really try to diagnose, but it does indicate
that metrics related to media/head problems tend to promptly lead to failure.
(scan errors, reallocations, etc.)  I guess that's circumstantial support
for your theory that crashes of media/heads are the primary failure mode.


- factorial analysis of the data.  temperature is a good
example, because both low and high temperature affect AFR,
and in ways that interact with age and/or utilization.  this
is a common issue in medical studies, which are strikingly
similar in design (outcome is subject or disk dies...)  there
is a well-established body of practice for factorial analysis.


Agreed.  We definitely need more sensors.


just to be clear, I'm not saying we need more sensors, just that the
existing metrics (including temp and utilization) need to be considered
jointly, not independently.  more metrics would be better as well,
assuming they're direct readouts, not idiot-lights...


and performance under the normal workload would also help.


Are you saying you are content with premature disk failure, as long as
there is a SMART warning sign?


I'm saying that disk failures are inevitable.  ways to reduce the chance
of data loss are what we have to focus on.  the Google paper shows that 
disks like to be at around 35C - not too cool or hot (though this is probably
conflated with utilization.)  the paper also shows that warning signs can 
indicate a majority of failures (though it doesn't present the factorial 
analysis necessary to tell which ones, how well, avoid false-positives, etc.)



I think the sensors should trigger some kind of shutdown mechanism as a
protective measure, when some threshold is reached.  Just like the
protective measure you see for CPUs to prevent meltdown.


but they already do.  persistent bad reads or writes to a block will trigger
its reallocation to spares, etc.  for CPUs, the main threat is heat, and it's 
easy to throttle to cool down.  for disks, the main threat is probably wear, 
which seems quite different - more catastrophic and less mitigatable
once it starts.

I'd love to hear from an actual drive engineer on the failure modes 
they worry about...


regards, mark hahn.


Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Benjamin Davenport


Mark Hahn wrote:
| if a discrete resistor has a 1e9 hour MTBF, 1k of them are 1e6

That's not actually true.  As a (contrived) example, consider two cases.

Case 1: failures occur at constant rate from hours 0 through 2e9.
Case 2: failures occur at constant rate from 1e9-10 hours through 1e9+10 hours.

Clearly in the former case, over 1000 components there will almost certainly be
a failure by 1e8 hours.  In the latter case, there will not be.  Yet both have
the same MTTF.


MTTF says nothing about the shape of the failure curve.  It indicates only where
its mean lies.  To compute the MTTF of 1000 devices, you'll need to know the
probability distribution of failures over time of those 1000 devices, which can
be computed from the distribution of failures over time for a single device.
But, although MTTF is derived from this distribution, you cannot reconstruct the
distribution knowing only MTTF.  In fact, the recent papers on disk failure
indicate that common assumptions about the shape of that distribution (either a
bathtub curve, or increasing failures due to wear-out after 3ish years) do not 
hold.
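
A small numerical version of that contrived example: both cases have the same
MTTF of 1e9 hours, but very different behaviour in the first 1e8 hours (the
numbers are chosen only to mirror the two cases above):

    import math

    MTTF = 1e9       # hours, identical in both cases
    HORIZON = 1e8    # look at the first 1e8 hours
    N_PARTS = 1000

    # Case 1: constant failure rate (exponential) with mean MTTF.
    p_exp = 1 - math.exp(-HORIZON / MTTF)

    # Case 2: failures confined to a narrow window around 1e9 +/- 10 hours,
    # so nothing at all fails by 1e8 hours.
    p_window = 0.0

    def p_any_failure(p_single, n):
        """Probability that at least one of n independent parts has failed."""
        return 1 - (1 - p_single) ** n

    print("constant rate:  per part", round(p_exp, 4),
          " any of 1000:", round(p_any_failure(p_exp, N_PARTS), 6))
    print("narrow window:  per part", p_window,
          " any of 1000:", p_any_failure(p_window, N_PARTS))

The first case is all but certain to see a failure among 1000 parts by 1e8
hours; the second sees none, despite the identical MTTF.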

-Ben


Re: PATA/SATA Disk Reliability paper

2007-02-25 Thread Mark Hahn

| if a discrete resistor has a 1e9 hour MTBF, 1k of them are 1e6

That's not actually true.  As a (contrived) example, consider two cases.


if you know nothing else, it's the best you can do.  it's also a 
conservative estimate (where conservative means to expect a failure sooner).



distribution knowing only MTTF.  In fact, the recent papers on disk failure
indicate that common assumptions about the shape of that distribution (either a
bathtub curve, or increasing failures due to wear-out after 3ish years) do
not hold.


the data in both the Google and Schroeder/Gibson papers are fairly noisy.
yes, the strong bathtub hypothesis is apparently wrong (that infant
mortality is an exponentially decreasing failure rate over the first year,
that disks stay at a constant failure rate for the next 4-6 years, then
have an exponentially increasing failure rate).


both papers, though, show what you might call a swimming pool curve:
a short period of high mortality (clock starts when the drive leaves 
the factory) with a minimum failure rate at about 1 year.  that's the 
deep end of the pool ;)  then increasing failures out to the end of 
expected service life (warranty period).  what happens after is probably
too noisy to conclude much, since most people prefer not to use disks 
which have already seen the death of ~25% of their peers.  (Google's 
paper has, hallelujah, error bars showing high variance at 3 years.)


both papers (and most people's experience, I think) agree that:
- there may be an infant mortality curve, but it depends on
when you start counting, conditions and load in early life, etc.
- failure rates increase with age.
- failure rates in the prime of life are dramatically higher
than the vendor spec sheets.
- failure rates in senescence (post warranty) are very bad.

after all, real bathtubs don't have flat bottoms!
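
purely for illustration, a hypothetical "swimming pool" AFR-by-age table with
made-up numbers that follow the shape described above (not values from either
paper):

    # Hypothetical annualized failure rate (AFR) by drive age in years.
    # Shape only: early mortality, a minimum around year 1, then rising
    # through and past the warranty period.  The values are invented.
    afr_by_age = {
        0: 0.05,   # shallow end: infant mortality
        1: 0.02,   # deep end: minimum failure rate
        2: 0.06,
        3: 0.08,
        4: 0.10,   # end of a typical warranty
        5: 0.14,   # senescence, noisy data
    }

    for age, afr in afr_by_age.items():
        print(f"year {age}: {afr:5.0%} " + "#" * int(afr * 200))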

as for models and fits, well, it's complicated.  consider that in a lot
of environments, it takes a year or two for a new disk array to fill.
so a wear-related process will initially be focused on a small area of 
disk, perhaps not even spread across individual disks.  or consider that
once the novelty of a new installation wears off, people get more worried
about failures, perhaps altering their replacement strategy...


Re: PATA/SATA Disk Reliability paper

2007-02-24 Thread Mark Hahn

In contrast, ever since these holes appeared, drive failures became the norm.


wow, great conspiracy theory!  maybe the hole is plugged at 
the factory with a substance which evaporates at 1/warranty-period ;)


seriously, isn't it easy to imagine a bladder-like arrangement that 
permits equilibration without net flow?  disk spec-sheets do limit
this - I checked the Seagate 7200.10: 10k feet operating, 40k max.
amusingly, -200 feet is the min either way...


   Does anyone remember that you had to let your drives acclimate to your
machine room for a day or so before you used them?


The problem is, that's not enough; the room temperature/humidity has to be
controlled too.  In a desktop environment, that's not really feasible.


5-90% humidity, operating, 95% non-op, and 30%/hour.  seems pretty easy
to me.  in fact, I frequently ask people to justify the assumption that 
a good machineroom needs tight control over humidity.  (assuming, like 
most machinerooms, you aren't frequently handling the innards.)



Re: PATA/SATA Disk Reliability paper

2007-02-23 Thread Al Boldi
Stephen C Woods wrote:
   So drives do need to be ventilated, not so much worry about exploding,
 but rather subtle distortion of the case as the atmospheric pressure
 changed.

I have a '94 Caviar without any apparent holes; and as a bonus, the drive 
still works.

In contrast, ever since these holes appeared, drive failures became the norm.

Does anyone remember that you had to let your drives acclimate to your
 machine room for a day or so before you used them?

The problem is, that's not enough; the room temperature/humidity has to be 
controlled too.  In a desktop environment, that's not really feasible.


Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-22 Thread Nix
On 20 Feb 2007, Al Boldi outgrape:
 Eyal Lebedinsky wrote:
 Disks are sealed, and a desiccant is present in each to keep humidity down.
 If you ever open a disk drive (e.g. for the magnets, or the mirror-quality
 platters, or for fun) then you can see the desiccant sachet.

 Actually, they aren't sealed 100%.  

I'd certainly hope not, unless you like the sound of imploding drives
when you carry one up a mountain.

 On wd's at least, there is a hole with a warning printed on its side:

   DO NOT COVER HOLE BELOW
   V   V  V  V

   o

I suspect that's for air-pressure equalization.

 In contrast, older models from the last century don't have that hole.

It was my understanding that disks have had some way of equalizing
pressure with their surroundings for many years; but I haven't verified
this so you may well be right that this is a recent thing. (Anyone know
for sure?)

-- 
`In the future, company names will be a 32-character hex string.'
  --- Bruce Schneier on the shortage of company names


Re: PATA/SATA Disk Reliability paper

2007-02-22 Thread Nix
On 22 Feb 2007, [EMAIL PROTECTED] uttered the following:

 On 20 Feb 2007, Al Boldi outgrape:
 Eyal Lebedinsky wrote:
 Disks are sealed, and a desiccant is present in each to keep humidity down.
 If you ever open a disk drive (e.g. for the magnets, or the mirror-quality
 platters, or for fun) then you can see the desiccant sachet.

 Actually, they aren't sealed 100%.  

 I'd certainly hope not, unless you like the sound of imploding drives
 when you carry one up a mountain.

Or even exploding drives. (Oops.)

-- 
`In the future, company names will be a 32-character hex string.'
  --- Bruce Schneier on the shortage of company names


Re: PATA/SATA Disk Reliability paper

2007-02-22 Thread Stephen C Woods
   As he leans on his cane, the old codger says:
Well, disks used to come in open canisters; that is, you took the bottom
cover off, put the whole pack into the drive, and then unscrewed the top
cover and took it out.  Clearly ventilated.  c. 1975.

  Later we got sealed drives, Kennedy 180 MB Winchesters they were
called (they used IBM 3030 technology).  They had a vent pipe with two
filters; you replaced the outer one every 90 days (as part of the PM
process).  The inner one you didn't touch.  Apparently they figured that
it'd be a long time before the inner one got really clogged at 10 min of
exposure every 90 days.  c. 1980.

  Still later we had a mainframe running Un*x; it used IBM 3080 drives.
These had huge HDA boxes that were sealed but had vent filters that had
to be changed every PM (30 days, 2 hours of down time to do them
all).  c. 1985.

  So drives do need to be ventilated, not so much worry about exploding,
but rather subtle distortion of the case as the atmospheric pressure
changed.

   Does anyone remember that you had to let your drives acclimate to your
machine room for a day or so before you used them?

   Ah the good old days...
 HUH???

  scw


On Thu, Feb 22, 2007 at 10:27:43PM +, Nix wrote:
 On 20 Feb 2007, Al Boldi outgrape:
  Eyal Lebedinsky wrote:
  Disks are sealed, and a desiccant is present in each to keep humidity down.
  If you ever open a disk drive (e.g. for the magnets, or the mirror-quality
  platters, or for fun) then you can see the desiccant sachet.
 
  Actually, they aren't sealed 100%.  
 
 I'd certainly hope not, unless you like the sound of imploding drives
 when you carry one up a mountain.
 
  On wd's at least, there is a hole with a warning printed on its side:
 
DO NOT COVER HOLE BELOW
V   V  V  V
 
o
 
 I suspect that's for air-pressure equalization.
 
  In contrast, older models from the last century don't have that hole.
 
 It was my understanding that disks have had some way of equalizing
 pressure with their surroundings for many years; but I haven't verified
 this so you may well be right that this is a recent thing. (Anyone know
 for sure?)
 
 -- 
 `In the future, company names will be a 32-character hex string.'
   --- Bruce Schneier on the shortage of company names

-- 
-
Stephen C. Woods; UCLA SEASnet; 2567 Boelter hall; LA CA 90095; (310)-825-8614
Unless otherwise noted these statements are my own, Not those of the 
University of California.  Internet mail:[EMAIL PROTECTED]


Re: PATA/SATA Disk Reliability paper

2007-02-20 Thread Al Boldi
Eyal Lebedinsky wrote:
 Disks are sealed, and a desiccant is present in each to keep humidity down.
 If you ever open a disk drive (e.g. for the magnets, or the mirror-quality
 platters, or for fun) then you can see the desiccant sachet.

Actually, they aren't sealed 100%.  

On wd's at least, there is a hole with a warning printed on its side:

  DO NOT COVER HOLE BELOW
  V   V  V  V

  o


In contrast, older models from the last century don't have that hole.

 Al Boldi wrote:
 
  If there is one thing to watch out for, it is dew.
 
  I remember video machines sensing for dew, so do any drives sense for
  dew?


Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-19 Thread Al Boldi
Richard Scobie wrote:
 Thought this paper may be of interest. A study done by Google on over
 100,000 drives they have/had in service.

 http://labs.google.com/papers/disk_failures.pdf

Interesting link.  They seem to point out that SMART does not necessarily
warn of pending failure.  This is probably worse than not having SMART at
all, as it gives you the illusion of safety.

If there is one thing to watch out for, it is dew.

I remember video machines sensing for dew, so do any drives sense for dew?


Thanks!

--
Al



Re: PATA/SATA Disk Reliability paper

2007-02-19 Thread Eyal Lebedinsky
Disks are sealed, and a desiccant is present in each to keep humidity down.
If you ever open a disk drive (e.g. for the magnets, or the mirror-quality
platters, or for fun) then you can see the desiccant sachet.

cheers

Al Boldi wrote:
 Richard Scobie wrote:
 
Thought this paper may be of interest. A study done by Google on over
100,000 drives they have/had in service.

http://labs.google.com/papers/disk_failures.pdf
 
 
 Interesting link.  They seem to point out that SMART does not necessarily
 warn of pending failure.  This is probably worse than not having SMART at
 all, as it gives you the illusion of safety.
 
 If there is one thing to watch out for, it is dew.
 
 I remember video machines sensing for dew, so do any drives sense for dew?
 
 
 Thanks!
 
 --
 Al

-- 
Eyal Lebedinsky ([EMAIL PROTECTED]) http://samba.org/eyal/
attach .zip as .dat


Re: PATA/SATA Disk Reliability paper

2007-02-19 Thread H. Peter Anvin

Richard Scobie wrote:
Thought this paper may be of interest. A study done by Google on over 
100,000 drives they have/had in service.


http://labs.google.com/papers/disk_failures.pdf



Bastards:

Failure rates are known to be highly correlated with drive
models, manufacturers and vintages [18]. Our results do
not contradict this fact. For example, Figure 2 changes
significantly when we normalize failure rates per each
drive model. Most age-related results are impacted by
drive vintages. However, in this paper, we do not show a
breakdown of drives per manufacturer, model, or vintage
due to the proprietary nature of these data.

-hpa