Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Wols Lists
On 23/06/20 19:32, Sid Spry wrote:
> The danger of SMART is that the rate of false negatives is so high (IME)
> that you might erroneously think a drive is not going to fail and put off a
> backup. A good backup policy should mitigate this, but you still might plan
> around the drive lifetime SMART predicts before realizing these are or can
> be bad predictions.

Thing is, SMART is best at predicting PLATTER FAILURE. Expecting it to
tell you that other parts of the drive are going to fail is like
expecting a car mechanic to inspect your plumbing :-)

And it's probably the marketeers closing the stable door after the horse
has bolted - platters USED to be the least reliable part of the drive.
Now we've got lead-free eco-solder it's the PCB that's most likely to
fail ...

Cheers,
Wol



Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Rich Freeman
On Tue, Jun 23, 2020 at 3:37 PM Dale wrote:
>
> I'm sure there are many false positives out there but ignoring the real 
> positives isn't a good solution either.  By all means, if one wants to just 
> wing it and hope for the best, disable SMART and take the risk.  At some 
> point, a drive will fail and without SMART, likely with no warning at all, 
> not even a false one.  ;-)
>

Agree in general, but your best practice is to be in a position where
you don't care if a drive fails without warning.  Of course, warning
might be nice so that you can go ahead and start replacing it or
ordering spares.

-- 
Rich



Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Dale
Sid Spry wrote:
>
> On Tue, Jun 23, 2020, at 12:26 PM, Dale wrote:
>>  SMART can't predict the future so it can only monitor for the things 
>> it can see. If say a spindle bearing is about to lock up suddenly, 
>> SMART most likely can't detect that since it is a hardware failure that 
>> can't really be predicted. We may be able to hear a strange sound if we're 
>> lucky but if it happens suddenly, it may not even do that. While SMART 
>> can't predict all points of failure, it can detect a lot of them. Even 
>> if the two drives I had failed with no warning from SMART, I'd still 
>> run it and monitor it. Using SMART can warn you in certain situations. 
>> If a person doesn't run SMART, they will miss those warnings. 
>>
>>  SMART isn't perfect but it is better than not having it at all. 
>>
> Well, in theory SMART should be able to predict hardware failures like
> that through N-th order effects that percolate up to read and write
> statistics. In practice it seems to be guessing badly.
>
> The danger of SMART is that the rate of false negatives is so high (IME)
> that you might erroneously think a drive is not going to fail and put off a
> backup. A good backup policy should mitigate this, but you still might plan
> around the drive lifetime SMART predicts before realizing these are or can
> be bad predictions.
>
>


Thing is, drives fail at some point.  SMART, while not perfect, can
detect problems that indicate a failure.  Let's say, for example, that a
person decides not to run SMART at all because of the false positives.
What are they going to use to figure out if a drive is working like it
should?  Is the drive having problems reading or writing, or finding
corrupt data, which would be a sign of a problem?  Is it about to fail
somehow?  It's not like there is really any other tool that does this.
If one doesn't use the tool, they can have a failure that they could
have been warned about, one where they could have lost no data or very
little.  If a person runs it though, at least they have something that
can detect some failures and prevent data loss.

It's safer to run SMART and get notified when it detects a problem than
it is to not run it and have no way of knowing there is a problem at
all.  Sure, backups are something everyone should do for important
data.  I have backups here, multiple backups of some data.  Still, I run
SMART and pay attention to the emails it sends when something is not
right.  In the past, it has saved me from data loss. 

I'm sure there are many false positives out there but ignoring the real
positives isn't a good solution either.  By all means, if one wants to
just wing it and hope for the best, disable SMART and take the risk.  At
some point, a drive will fail and without SMART, likely with no warning
at all, not even a false one.  ;-)

Dale

:-)  :-) 


Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Sid Spry



On Tue, Jun 23, 2020, at 12:26 PM, Dale wrote:
>  SMART can't predict the future so it can only monitor for the things 
> it can see. If say a spindle bearing is about to lock up suddenly, 
> SMART most likely can't detect that since it is a hardware failure that 
> can't really be predicted. We may be able to hear a strange sound if we're 
> lucky but if it happens suddenly, it may not even do that. While SMART 
> can't predict all points of failure, it can detect a lot of them. Even 
> if the two drives I had failed with no warning from SMART, I'd still 
> run it and monitor it. Using SMART can warn you in certain situations. 
> If a person doesn't run SMART, they will miss those warnings. 
> 
>  SMART isn't perfect but it is better than not having it at all. 
> 

Well, in theory SMART should be able to predict hardware failures like
that through N-th order effects that percolate up to read and write
statistics. In practice it seems to be guessing badly.

The danger of SMART is that the rate of false negatives is so high (IME)
that you might erroneously think a drive is not going to fail and put off a
backup. A good backup policy should mitigate this, but you still might plan
around the drive lifetime SMART predicts before realizing these are or can
be bad predictions.
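
To put a number on the false-negative worry, here is a minimal sketch using
Bayes' rule. All three inputs are hypothetical figures chosen only for
illustration, not measurements from this thread or from any study, and the
function name is made up for this example.

```python
def failure_risk_given_clean_smart(base_rate, detection_rate, false_alarm_rate):
    """Return P(drive fails | SMART reported nothing), via Bayes' rule.

    base_rate        -- fraction of drives that fail in the period
    detection_rate   -- fraction of failing drives SMART warns about in time
    false_alarm_rate -- fraction of healthy drives SMART warns about anyway
    """
    p_fail_and_clean = base_rate * (1.0 - detection_rate)
    p_ok_and_clean = (1.0 - base_rate) * (1.0 - false_alarm_rate)
    return p_fail_and_clean / (p_fail_and_clean + p_ok_and_clean)


# Hypothetical inputs, purely for illustration.
BASE_RATE = 0.02         # assume 2% of drives fail per year
DETECTION_RATE = 0.60    # assume SMART warns ahead of 60% of failures
FALSE_ALARM_RATE = 0.05  # assume 5% of healthy drives raise a warning

risk = failure_risk_given_clean_smart(BASE_RATE, DETECTION_RATE, FALSE_ALARM_RATE)
print(f"Failure risk with a clean SMART report: {risk:.2%}")
```

With these made-up inputs a clean report lowers the risk from 2% to a bit
under 1%, which is why a clean report is no reason to put off a backup.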



Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Dale
Sid Spry wrote:
> On Tue, Jun 23, 2020, at 11:38 AM, Grant Edwards wrote:
>> Which is better than not knowing until the drive is failed and
>> offline. :)
>>
> But redundant if the drive degradation is obvious. In two cases I
> can think of, drives only reported SMART will-fail after the drives
> had hard failed. In the other cases performance was so degraded
> it was obvious it was the drive.
>
>


I've had two hard drive failures that SMART warned me about.  If not for
SMART I wouldn't have noticed the drives having issues until much
later.  Maybe even after losing a lot of data.  In both of those cases,
I lost no data at all.  I was able to recover everything off the drive. 

SMART can't predict the future so it can only monitor for the things it
can see.  If say a spindle bearing is about to lock up suddenly, SMART
most likely can't detect that since it is a hardware failure that can't
really be predicted.  We may be able to hear a strange sound if we're lucky
but if it happens suddenly, it may not even do that.  While SMART can't
predict all points of failure, it can detect a lot of them.  Even if
the two drives I had failed with no warning from SMART, I'd still run it
and monitor it.  Using SMART can warn you in certain situations.  If a
person doesn't run SMART, they will miss those warnings. 

SMART isn't perfect but it is better than not having it at all. 

Dale

:-)  :-) 


Re: [gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Sid Spry
On Tue, Jun 23, 2020, at 11:38 AM, Grant Edwards wrote:
> 
> Which is better than not knowing until the drive is failed and
> offline. :)
> 

But redundant if the drive degradation is obvious. In two cases I
can think of, drives only reported SMART will-fail after the drives
had hard failed. In the other cases performance was so degraded
it was obvious it was the drive.



[gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-23 Thread Grant Edwards
On 2020-06-23, Sid Spry wrote:

> Thanks for these. I do have a general question: has SMART actually shown
> anyone predictive capability?

Sort of.  It noticed the initial failures and e-mailed me a warning
long before I would have otherwise noticed.  I lost a couple files,
but without SMART I probably wouldn't have noticed any failures for a
long time (probably not until something fairly catastrophic happened).

> In my use and in the use of 4-5 people I know it only makes you
> aware of errors after the drive is failing but still online.

Which is better than not knowing until the drive is failed and
offline. :)

--
Grant




[gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-15 Thread Grant Edwards
On 2020-06-15, Grant Edwards wrote:

> backblocks was designed to do what you want.
...
> babblocks would be a good start.

Geez, I can't even mistype "badblocks" consistently...

--
Grant





[gentoo-user] Re: Testing a used hard drive to make SURE it is good.

2020-06-15 Thread Grant Edwards
On 2020-06-15, Dale wrote:
> Howdy,
>
> I finally bought an 8TB drive.  It is used but they claim only for a
> short duration.  Still, I want to test it to be sure it is in grade
> A shape before putting a lot of data on it and depending on it.  I
> am familiar with some tools already.  I know about SMART but it is
> not always 100%.   It seems to catch most problems but not all.  I'm
> familiar with dd and writing all zeroes or random data to it to see if
> it can in fact write to all the parts of the drive but it is slow.

It takes a long time to write 8TB no matter what tool you're using.

> It can take a long time to write and fill up an 8TB drive. Days
> maybe?

I would guess several days.

> I googled and found a new tool but not sure how accurate it is since
> I've never used it before.  The command is badblocks.  It is
> installed on my system so I'm just curious as to what it will catch
> that others won't.  Is it fast or slow like dd?

backblocks was designed to do what you want.  For an 8TB drive, it
will probably take most of a week.
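
A rough back-of-envelope sketch of where "most of a week" comes from. The
150 MB/s sustained throughput is an assumption (real drives slow down
toward the inner tracks), and the helper name is made up for this example.
The pass count follows from the destructive write test (badblocks -w)
writing and then reading back four patterns, i.e. eight full passes.

```python
def hours_for_passes(capacity_bytes, throughput_bytes_per_s, passes=1):
    """Hours needed to stream over the whole drive `passes` times."""
    return capacity_bytes * passes / throughput_bytes_per_s / 3600


CAPACITY = 8 * 10**12      # 8 TB, decimal as drive vendors count it
THROUGHPUT = 150 * 10**6   # ASSUMPTION: 150 MB/s average sustained rate

# One full write pass, e.g. dd writing zeroes over the whole drive.
single_pass_hours = hours_for_passes(CAPACITY, THROUGHPUT)

# badblocks -w: 4 patterns, each written then read back = 8 passes.
badblocks_hours = hours_for_passes(CAPACITY, THROUGHPUT, passes=8)

print(f"single pass:  {single_pass_hours:.1f} hours")
print(f"badblocks -w: {badblocks_hours / 24:.1f} days")
```

Under that assumed rate a single pass comes out well under a day, while the
full write test lands at several days. A slower drive or a USB enclosure
would stretch both figures.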

> I plan to run the SMART test anyway.  It'll take several hours but
> I'd like to run some other test to catch errors that SMART may
> miss.  If there is such a tool that does that.  If you bought a used
> drive, what would you run other than the long version of SMART and
> its test?

babblocks would be a good start.

You could also use stress-ng with the "hdd" options:

https://packages.gentoo.org/packages/app-benchmarks/stress-ng
https://wiki.gentoo.org/wiki/User:Maffblaster/Drafts/stress-ng
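
For the SMART-plus-badblocks part, one possible order of operations could be
sketched like this. It only prints the commands and runs nothing. The
function name and the /dev/sdX placeholder are made up for illustration, and
the 4096-byte block size is an assumption, added because badblocks is known
to refuse very large drives at its default block size.

```python
import shlex


def used_drive_test_plan(device, block_size=4096):
    """Return the commands, in order, as argv lists. Nothing is executed."""
    return [
        # Record the SMART attributes before touching the drive.
        ["smartctl", "-a", device],
        # Start the long self-test; it runs inside the drive, so wait for
        # it to finish before starting the next step.
        ["smartctl", "-t", "long", device],
        # DESTRUCTIVE write test: wipes everything on the device.
        # -w write mode, -s show progress, -v verbose, -b block size.
        ["badblocks", "-w", "-s", "-v", "-b", str(block_size), device],
        # Compare the attributes again (reallocated and pending sectors).
        ["smartctl", "-a", device],
    ]


if __name__ == "__main__":
    # /dev/sdX is a placeholder; substitute the real device with care.
    for argv in used_drive_test_plan("/dev/sdX"):
        print(shlex.join(argv))
```

Printing instead of executing is deliberate: the write test destroys all
data on the target, so the device name deserves a second look first.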

--
Grant