Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread Akemi Yagi
On Wed, Sep 10, 2008 at 9:41 PM, MHR [EMAIL PROTECTED] wrote:
 I decided, after the last discussion of smartd and S.M.A.R.T. disks,
 to take a look in my /var/log/messages, and I'm seeing a fair bit of
 this:

 Sep 10 20:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
 Offline uncorrectable sectors
 Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
 unreadable (pending) sectors
(snip)
 Google is not particularly informative on this subject - anyone know
 more than general suggestions about dd, badblocks, etc.?  This is my
 boot and primary system disk (has been for some time), but the error
 message is essentially meaningless (to me, right now).

You should start thinking of replacing the disk.  There is a
discussion in the forum:

http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15880&forum=39

I am one of the people there who were getting the same error and
replaced the disk.

Akemi / toracat
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread Anne Wilson
On Thursday 11 September 2008 08:02:23 Akemi Yagi wrote:
 You should start thinking of replacing the disk.  There is a
 discussion in the forum:

 http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15880&forum=39

 I am one of the people there who were getting the same error and
 replaced the disk.

I had similar messages on this laptop.  Acer accepted liability and replaced 
the disk.

Anne




Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread MHR
On Wed, Sep 10, 2008 at 10:14 PM, nate [EMAIL PROTECTED] wrote:

 Download the manufacturer's tools and run a diagnostics on it,
 it will tell you the truth about what's going on.

 I wouldn't trust any generic OS tools over the manufacturer's tools,
 there was a discussion on this topic on this list I think not too
 long ago. The biggest gotcha with the vendor tools though is
 they are usually limited in the types of disk controllers they
 support.


I was going to laugh this off 'cuz how many manufacturers support
Linux, but I was pleasantly surprised, twice, when I found that a)
Seagate does and b) the seatools for Linux produced no errors on the
long test.

It also told me lots of interesting information that I don't recall at
the moment, not the least of which was that the drive does not support
DST (the on-board diagnostics test), which I thought was odd.

Based on some of the other responses, I think I'll run smartctl to see
what it says, but that still doesn't really answer the question about
the number (4294967295 which happens to be 0xFFFFFFFF).  There are only
a little over half a billion sectors on the disk in total - how could 4.3
billion of them be bad?

I'm thinking it's more likely a 32-bit v. 64-bit issue, but I haven't
finished looking at that yet.
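
A quick check of the arithmetic behind that suspicion, as a minimal sketch. It assumes 512-byte sectors and decimal gigabytes, neither of which is stated in the thread, and the helper names are made up for illustration. Under those assumptions the reported count is exactly 2**32 - 1 and is larger than the number of sectors a 300GB drive can hold, which points to an all-ones 32-bit value (for example, -1 read back as unsigned) rather than a real tally. That explanation is plausible but not confirmed here.

```python
# Assumptions (not stated in the thread): 512-byte sectors, decimal gigabytes.
REPORTED = 4294967295          # count logged by smartd for /dev/sda
ALL_ONES_32 = 0xFFFFFFFF       # largest unsigned 32-bit value, 2**32 - 1


def sector_count(capacity_bytes, sector_size=512):
    """Number of whole sectors on a disk of the given capacity."""
    return capacity_bytes // sector_size


def is_plausible(count, capacity_bytes, sector_size=512):
    """A bad-sector count cannot exceed the total number of sectors."""
    return 0 <= count <= sector_count(capacity_bytes, sector_size)


total = sector_count(300 * 10**9)
print(REPORTED == ALL_ONES_32 == 2**32 - 1)   # True
print(total)                                   # 585937500
print(is_plausible(REPORTED, 300 * 10**9))     # False
print(is_plausible(21, 120 * 10**9))           # True
```

The count of 21 on the 120GB drive passes the same check, so that one reads as a genuine tally.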

One other thing that I find interesting: the drives that are showing
smart errors are /dev/hdb and /dev/sda.  In order from oldest to
newest, my drives are:

/dev/hdb - Maxtor 120GB PATA
/dev/hda - Maxtor 160GB PATA
/dev/sda - Seagate 300GB SATA
/dev/sdb - WD 320GB SATA

The older of each of the PATA and SATA drives are the ones showing the
errors.

Thanks.

mhr


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread MHR
On Thu, Sep 11, 2008 at 12:02 AM, Akemi Yagi [EMAIL PROTECTED] wrote:

 You should start thinking of replacing the disk.

I am, thanks.

 There is a discussion in the forum:

 http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15880&forum=39

 I am one of the people there who were getting the same error and
 replaced the disk.


Scary stuff, to some extent.  I should probably point out that the
Maxtor 120GB PATA drive (the one with errors I believe are real) had a
power connector problem for a while that may have damaged it, but I
haven't seen anything funny with it since, and that was back when I
changed the CPU/MB in March, 2007 and then the power supply about a
month later when that burned out altogether.

I'm watching it now!

mhr


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread MHR
On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
[EMAIL PROTECTED] wrote:

 I had similar messages on this laptop.  Acer accepted liability and replaced
 the disk.

I'm pretty sure the Seagate warranty is no longer in force - most of
them are a year.  Maxtor's are, for sure - I've had to go through that
once before, and they were quite cooperative, too, but that was a few
years back (before Seagate bought them).

mhr


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread nate
MHR wrote:

 I was going to laugh this off 'cuz how many manufacturers support
 Linux, but I was pleasantly surprised, twice, when I found that a)
 Seagate does and b) the seatools for Linux produced no errors on the
 long test.

I wasn't aware there was a seatools for Linux, I meant to refer to
the bootable versions of the tools that run outside of any OS.

But perhaps the vendor tools have improved and can reliably detect
faults from within an OS, it's been several years since I've had to
use them.

nate



Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread John R Pierce

MHR wrote:
 On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
 [EMAIL PROTECTED] wrote:

  I had similar messages on this laptop.  Acer accepted liability and replaced
  the disk.

 I'm pretty sure the Seagate warranty is no longer in force - most of
 them are a year.  Maxtor's are, for sure - I've had to go through that
 once before, and they were quite cooperative, too, but that was a few
 years back (before Seagate bought them).

Depending on the drive and how it was sold, Seagate drives can have a 3
or even 5 year warranty.

OTOH, major OEM stuff sold embedded in a packaged system is the
responsibility of the OEM warranty (HP, Dell, etc.).  With 'whitebox' OEM
stuff bought as parts at computer stores, you're the OEM, and they have
some level of warranty from Seagate, but I forget what it is
specifically; it's likely to be 1 year.





Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread Matt Hyclak
On Thu, Sep 11, 2008 at 11:07:25AM -0700, MHR enlightened us:
 On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
 [EMAIL PROTECTED] wrote:
 
  I had similar messages on this laptop.  Acer accepted liability and replaced
  the disk.
 
 I'm pretty sure the Seagate warranty is no longer in force - most of
 them are a year.  Maxtor's are, for sure - I've had to go through that
 once before, and they were quite cooperative, too, but that was a few
 years back (before Seagate bought them).
 

Seagate has a 5 year warranty on its drives. You might check again.

Matt

-- 
Matt Hyclak
Systems and Operations 
Office of Information Technology
Ohio University
(740) 593-1222




Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread Les Mikesell

Matt Hyclak wrote:
 On Thu, Sep 11, 2008 at 11:07:25AM -0700, MHR enlightened us:
  On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
  [EMAIL PROTECTED] wrote:

   I had similar messages on this laptop.  Acer accepted liability and replaced
   the disk.

  I'm pretty sure the Seagate warranty is no longer in force - most of
  them are a year.  Maxtor's are, for sure - I've had to go through that
  once before, and they were quite cooperative, too, but that was a few
  years back (before Seagate bought them).

 Seagate has a 5 year warranty on its drives. You might check again.

Just put the serial number in here:
http://support.seagate.com/customer/warranty_validation.jsp

--
  Les Mikesell
   [EMAIL PROTECTED]



[CentOS] Alarming (?) smartd reports

2008-09-11 Thread R P Herrold

On Thu, 11 Sep 2008, MHR wrote:
 On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
 [EMAIL PROTECTED] wrote:

  I had similar messages on this laptop.  Acer accepted liability and replaced
  the disk.

 I'm pretty sure the Seagate warranty is no longer in force - most of
 them are a year.

You patronize the wrong vendors -- a 5 year Seagate
warranty and low defects are my experience.


-- Russ herrold


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread Chris Geldenhuis

MHR wrote:
 On Thu, Sep 11, 2008 at 12:26 AM, Anne Wilson
 [EMAIL PROTECTED] wrote:

  I had similar messages on this laptop.  Acer accepted liability and replaced
  the disk.

 I'm pretty sure the Seagate warranty is no longer in force - most of
 them are a year.  Maxtor's are, for sure - I've had to go through that
 once before, and they were quite cooperative, too, but that was a few
 years back (before Seagate bought them).

 mhr
  
I recently returned a disk that was 2 years old to the Seagate agents
here in SA. They swapped it out without any hassles - as far as I
remember they then said that the warranty was 5 years.


ChrisG


Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread William L. Maltby

On Thu, 2008-09-11 at 11:03 -0700, MHR wrote:
 On Wed, Sep 10, 2008 at 10:14 PM, nate [EMAIL PROTECTED] wrote:
 
  Download the manufacturer's tools and run a diagnostics on it,
  it will tell you the truth about what's going on.
 
  I wouldn't trust any generic OS tools over the manufacturer's tools,
 snip

 I was going to laugh this off 'cuz how many manufacturers support
 Linux, but I was pleasantly surprised, twice, when I found that a)
 Seagate does and b) the seatools for Linux produced no errors on the
 long test.

IIRC, the seatools just run the smart tools that come on CentOS/Linux.
Not the same as those on the DOS tools version. It's been several
months, but barring memory failures (mine, not the computer's  ;-) I
ended up downloading the DOS ones so that I could do the repair and run
the real magilla.

 
 It also told me lots of interesting information that I don't recall at
 the moment, not the least of which was that the drive does not support
 DST (the on-board diagnostics test), which I thought was odd.

Try the DOS version. I bet the lack of that support is in the standard
*IX smart tools, not the drive.

 snip

 One other thing that I find interesting: the drives that are showing
 smart errors are /dev/hdb and /dev/sda.  In order from oldest to
 newest, my drives are:
 
 /dev/hdb - Maxtor 120GB PATA
 /dev/hda - Maxtor 160GB PATA
 /dev/sda - Seagate 300GB SATA
 /dev/sdb - WD 320GB SATA
 
 The older of each of the PATA and SATA drives are the ones showing the
 errors

If all drives left the factory in great shape, it is natural that the
older ones would show an error first. Often just a weak spot or two
that passed mfg tests and finally failed as they aged. That's why I
don't worry about them (I don't have data center servers to the world
here at home) as long as the repair utilities run successfully and then
no more show up for a long time. If they start coming in frequent
bursts, then it's time to act.
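
The rule of thumb above (ignore an occasional new bad sector, act when they arrive in frequent bursts) can be sketched roughly as follows. This is only an illustration: the 30-day window, the threshold of two increases, and the function names are hypothetical choices, not anything the thread or smartd prescribes, and the sample readings are made up.

```python
from datetime import datetime, timedelta


def increases(samples):
    """Timestamps at which the count rose relative to the previous sample.

    `samples` is a chronological list of (timestamp, pending_sector_count).
    """
    return [t for (_, prev), (t, cur) in zip(samples, samples[1:]) if cur > prev]


def frequent_bursts(samples, window=timedelta(days=30), max_increases=2):
    """True if more than `max_increases` rises fall inside any one window."""
    rises = increases(samples)
    for i, start in enumerate(rises):
        in_window = [t for t in rises[i:] if t - start <= window]
        if len(in_window) > max_increases:
            return True
    return False


day = datetime(2008, 9, 1)
# Illustrative readings only: a stable disk and one that keeps getting worse.
stable = [(day, 21), (day + timedelta(days=7), 21), (day + timedelta(days=14), 21)]
worsening = [(day, 21), (day + timedelta(days=1), 25),
             (day + timedelta(days=3), 30), (day + timedelta(days=5), 41)]

print(frequent_bursts(stable))     # False
print(frequent_bursts(worsening))  # True
```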

BTW, most warranty replacements are reconditioned drives that have
nothing more than diagnostics run and bad sectors reassigned. As long as
total capacity still meets advertised and the mechanics/electrics and
media (high %) are still good, they'll ship them.

 
 Thanks.
 
 mhr
 snip sig stuff

HTH
-- 
Bill



Re: [CentOS] Alarming (?) smartd reports

2008-09-11 Thread MHR
On Thu, Sep 11, 2008 at 2:05 PM, William L. Maltby
[EMAIL PROTECTED] wrote:


 IIRC, the seatools just run the smart tools that come on CentOS/Linux.
 Not the same as those on the DOS tools version. It's been several
 months, but barring memory failures (mine, not the computer's  ;-) I
 ended up downloading the DOS ones so that I could do the repair and run
 the real magilla.


I'm not so sure about that, but I'd have to check.  It was the
Seatools program, not smartctl (at least not directly).
And it's megilla, ya goysiher kopf!

 Try the DOS version. I bet the lack of that support is in the standard
 *IX smart tools, not the drive.


I don't think so - it only commented on these from the Seagate, not
the WD, and it explicitly states that the DST is not supported on the
drive (although that is /just/ ambiguous enough...).

 If all drives left the factory in great shape, it is natural that the
 older ones would show an error first. Often just a weak spot or two
 that passed mfg tests and finally failed as they aged. That's why I
 don't worry about them (I don't have data center servers to the world
 here at home) as long as the repair utilities run successfully and then
 no more show up for a long time. If they start coming in frequent
 bursts, then it's time to act.


Well, yeah, of course, but why would my Max 160 be error free and the
Seagate have 4 billion when the latter is (a year or so) newer?
(Rhetorical question!)

 BTW, most warranty replacements are reconditioned drives that have
 nothing more than diagnostics run and bad sectors reassigned. As long as
 total capacity still meets advertised and the mechanics/electrics and
 media (high %) are still good, they'll ship them.


I've noticed that - really annoying, but then, what're ya gonna do
when there's no will to enact laws requiring manufacturers to provide
quality products to begin with, and then replace them appropriately
under warranty?

Ciao.

mhr


[CentOS] Alarming (?) smartd reports

2008-09-10 Thread MHR
I decided, after the last discussion of smartd and S.M.A.R.T. disks,
to take a look in my /var/log/messages, and I'm seeing a fair bit of
this:

Sep 10 20:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Offline uncorrectable sectors
Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Currently unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Offline uncorrectable sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
unreadable (pending) sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Currently unreadable (pending) sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Offline uncorrectable sectors

Clearly there is a minor problem on /dev/hdb, which doesn't really
surprise me, nor is it particularly worrisome (because I don't use
that drive much).

However, the other one I find more than a little curious.

/dev/sda is a Seagate 300GB SATA drive that's coming up on two years
old next month, but the number of Currently unreadable (pending)
sectors or Offline uncorrectable sectors, depending on which one
you believe, is interesting - 4294967295 is 0xFFFFFFFF in hex, and I'm
running a 64-bit machine.
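
For anyone wanting to pull the same numbers out of their own logs, here is a minimal sketch that summarizes smartd entries of the two kinds quoted above and flags counts equal to 0xFFFFFFFF. The regular expression is written against the message wording shown in this post only; other smartd versions may phrase things differently, and the function names are invented for the example.

```python
import re

# Matches the two message kinds quoted in this post; wording is taken from
# the log excerpt above and may differ on other smartd versions.
ENTRY = re.compile(
    r"Device: (?P<dev>\S+), (?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)
ALL_ONES_32 = 0xFFFFFFFF


def summarize(log_text):
    """Map (device, kind) to the most recent count seen in the log text."""
    flat = " ".join(log_text.split())  # undo line wrapping from mail clients
    latest = {}
    for m in ENTRY.finditer(flat):
        latest[(m["dev"], m["kind"])] = int(m["count"])
    return latest


def suspicious(summary):
    """Devices whose count is the all-ones 32-bit value, not a likely tally."""
    return sorted({dev for (dev, _), n in summary.items() if n == ALL_ONES_32})


SAMPLE = """
Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Currently unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
Offline uncorrectable sectors
"""

summary = summarize(SAMPLE)
for (dev, kind), n in sorted(summary.items()):
    print(dev, kind, n)
print(suspicious(summary))
```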

Google is not particularly informative on this subject - anyone know
more than general suggestions about dd, badblocks, etc.?  This is my
boot and primary system disk (has been for some time), but the error
message is essentially meaningless (to me, right now).

Thanks.

mhr

PS: In the last couple of months there was a discussion of how to make
the disk less active, starting with someone reporting that their disk
drive activity light blinked every 30 seconds or something like that.
I tried to find it again, but I couldn't pin down what to search for -
what was the solution?


Re: [CentOS] Alarming (?) smartd reports

2008-09-10 Thread nate
MHR wrote:

 Google is not particularly informative on this subject - anyone know
 more than general suggestions about dd, badblocks, etc.?  This is my
 boot and primary system disk (has been for some time), but the error
 message is essentially meaningless (to me, right now).

Download the manufacturer's tools and run a diagnostics on it,
it will tell you the truth about what's going on.

I wouldn't trust any generic OS tools over the manufacturer's tools,
there was a discussion on this topic on this list I think not too
long ago. The biggest gotcha with the vendor tools though is
they are usually limited in the types of disk controllers they
support.

nate
