On 01/06/12 17:26, Timothy Musson wrote:
Hi all,
I'm maintaining a computer running Ubuntu 10.04 LTS for my grandfather.
The computer's just over a year old, and hasn't had any trouble until
a couple of days ago when my grandfather told me it wouldn't boot.
So, checking it out...
The first time I tried to start it up, it seemed to hang just after
loading grub (...black screen apart from a flashing cursor at top
left). The hard drive light was not active at all. Minutes passed.
Nothing happened.
The second time, Ubuntu started up - but slowly (say, five minutes
from power-on to gdm). Once running all seemed fine - except for the
following messages in /var/log/messages:
May 30 23:51:09 hobart kernel: [ 340.299182] sr 1:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
May 30 23:51:09 hobart kernel: [ 340.309401] ata2: soft resetting link
May 30 23:51:10 hobart kernel: [ 340.488255] ata2.00: configured for UDMA/66
May 30 23:51:10 hobart kernel: [ 340.756317] ata2.00: TEST_UNIT_READY failed (err_mask=0x2)
May 30 23:51:15 hobart kernel: [ 345.464026] ata2: soft resetting link
May 30 23:51:15 hobart kernel: [ 345.644250] ata2.00: configured for UDMA/66
May 30 23:51:17 hobart kernel: [ 348.000352] ata2.00: TEST_UNIT_READY failed (err_mask=0x2)
May 30 23:51:17 hobart kernel: [ 348.000359] ata2.00: limiting speed to UDMA/66:PIO3
May 30 23:51:20 hobart kernel: [ 350.620020] ata2: soft resetting link
May 30 23:51:25 hobart kernel: [ 355.776150] ata2.00: qc timeout (cmd 0xa1)
May 30 23:51:25 hobart kernel: [ 355.776159] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
May 30 23:51:25 hobart kernel: [ 355.779671] ata2.00: disabled
May 30 23:51:30 hobart kernel: [ 360.816022] ata2: link is slow to respond, please be patient (ready=0)
May 30 23:51:35 hobart kernel: [ 365.800022] ata2: device not ready (errno=-16), forcing hardreset
May 30 23:51:35 hobart kernel: [ 365.800032] ata2: soft resetting link
May 30 23:51:35 hobart kernel: [ 365.956188] ata2: EH complete
That sequence of messages repeated over and over.
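
A repeating sequence like this is easier to judge once it is tallied per port. The following is only a rough sketch: it assumes each kernel message sits on one line, as it would in /var/log/messages itself, and the function name summarise and the list of "failure" keywords are illustrative choices, not anything from the kernel or from this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   May 30 23:51:09 hobart kernel: [ 340.309401] ata2: soft resetting link
LINE_RE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"(?P<dev>ata\d+(?:\.\d+)?):\s+(?P<msg>.*)"
)

# Assumed keywords that mark a message as a failure (illustrative only).
FAILURE_WORDS = ("failed", "timeout", "disabled", "not ready")


def summarise(lines):
    """Count messages per ATA port and collect the failure messages."""
    counts = Counter()
    failures = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an ataN message, e.g. the "sr 1:0:0:0" line
        port = m.group("dev").split(".")[0]  # "ata2.00" -> "ata2"
        counts[port] += 1
        msg = m.group("msg")
        if any(word in msg for word in FAILURE_WORDS):
            failures.append((float(m.group("uptime")), port, msg))
    return counts, failures


sample = [
    "May 30 23:51:09 hobart kernel: [ 340.309401] ata2: soft resetting link",
    "May 30 23:51:10 hobart kernel: [ 340.488255] ata2.00: configured for UDMA/66",
    "May 30 23:51:25 hobart kernel: [ 355.776150] ata2.00: qc timeout (cmd 0xa1)",
    "May 30 23:51:25 hobart kernel: [ 355.779671] ata2.00: disabled",
]
counts, failures = summarise(sample)
print(counts)
for uptime, port, msg in failures:
    print(uptime, port, msg)
```

Feeding it the whole log rather than a four-line sample would show whether every error comes from the same port and how often the cycle repeats.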
The third time I rebooted, all was fine, the OS loaded as quickly as
ever, and there was nothing weird being written to /var/log/messages.
What I've tried:
Via Ubuntu's "Disk Utility", I've tried the "SMART" self tests on the
dodgy drive. It found nothing wrong and calls the drive "healthy".
I've run memtest86+ for 24 hours, and the computer's memory (2 GB) seems fine.
The drive seems to be behaving itself now, but I'm not sure I trust it.
What do you think? Is this what happens when a hard drive begins to fail?
(We do have daily backups on an external drive, so I'm not too
worried about disaster at this point.)
Remember that for a disk to actually report an error, countless levels
of internal buffering and error checking have already failed.
As soon as you get any kind of error reported from any HDD, make a disk
copy (dd) if possible, and retire the drive. Use the copy to restore
content from the disk, which is going to fail real soon now!
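
To illustrate what "make a disk copy" involves when some sectors may be unreadable, here is a minimal sketch of a block-by-block copy that zero-fills any block it cannot read and carries on, roughly the idea behind running dd with conv=noerror,sync. The function name rescue_copy, the 64 KiB block size and the file names are assumptions made for the example. On a real machine the source would be the drive's device node, read as root; the demo uses an ordinary file instead.

```python
import os
import tempfile


def rescue_copy(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst block by block, zero-filling unreadable blocks.

    Returns the byte offsets at which a read error occurred.
    """
    bad_offsets = []
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        total = src.seek(0, os.SEEK_END)  # size of the file or device
        offset = 0
        while offset < total:
            want = min(block_size, total - offset)
            try:
                src.seek(offset)
                data = src.read(want)
            except OSError:
                # Unreadable block: note it and write zeros in its place
                # so later data stays at the correct offset.
                bad_offsets.append(offset)
                data = b"\x00" * want
            if not data:
                break  # unexpected end of input
            dst.write(data)
            offset += len(data)
    return bad_offsets


# Demo on an ordinary file standing in for the failing drive.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "source.img")
    dst_path = os.path.join(tmp, "copy.img")
    payload = os.urandom(200_000)
    with open(src_path, "wb") as f:
        f.write(payload)
    bad = rescue_copy(src_path, dst_path)
    with open(dst_path, "rb") as f:
        copied = f.read()
print(len(copied), bad)
```

The list of bad offsets tells you which parts of the image are filler rather than real data, which is worth knowing before restoring from it.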
Steve.