On Tue, 28 Aug 2012 16:15:33 +0200, Merciadri Luca wrote:

> I'm recurrently getting freezes because of HDD problems. During these
> freezes, that generally last until I shut down the computer, I get such
> messages:
>
> ==
> smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen,
> http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Model Family: Maxtor DiamondMax Plus 9 family
> Device Model: Maxtor 6Y160M0
(...)

Do you hear any "clicking" sound coming from the hard disk?

Anyway, if my memory serves me well, that hard disk model has to be at
least 8 years old...

> Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000030] ata6.00:
> exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
> Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000035] ata6: SError: {
> UnrecovData Handshk }
> Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000038] ata6.00: failed
> command: WRITE DMA EXT
(...)
> After restarting, I got messages such as
>
> ==
> Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816026] ata4.00:
> exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
> Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816031] ata4: SError: {
> UnrecovData Handshk }
> Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816035] ata4.00: failed
> command: WRITE DMA
> Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816040] ata4.00: cmd
> ca/00:90:08:71:05/00:00:00:00:00/e0 tag 0 dma 73728 out
(...)
> and also
>
> ==
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572574] sd 3:0:0:0:
> [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572578] sd 3:0:0:0:
> [sdc] Sense Key : Aborted Command [current] [descriptor]
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572582] Descriptor sense
> data with sense descriptors (in hex):
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572584] 72 0b 00
> 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572592] 00 00 00
> 00
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572596] sd 3:0:0:0:
> [sdc] Add. Sense: No additional sense information
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572600] sd 3:0:0:0:
> [sdc] CDB: Write(10): 2a 00 00 05 83 00 00 03 90 00
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572608] end_request: I/O
> error, dev sdc, sector 361216
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572613] Buffer I/O error
> on device sdc5, logical block 43136
> Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572615] lost page write
> due to I/O error on sdc5
(...)
> It looks like the HDD associated with sdc is encountering some issues.

And more specifically, the "/dev/sdc5" partition.

> But is sdc linked to ata4 or ata6? Do these two problems (before and
> after restarting) are the same ones or not?

Yes, it seems there are two hard disks affected. Run:

  dmesg | grep -i 'ata[0-6]'

> After running several short and long tests with S.M.A.R.T. on each of my
> 3 HDDs, I got these results:
>
> 1) HDD associated with /dev/sda looks in some pre-failure state:
(...)
> SMART Error Log Version: 1
> Warning: ATA error count 454 inconsistent with error log pointer 5

I would run the manufacturer's disk test utility here, but this one looks
a bit tired. You can keep monitoring the values tagged "pre-fail" and
replace the hard disk as soon as they start increasing quickly.

> 2) HDD associated with /dev/sdb verifies
(...)
> (this is the one that looks the healthiest, actually).

Agreed.

> 3) The HDD associated with /dev/sdc, which should be in some way broken
> (being given the messages that I wrote above from /var/log/syslog), does
> not look so through SMART:
(...)

Oh my... consider also running the manufacturer's SMART test utility on
this one... and make a full backup _now_.

> What can I deduce from this? It looks like /dev/sdc is broken but SMART
> tells /dev/sda would have more chance being on the verge to broke than
> /dev/sdc.

I can deduce that the Maxtor hard disks are very old and deserve
retirement, even though they are still up and (somehow) running.

> Note that I tried exchanging SATA cables, to no avail.

In your case the logs show sector errors and I/O errors, and that is
dangerous.

Greetings,

-- 
Camaleón
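
P.S. On the question of whether sdc hangs off ata4 or ata6, here is a
minimal sketch. It assumes a kernel recent enough that the resolved sysfs
path of a disk contains an "ataN" component; older kernels may not expose
it, and then the function simply prints nothing. The function names and
the sample path in the comment are made up for illustration.

```shell
#!/bin/sh

# Print the first path component that looks like a libata port ("ataN").
# Assumption: the kernel puts the port name into the sysfs device path.
ata_port_of_path() {
    printf '%s\n' "$1" | tr '/' '\n' | grep -E '^ata[0-9]+$' | head -n 1
}

# Resolve /sys/block/<disk> to its real device path and extract the port.
ata_port_of_disk() {
    ata_port_of_path "$(readlink -f "/sys/block/$1")"
}

# Usage (on the affected machine):
#   ata_port_of_disk sdc
# A hypothetical resolved path such as
#   /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdc
# would make ata_port_of_path print "ata4".
```

If nothing is printed, fall back to comparing the "ataN" lines in dmesg
with the disk model reported for each port.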