-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,
I keep getting freezes because of HDD problems. During these freezes, which generally last until I shut down the computer, I get messages such as:

==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Maxtor DiamondMax Plus 9 family
Device Model:     Maxtor 6Y160M0
Serial Number:    Y44NQSTE
Firmware Version: YAR51HW0
User Capacity:    163,928,604,672 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Tue Aug 28 16:09:09 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000030] ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000035] ata6: SError: { UnrecovData Handshk }
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000038] ata6.00: failed command: WRITE DMA EXT
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000044] ata6.00: cmd 35/00:80:00:4f:f5/00:01:12:00:00/e0 tag 0 dma 196608 out
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000046] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000049] ata6.00: status: { DRDY }
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000056] ata6: hard resetting link
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.476042] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.597999] ata6.00: configured for UDMA/133
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.598003] ata6.00: device reported invalid CHS sector 0
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.598008] ata6: EH complete
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965242] ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965247] ata6: SError: { UnrecovData Handshk }
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965251] ata6.00: failed command: WRITE DMA EXT
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965257] ata6.00: cmd 35/00:80:00:4f:f5/00:01:12:00:00/e0 tag 0 dma 196608 out
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965258] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965261] ata6.00: status: { DRDY }
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965269] ata6: hard resetting link
Aug 28 10:22:10 merciadriluca-station kernel: [ 2191.440043] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546566] ata6.00: configured for UDMA/133
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546571] ata6.00: device reported invalid CHS sector 0
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546578] ata6: EH complete
==

After restarting, I got messages such as

==
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816026] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816031] ata4: SError: { UnrecovData Handshk }
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816035] ata4.00: failed command: WRITE DMA
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816040] ata4.00: cmd ca/00:90:08:71:05/00:00:00:00:00/e0 tag 0 dma 73728 out
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816042] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816045] ata4.00: status: { DRDY }
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816053] ata4: hard resetting link
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.292041] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411821] ata4.00: configured for UDMA/133
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411826] ata4.00: device reported invalid CHS sector 0
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411831] ata4: EH complete
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780026] ata4: limiting SATA link speed to 1.5 Gbps
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780030] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780034] ata4: SError: { UnrecovData Handshk }
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780038] ata4.00: failed command: WRITE DMA EXT
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780044] ata4.00: cmd 35/00:90:00:83:05/00:03:00:00:00/e0 tag 0 dma 466944 out
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780045] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780048] ata4.00: status: { DRDY }
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780056] ata4: hard resetting link
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.256538] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382089] ata4.00: configured for UDMA/133
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382093] ata4.00: device reported invalid CHS sector 0
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382098] ata4: EH complete
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380023] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400101 action 0x6 frozen
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380028] ata4: SError: { RecovData UnrecovData Handshk }
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380032] ata4.00: failed command: WRITE DMA EXT
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380038] ata4.00: cmd 35/00:90:00:83:05/00:03:00:00:00/e0 tag 0 dma 466944 out
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380039] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380042] ata4.00: status: { DRDY }
==

and also

==
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572574] sd 3:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572578] sd 3:0:0:0: [sdc] Sense Key : Aborted Command [current] [descriptor]
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572582] Descriptor sense data with sense descriptors (in hex):
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572584] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572592] 00 00 00 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572596] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572600] sd 3:0:0:0: [sdc] CDB: Write(10): 2a 00 00 05 83 00 00 03 90 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572608] end_request: I/O error, dev sdc, sector 361216
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572613] Buffer I/O error on device sdc5, logical block 43136
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572615] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572622] Buffer I/O error on device sdc5, logical block 43137
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572625] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572629] Buffer I/O error on device sdc5, logical block 43138
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572631] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572636] Buffer I/O error on device sdc5, logical block 43139
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572638] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572642] Buffer I/O error on device sdc5, logical block 43140
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572644] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572648] Buffer I/O error on device sdc5, logical block 43141
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572651] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572655] Buffer I/O error on device sdc5, logical block 43142
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572657] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572661] Buffer I/O error on device sdc5, logical block 43143
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572663] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572667] Buffer I/O error on device sdc5, logical block 43144
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572669] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572674] Buffer I/O error on device sdc5, logical block 43145
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572676] lost page write due to I/O error on sdc5
==

It looks like the HDD associated with sdc is encountering some issues. But is sdc linked to ata4 or ata6? Are these two problems (before and after restarting) the same or not?

After running several short and long tests with S.M.A.R.T. on each of my 3 HDDs, I got these results:

1) The HDD associated with /dev/sda looks to be in a pre-failure state:

==
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   203   202   063    Pre-fail  Always       -       19440
  4 Start_Stop_Count        0x0032   252   252   000    Old_age   Always       -       3294
  5 Reallocated_Sector_Ct   0x0033   252   252   063    Pre-fail  Always       -       17
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   252   237   187    Pre-fail  Always       -       46578
  9 Power_On_Minutes        0x0032   172   172   000    Old_age   Always       -       1007h+24m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   245   245   000    Old_age   Always       -       3314
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always       -       56
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       8324
196 Reallocated_Event_Count 0x0008   238   238   000    Old_age   Offline      -       15
197 Current_Pending_Sector  0x0008   252   252   000    Old_age   Offline      -       15
198 Offline_Uncorrectable   0x0008   237   001   000    Old_age   Offline      -       16
199 UDMA_CRC_Error_Count    0x0008   195   194   000    Old_age   Offline      -       5
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       0
202 Data_Address_Mark_Errs  0x000a   253   226   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       8
204 Soft_ECC_Correction     0x000a   253   251   000    Old_age   Always       -       0
205 Thermal_Asperity_Rate   0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   194   189   000    Old_age   Offline      -       0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
100 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
101 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
Warning: ATA error count 454 inconsistent with error log pointer 5

ATA Error Count: 454 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 454 occurred at disk power-on lifetime: 14837 hours (618 days + 5 hours)
  When the command that caused the error occurred, the device was in an unknown state.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 24 81 02 32 e0  Error: UNC 36 sectors at LBA = 0x00320281 = 3277441

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 d0 00 81 02 32 e0 00      02:36:40.624  READ DMA EXT
  25 d0 d2 af 01 32 e0 00      02:36:40.624  READ DMA EXT
  25 d0 2e 81 e0 31 e0 00      02:36:40.624  READ DMA EXT
  25 d0 00 81 df 31 e0 00      02:36:40.608  READ DMA EXT
  25 d0 d2 af de 31 e0 00      02:36:40.608  READ DMA EXT

Error 453 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 52 27 0f e0  Error: UNC at LBA = 0x000f2752 = 993106

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d0 01 52 27 0f e0 00      03:46:51.472  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:51.472  READ DMA EXT
  42 d0 01 51 27 0f e0 00      03:46:50.464  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:50.448  READ DMA EXT
  42 d0 02 51 27 0f e0 00      03:46:49.440  READ VERIFY SECTOR(S) EXT

Error 452 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 51 27 0f e0  Error: UNC at LBA = 0x000f2751 = 993105

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d0 01 51 27 0f e0 00      03:46:50.464  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:50.448  READ DMA EXT
  42 d0 02 51 27 0f e0 00      03:46:49.440  READ VERIFY SECTOR(S) EXT
  42 d0 02 4f 27 0f e0 00      03:46:49.440  READ VERIFY SECTOR(S) EXT
  42 d0 04 53 27 0f e0 00      03:46:48.640  READ VERIFY SECTOR(S) EXT

Error 451 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 02 51 27 0f e0  Error: UNC at LBA = 0x000f2751 = 993105

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d0 02 51 27 0f e0 00      03:46:49.440  READ VERIFY SECTOR(S) EXT
  42 d0 02 4f 27 0f e0 00      03:46:49.440  READ VERIFY SECTOR(S) EXT
  42 d0 04 53 27 0f e0 00      03:46:48.640  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:48.624  READ DMA EXT
  42 d0 04 4f 27 0f e0 00      03:46:47.616  READ VERIFY SECTOR(S) EXT

Error 450 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 02 4f 27 0f e0  Error: UNC at LBA = 0x000f274f = 993103

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  42 d0 04 4f 27 0f e0 00      03:46:47.616  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:47.616  READ DMA EXT
  42 d0 08 57 27 0f e0 00      03:46:47.600  READ VERIFY SECTOR(S) EXT
  25 d0 01 00 00 00 e0 00      03:46:47.600  READ DMA EXT
  42 d0 08 4f 27 0f e0 00      03:46:46.576  READ VERIFY SECTOR(S) EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       10%     26543         319759751
# 2  Short offline       Completed: read failure       60%     26542         319759751

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==

The short offline test stops at 40% completed and the extended offline test at 90% completed; the LBA of the first error is 319759751 in both cases.
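The LBA that smartctl prints next to each UNC error above can be recomputed from the register dump. A minimal sketch in Python, assuming the classic 28-bit layout (low nibble of DH as the top four bits, then CH, CL, SN), which is consistent with the values printed in this log; the function name is only for illustration:

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the ATA SN/CL/CH/DH registers into a 28-bit LBA.

    Assumption: the low nibble of DH carries LBA bits 27..24,
    CH bits 23..16, CL bits 15..8 and SN bits 7..0.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register line of error 454 as shown by smartctl: ER ST SC SN CL CH DH
er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in "40 51 24 81 02 32 e0".split())

lba = lba28_from_registers(sn, cl, ch, dh)
print(f"LBA = 0x{lba:08x} = {lba}, sector count = {sc}")
```

For error 454 this gives the same 0x00320281 = 3277441 and 36 sectors as the smartctl line, and the registers of error 453 give 993106. Both are far from the self-test failure at LBA 319759751, so the unreadable sectors on /dev/sda do not seem to be confined to one area.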
2) The HDD associated with /dev/sdb gives:

==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10 family
Device Model:     ST3320620AS
Serial Number:    9QFAYRCP
Firmware Version: 3.AAG
User Capacity:    320,072,933,376 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Aug 28 16:11:54 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1753
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always       -       355938474
  9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       15739
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1745
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   053   048   045    Old_age   Always       -       47 (Lifetime Min/Max 47/48)
194 Temperature_Celsius     0x0022   047   052   000    Old_age   Always       -       47 (0 20 0 0)
195 Hardware_ECC_Recovered  0x001a   065   055   000    Old_age   Always       -       1306602
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==

(This is actually the one that looks the healthiest.)

3) The HDD associated with /dev/sdc, which should be broken in some way (given the messages from /var/log/syslog quoted above), does not look so according to SMART:

==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Maxtor DiamondMax 21
Device Model:     MAXTOR STM3320820AS
Serial Number:    5QF2T6W6
Firmware Version: 3.AAE
User Capacity:    320,072,933,376 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Aug 28 16:12:32 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   092   085   006    Pre-fail  Always       -       63613073
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   098   098   020    Old_age   Always       -       2362
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       574383816
  9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       18552
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   020    Old_age   Always       -       2386
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   054   046   045    Old_age   Always       -       46 (Lifetime Min/Max 45/47)
194 Temperature_Celsius     0x0022   046   054   000    Old_age   Always       -       46 (0 12 0 0)
195 Hardware_ECC_Recovered  0x001a   065   052   000    Old_age   Always       -       222324542
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       2
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     18551         -
# 2  Extended offline    Completed without error       00%     18493         -
# 3  Short offline       Completed without error       00%     18492         -
# 4  Short offline       Completed without error       00%     13106         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==

What can I deduce from this? It looks like /dev/sdc is broken, but SMART suggests that /dev/sda is more likely to be on the verge of failing than /dev/sdc.

Note that I tried exchanging SATA cables, to no avail.

All the best,

- -- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- -- 

It's the early bird that gets the worm.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAlA80oQACgkQM0LLzLt8MhwUGgCbB9WOOBb3vHlorBnymavWCvmY
aBkAnRbCcc2WZK+AXQTcwqKTGyt0ph/b
=OzHm
-----END PGP SIGNATURE-----

Archive: http://lists.debian.org/87ligz5chm.fsf@merciadriluca-station.MERCIADRILUCA
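On the question of whether sdc sits behind ata4 or ata6, the "cmd" field of the libata messages can be decoded and compared with the sector reported for sdc. A rough sketch in Python, assuming the field layout that libata normally prints (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the function name and the small opcode table are illustrative only, not part of any tool:

```python
# Opcodes treated as 48-bit (EXT) commands in this sketch:
# READ SECTOR(S) EXT, READ DMA EXT, WRITE SECTOR(S) EXT,
# WRITE DMA EXT, READ VERIFY SECTOR(S) EXT.
LBA48_OPCODES = {0x24, 0x25, 0x34, 0x35, 0x42}


def decode_libata_cmd(field):
    """Return (opcode, lba, sector_count) for a libata 'cmd' field."""
    cmd_hex, current, previous, dev_hex = field.split("/")
    opcode = int(cmd_hex, 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in current.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in previous.split(":")
    )
    device = int(dev_hex, 16)

    if opcode in LBA48_OPCODES:
        # 48-bit addressing: the "hob" bytes are the high-order halves.
        lba = (
            (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
            | (lbah << 16) | (lbam << 8) | lbal
        )
        count = ((hob_nsect << 8) | nsect) or 65536  # 0 means 65536 sectors
    else:
        # 28-bit addressing: the top four LBA bits live in the device register.
        lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
        count = nsect or 256  # 0 means 256 sectors
    return opcode, lba, count


# The two distinct failed writes logged for ata4 after the restart.
for field in (
    "35/00:90:00:83:05/00:03:00:00:00/e0",
    "ca/00:90:08:71:05/00:00:00:00:00/e0",
):
    opcode, lba, count = decode_libata_cmd(field)
    print(f"opcode=0x{opcode:02x} lba={lba} sectors={count} bytes={count * 512}")
```

The first command decodes to LBA 361216 and 912 sectors (466944 bytes, matching "dma 466944 out"), which is the same start sector and length as the failed Write(10) on sdc ("sector 361216", CDB length 0x0390). That suggests, without proving it, that after the restart sdc was the disk on ata4. It says nothing about which disk was on ata6 before the restart.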