Re: 7.2-RELEASE-p4, IO errors RAID1 failure
Hi Matthew,

> I'm running 7.2-RELEASE-p4 on an i386 HP server (ML G5) in RAID1 configuration. Very recently, I've seen IO errors such as:
>
> ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=20472527
>
> reported and the RAID mirror is now offline.
>
> ad0: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=395032335
> ad0: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=395032335
> ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode

I had more or less the same timeout issues on my 8.0-RELEASE box on a Dell R300 with SATA disks. What I did was raise the ata timeout from 5 seconds to 20. I did this by patching the kernel code while running, but I'm not sure you'd like that approach ;)

In http://www.freebsd.org/cgi/query-pr.cgi?pr=111023 a patch is presented that raises the timeouts by patching a few ATA kernel source files. This has been committed to RELENG_7 as well, so by upgrading your 7.2 install to the latest RELENG_7 (or RELENG_8), you'll have that timeout fix.

Why ATA commands can take longer than 5 seconds although the disks appear to be fine, I wouldn't know.

-- Pieter

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
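When judging whether a timeout patch helps, it can be useful to count how often each device logs a TIMEOUT or FAILURE. The following is only a minimal sketch in Python, not part of any FreeBSD tool: the regular expression is an assumption based solely on the message shapes quoted in this thread, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed message shape, taken from the lines quoted in this thread, e.g.
# "ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=20472527"
ATA_EVENT = re.compile(
    r"\b(?P<dev>ad\d+): (?P<kind>TIMEOUT|FAILURE) - (?P<cmd>\w+).*?LBA=(?P<lba>\d+)"
)

def summarize(log_lines):
    """Count TIMEOUT/FAILURE events per (device, kind, command)."""
    counts = Counter()
    for line in log_lines:
        match = ATA_EVENT.search(line)
        if match:
            counts[(match["dev"], match["kind"], match["cmd"])] += 1
    return counts

# Sample input: the kernel messages quoted above.
sample = [
    "ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=20472527",
    "ad0: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=395032335",
    "ad0: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=395032335",
    "ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode",
]

for key, count in sorted(summarize(sample).items()):
    print(key, count)
```

Lines that do not match the assumed shape (such as the ar0 warning) are simply skipped, so the same function could be fed a whole kernel log.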
Re: Read / write timeouts on SATA disks connected to ICH9
Hi Jeremy,

SNIP: both old disks were fine

> Anyway, if heavy disk/controller load appears to be causing these problems, you could have power-related issues. Possibly the combination of two disks + heavy I/O causes enough power draw that the ICH9 starts to behave oddly. Voltages which deviate too much can cause odd things to happen to hardware.
>
> If you have the time/money, you might try replacing the PSU in your system to see if there's any improvement; your BIOS should be able to provide you with Hardware Monitoring statistics (voltages). Write these down before and after the PSU swap. You don't need to go crazy and buy a 1000W PSU or anything, but 450-750W is pretty normal these days.

As this is a 19" 1U box, I'd need to buy a replacement PSU from Dell or a reseller. Not too expensive, but I'd like to avoid that.

While looking through the CVSweb of RELENG_8, I found that ATA timeouts have been raised in 8 recently. On http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting and other URLs, like http://linux-bsd-sharing.blogspot.com/2009/03/howto-fix-sata-dma-timeout-issues-on.html, I found that increasing the timeout might help. So that's what I'll try next time it happens. If that still doesn't work, I can take a better look at the voltage levels.

-- Pieter
Re: Read / write timeouts on SATA disks connected to ICH9
Hi Jeremy,

> Lots to say about all of this.

Thanks for your elaborate reply, it was very useful to see smartctl output explained a bit :) I still think there's something else in play besides disk failure. I've checked one of the drives I replaced earlier, but that one doesn't have any of the errors in its SMART output you described, although it did drop out of the mirror multiple times during its lifetime.

> The WD Caviar Black drives have a useful feature called TLER -- it's disabled by default, for reasons which I don't want to get into here -- which can force the drive to internally give up after X seconds (it's user-selectable) when dealing with such remapping/errors. The idea is to keep the drive from being deemed dead from the OS/controller's point of view. I believe Seagate, Hitachi, or Samsung (I forget which) have this feature as well, but it's not called TLER.

I've read about this feature, but didn't have the time to try to get it turned on (iirc you'd need a specific Western Digital DOS-based util or something).

> If you want to find out the exact LBA that has the problem (there may be more than one), I can step you through performing a selective LBA scan using SMART, since this model of disk does support such. It's easy to do, easy to understand the results, and can be done while the drive is in operation (though I would recommend trying to keep disk I/O to a minimum during this test). Let me know.

At a certain point in time I had read errors from specific LBAs on ad4. Using dd I was able to pinpoint those to single sectors. Overwriting those sectors with what was on ad6 made them readable again. What is odd is that the 'remapped sector' count of ad4 is 0. Still, I'd like to know how to perform such a scan.

> Finally, your vmstat -i output:
>
> # vmstat -i
> interrupt                          total       rate
> irq23: atapci0                 371021299      10423
>
> Good to know there's no IRQ sharing going on, but what does worry me is the interrupt rate (10K interrupts/second). That seems *extremely* high, but it also depends on what kind of disk I/O is happening on this system -- especially since you have 2 disks attached to the same controller.

The rate is higher than 1 also at idle. During a gmirror sync from ad6 to ad4, it's about 10670.

> iostat 1, iostat -x 1, or gstat might come in handy to tell you what kind of disk I/O is going on. If actual I/O is very little, then something weird is going on with regards to the number of interrupts being seen on IRQ 23. mav@ might have some ideas, otherwise I'd recommend rebooting the machine and seeing if the number drops. If so, it may be that the OS has some sort of bug where a disk timing out or falling off the bus causes interrupt problems. (It's too bad you don't have AHCI on this system. It handles stuff like this much more elegantly...)

If mav@ or anyone else doesn't have another insight into the interrupt rate, I guess a reboot will at least show if it's persistent or related to the errors. I'll try to do a reboot when convenient (probably Sunday morning or something).

Thanks, Pieter
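The rate column discussed above appears to be an average since boot, so it says little about what the controller is doing right now. A rough way to get a current figure is to take two `vmstat -i` snapshots and divide the difference in totals by the interval. The Python sketch below illustrates this under stated assumptions: `parse_vmstat_i` and `current_rate` are hypothetical helper names, the "after" snapshot is invented sample input (not a measurement from this system), and the implied-uptime figure assumes the rate column is simply total divided by uptime.

```python
def parse_vmstat_i(text):
    """Return {interrupt_name: total} from `vmstat -i` style output."""
    totals = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows end in two integers: "irq23: atapci0  371021299  10423"
        if len(parts) >= 3 and parts[-1].isdigit() and parts[-2].isdigit():
            totals[" ".join(parts[:-2])] = int(parts[-2])
    return totals

def current_rate(before, after, seconds):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    return {name: (after[name] - before[name]) / seconds
            for name in after if name in before}

# First snapshot: the figures quoted in this thread.
before_text = """interrupt                          total       rate
irq23: atapci0                 371021299      10423
"""
# Second snapshot: HYPOTHETICAL values, assumed to be taken 10 seconds later.
after_text = """interrupt                          total       rate
irq23: atapci0                 371036299      10423
"""

before = parse_vmstat_i(before_text)
after = parse_vmstat_i(after_text)
print(current_rate(before, after, 10))

# Assumption: rate = total / uptime, so total / rate approximates uptime in seconds.
print(round(371021299 / 10423 / 3600, 1), "hours of implied uptime")
```

If the since-boot average and the interval-based rate differ widely, the high average may stem from a burst earlier in the uptime rather than from ongoing activity.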
Re: Read / write timeouts on SATA disks connected to ICH9
Hi Terry,

> I have a bunch of R300's here. From one that is using the on-board SATA and 2 drives in a gmirror setup (very similar to the OP) after 18 hours of uptime:
>
> [0:2] speedtest:~ vmstat -i
> interrupt                          total       rate
> irq23: atapci0                    254116          3

Interesting. Which version of FreeBSD is this system running? I guess you didn't experience any of the timeouts I'm seeing?

> I also have another R300 with Dell's SAS 6/iR card (a re-branded LSI 1068-something, seen as mpt by FreeBSD). While Dell only sells that as part of a package deal with the hot-swap backplane and redundant power supplies, there's no reason you couldn't pick one up on eBay and add it yourself. You'll need some sort of breakout cable to get from the big connector on the SAS 6 to individual SATA ports.

Yeah, this R300 was bought second-hand and unfortunately the owner pulled the RAID card out. It's something to consider, getting one of those cards. Do you use the RAID-features of the drive and if so, does that work well? I'm a bit hesitant to use hardware RAID; it would be a big plus if the RAID disks could also be used stand-alone if need be (which is easy with gmirror because of its metadata being stored in the drive's last sector).

Thanks, Pieter
Re: Read / write timeouts on SATA disks connected to ICH9
Hi there,

> what kind of disk I/O is going on. If actual I/O is very little, then something weird is going on with regards to the number of interrupts being seen on IRQ 23. mav@ might have some ideas, otherwise I'd recommend rebooting the machine and seeing if the number drops. If so, it may be that the OS has some sort of bug where a disk timing out or falling off the bus causes interrupt problems. (It's too bad you don't have AHCI on this system. It handles stuff like this much more elegantly...)

Well, due to a UFS snapshot panic the box was rebooted, and now I only see around 1500 interrupts per second while syncing the mirror.

-- Pieter
Re: Read / write timeouts on SATA disks connected to ICH9
Hi,

SNIP: disk without errors timing out

> That could be caused by a multitude of other known things. For example, some Western Digital Green drives (including the Enterprise class ones) are known to perform head parking/offloading excessively, which could result in the drive spending more time doing that than actually serving overall I/O requests. There are some other reports of Samsung Spinpoint drives experiencing other issues (I've since forgotten and would have to dig up the threads). If you could provide full SMART stats for that drive, it might help.

Attached the SMART output of both disks I replaced about a month ago. It appears I replaced perfectly fine drives with the current disks with errors ;( One of the old disks is in a USB-enclosure now, so 'da0'.

SNIP: enabling TLER

> Yes, it's a DOS-based utility (like most firmware upgraders these days). I can provide it if you'd like. I've been meaning to spend some time trying to reverse-engineer the binary to figure out what ATA commands it sends to the disk to toggle/adjust the feature (so that one could do it in real-time rather than have to boot into DOS).

I'd like to try that tool. Since the old WD disks are now lying around at home, I have some time to get a DOS boot working to try it out. A FreeBSD implementation of the WD tool and possibly other brands would be really useful indeed.

>> At a certain point in time I had read errors from specific LBA's on ad4. Using dd I was able to pinpoint those to single sectors.
>
> This isn't very effective (dd will read large chunks/amounts of data (read: multiple LBAs) from the underlying disk at once, rather than the disk itself performing a per-LBA test). My opinion is that the dd method should only be used on drives which don't support selective LBA scanning via SMART.

Will dd read multiple LBAs even when using 'bs=512'? The process I used was reading using bs=8192, then zooming in on the LBAs mentioned in the errors in dmesg with bs=512 to find the actual LBA.

A selective scan on ad4 did not reveal any errors today: it 'completed without error'. On ad6 it's a whole lot slower; at the time of writing it's at 2/3.

> All HD vendors have their own quirks/ordeals right now. You basically just have to go with one who works well for you, then if things start going downhill, switch to another. None of them are perfect.

I figured as much. What irritates me though is that I've had consistent problems with 4 disks in this specific system, but no (such) issues with any other disk in other systems I've had. I generally replace disks when I grow out of them, not because they break down.

> What this indicates to me is that if a disk falls off the bus on an ICH9 controller in Enhanced (non-AHCI) mode, FreeBSD starts seeing an absurd number of interrupts generated from the ICH9. My guess is FreeBSD isn't doing something correctly with the controller when this happens; maybe certain commands aren't being sent back to the controller or handling of certain events is being done improperly when it comes to ICH9 (or possibly earlier ICH revisions too). This should be *very* easy to reproduce.

Unfortunately I'm not really in a position to help reproduce this or test possible fixes; downtime is currently very unwelcome. Although one of the previous disks indeed fell off the bus entirely (couldn't get it back with atacontrol either), that hasn't happened again so far. I only see timeouts (and a few days ago read errors on ad4) which gmirror doesn't like. I guess those aren't that simple to reproduce (apart from on my system ;).

> If you see any of your disks on the ICH9 controller fall off the bus or report ATA errors (doesn't matter what kind), please make note of the timestamp (should be in the kernel log), and ASAP run smartctl -a on the disk. You should compare attributes before and after the event. You might also want to consider using smartd, which can log SMART attribute changes on its own. Note that you might have to tune the arguments in smartd.conf to ignore some attributes which fluctuate naturally (such as drive temperature and seek error rate).

I've configured smartd to poll both disks every 5 minutes. I -think- the issues happen specifically under load: the periodic scripts of the host and its 4 jails appear to trigger it sometimes. At that time I'm normally trying to get some sleep, so smartd will have to do for now. Although I'll run a smartctl -a asap anyway.

-- Pieter
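The bs=8192 then bs=512 zooming described above comes down to simple block arithmetic. The Python sketch below only illustrates that arithmetic and generates a dd command line as text; it does not touch any disk. It assumes 512-byte sectors and a whole-disk device node (so that dd's skip count starts at LBA 0); the helper names are hypothetical, and the example LBA is the one from the ad6 READ_DMA48 timeout reported in this thread.

```python
SECTOR = 512  # assumed logical sector size in bytes

def coarse_block(lba, bs=8192):
    """dd 'skip' value (in units of bs) for the block containing this LBA."""
    return (lba * SECTOR) // bs

def sectors_in_block(block, bs=8192):
    """Range of LBAs covered by one bs-sized block."""
    per_block = bs // SECTOR
    first = block * per_block
    return range(first, first + per_block)

def dd_command(device, lba):
    """Command line that reads exactly one sector and discards the data."""
    return f"dd if={device} of=/dev/null bs={SECTOR} skip={lba} count=1"

lba = 975513887                      # LBA from the ad6 timeout message
block = coarse_block(lba)            # which 8192-byte block to read first
candidates = sectors_in_block(block) # the 16 sectors to test one by one

print(block, candidates[0], candidates[-1])
print(dd_command("/dev/ad6", lba))
```

Whether the drive or driver still fetches more than one sector for a bs=512 request is a separate question that this arithmetic does not answer.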
Re: Read / write timeouts on SATA disks connected to ICH9
> Attached the SMART output of both disks I replaced about a month ago. It appears I replaced perfectly fine drives with the current disks with errors ;( One of the old disks is in a USB-enclosure now, so 'da0'.

Let's send those attachments, then.

-- Pieter

smartctl 5.39 2009-12-09 r2995 [FreeBSD 8.0-STABLE i386] (local build)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital RE3 Serial ATA family
Device Model:     WDC WD5002ABYS-18B1B0
Serial Number:    WD-WMASY5474089
Firmware Version: 02.03B03
User Capacity:    500,107,862,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat May 15 21:53:04 2010 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:         (0x82) Offline data collection activity
                                               was completed without error.
                                               Auto Offline Data Collection: Enabled.
Self-test execution status:             (   0) The previous self-test routine completed
                                               without error or no self-test has ever
                                               been run.
Total time to complete Offline
data collection:                        (9480) seconds.
Offline data collection capabilities:   (0x7b) SMART execute Offline immediate.
                                               Auto Offline data collection on/off support.
                                               Suspend Offline collection upon new command.
                                               Offline surface scan supported.
                                               Self-test supported.
                                               Conveyance Self-test supported.
                                               Selective Self-test supported.
SMART capabilities:                   (0x0003) Saves SMART data before entering
                                               power-saving mode.
                                               Supports SMART auto save timer.
Error logging capability:               (0x01) Error logging supported.
                                               General Purpose Logging supported.
Short self-test routine
recommended polling time:               (   2) minutes.
Extended self-test routine
recommended polling time:               ( 112) minutes.
Conveyance self-test routine
recommended polling time:               (   5) minutes.
SCT capabilities:                     (0x303f) SCT Status supported.
                                               SCT Feature Control supported.
                                               SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   179   179   021    Pre-fail  Always       -       4033
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       89
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5536
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       74
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       71
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       89
194 Temperature_Celsius     0x0022   100   094   000    Old_age   Always       -       47
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5487         -
# 2  Extended offline    Completed without error       00%
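Comparing SMART attributes before and after an ATA error, as suggested earlier in the thread, can be done mechanically. The Python sketch below is an illustration only, not a replacement for smartd: it assumes smartctl's usual one-attribute-per-line layout with the raw value in the tenth column, the helper names are hypothetical, and the "after" rows are invented sample input. The "before" rows use values from the RE3 drive's output. Temperature and seek error rate are ignored by default because they were described as fluctuating naturally.

```python
def parse_attributes(text):
    """Return {attribute_name: raw_value} from smartctl attribute rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows: numeric ID, name, 0x flag, ..., raw value in column 10
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            raw = fields[9]
            attrs[fields[1]] = int(raw) if raw.isdigit() else raw
    return attrs

def changed(before, after, ignore=("Temperature_Celsius", "Seek_Error_Rate")):
    """Return {name: (old_raw, new_raw)} for attributes whose raw value differs."""
    return {name: (before.get(name), raw)
            for name, raw in after.items()
            if name not in ignore and before.get(name) != raw}

# "Before": rows with the values reported for the RE3 drive in this thread.
before_text = """
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   100   094   000    Old_age   Always       -       47
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
"""
# "After": HYPOTHETICAL rows, invented purely to exercise the comparison.
after_text = """
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   098   094   000    Old_age   Always       -       49
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
"""

print(changed(parse_attributes(before_text), parse_attributes(after_text)))
```

Only raw values are compared here; a fuller check would probably also look at the normalized VALUE and WORST columns against THRESH.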
Read / write timeouts on SATA disks connected to ICH9
Hi list,

I'm running FreeBSD 8.0-RELEASE-p1 on a Dell R300 which has an ICH9 SATA controller on-board (it does not have the RAID controller). The system has 2 disks in a gmirror setup. Every now and then, probably under some load, one of the disks gets read or write timeouts like:

May  5 03:01:37 aberdeen kernel: ad4: timeout waiting to issue command
May  5 03:01:37 aberdeen kernel: ad4: error issuing WRITE_DMA48 command
May  5 03:01:37 aberdeen kernel: GEOM_MIRROR: Request failed (error=5). ad4[WRITE(offset=200404975104, length=16384)]
May  5 03:01:37 aberdeen kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.

or:

May 13 14:41:26 aberdeen kernel: ad6: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=975513887

Sometimes the read/write succeeds after a few retries, but sometimes it does not, so geom_mirror throws the disk out of the mirror. Tonight ad6 was thrown out of the mirror and ad4 then gave actual read errors, resulting in a big mess :(

My question: does anyone have experience with FreeBSD on a Dell R300 or can anyone give me some help in trying to fix the timeouts? I was told using AHCI could be better for SATA disks, but apparently (http://permalink.gmane.org/gmane.linux.kernel.pci/8267) the BIOS does not support turning that on, so that does not appear to be an option.

Thanks, Pieter
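The GEOM_MIRROR message reports a byte offset and length, while the ata messages report an LBA, so relating the two takes a small conversion. The Python sketch below shows it under two assumptions that are not confirmed in the thread: the sector size is 512 bytes, and the mirror provider ad4 is the whole disk, so provider offsets equal on-disk offsets. The function name is hypothetical.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes

def request_to_lba_range(offset, length, sector_size=SECTOR_SIZE):
    """Convert a byte offset/length pair into (first_lba, last_lba)."""
    if offset % sector_size or length % sector_size:
        raise ValueError("request is not sector-aligned")
    first = offset // sector_size
    count = length // sector_size
    return first, first + count - 1

# Values from the failed WRITE reported by GEOM_MIRROR above.
print(request_to_lba_range(200404975104, 16384))  # → (391415967, 391415998)
```

Under these assumptions the failed 16384-byte write covers 32 sectors, which gives a concrete LBA range to look for in later kernel messages or to target with a SMART selective self-test.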
Re: Read / write timeouts on SATA disks connected to ICH9
Adam Vande More wrote:

>> May  5 03:01:37 aberdeen kernel: ad4: timeout waiting to issue command
>> May  5 03:01:37 aberdeen kernel: ad4: error issuing WRITE_DMA48 command
>> May  5 03:01:37 aberdeen kernel: GEOM_MIRROR: Request failed (error=5). ad4[WRITE(offset=200404975104, length=16384)]
>> May  5 03:01:37 aberdeen kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
>
> Have you tried replacing/checking the cables? Does it always happen to ad4? Your drive could be dying, try swapping it out and see if the errors persist.

It happens to both drives, and it also happened to both drives that I replaced with these a month ago. I didn't replace the cables back then, but they were correctly attached then and are now. Also, it would be odd for both cables to be broken at the same time.

-- Pieter
Re: Read / write timeouts on SATA disks connected to ICH9
>> My question: does anyone have experience with FreeBSD on a Dell R300 or can anyone give me some help in trying to fix the timeouts?
>
> Could you please do the following:
>
> - Provide output from vmstat -i
> - Provide output from dmesg | grep -i ata
> - Install ports/sysutils/smartmontools (5.40 or later) and provide full output from commands smartctl -a /dev/ad4 and smartctl -a /dev/ad6

The ad4 SMART output is showing errors, as this disk is indeed broken now. It wasn't before and it is a replacement of another disk that wasn't broken either. Grmbl, I now see reallocated sectors on ad6 as well, in the smartctl output. So both disks look wonky; although afaik that's not the main issue here.

I've attached the smartctl output as separate files. smartmontools 5.40 does not appear to exist; I used 5.39.1, the latest port version. Attached also the vmstat -i and dmesg output.

-- Pieter

smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.0-RELEASE-p1 i386] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Black family
Device Model:     WDC WD5001AALS-00L3B2
Serial Number:    WD-WCASYA964063
Firmware Version: 01.03B01
User Capacity:    500,107,862,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri May 14 23:01:49 2010 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:         (0x85) Offline data collection activity
                                               was aborted by an interrupting command from host.
                                               Auto Offline Data Collection: Enabled.
Self-test execution status:             ( 241) Self-test routine in progress...
                                               10% of test remaining.
Total time to complete Offline
data collection:                       (11160) seconds.
Offline data collection capabilities:   (0x7b) SMART execute Offline immediate.
                                               Auto Offline data collection on/off support.
                                               Suspend Offline collection upon new command.
                                               Offline surface scan supported.
                                               Self-test supported.
                                               Conveyance Self-test supported.
                                               Selective Self-test supported.
SMART capabilities:                   (0x0003) Saves SMART data before entering
                                               power-saving mode.
                                               Supports SMART auto save timer.
Error logging capability:               (0x01) Error logging supported.
                                               General Purpose Logging supported.
Short self-test routine
recommended polling time:               (   2) minutes.
Extended self-test routine
recommended polling time:               ( 131) minutes.
Conveyance self-test routine
recommended polling time:               (   5) minutes.
SCT capabilities:                     (0x3037) SCT Status supported.
                                               SCT Feature Control supported.
                                               SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       78
  3 Spin_Up_Time            0x0027   184   168   021    Pre-fail  Always       -       3791
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       992
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       827
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       990
192 Power-Off_Retract_Count 0x0032   199   199   000    Old_age   Always       -       989
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       992
194 Temperature_Celsius     0x0022   125   109   000    Old_age   Always       -       22
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   198   000    Old_age   Always       -       0
198