Re: [zfs-discuss] scrub halts
Will Murnane wrote:
> On Feb 12, 2008 4:45 AM, Lida Horn <[EMAIL PROTECTED]> wrote:
>> The latest changes to the sata and marvell88sx modules have been put back to Solaris Nevada and should be available in the next build (build 84). Hopefully, those of you who use them will find the changes helpful.
>
> I have indeed found it beneficial. I installed the new drivers on two machines, both of which were intermittently giving errors about device resets. One card did this so often that I believed the card was faulty and I would have to replace either the card or the motherboard.

I'm glad you find the new modules useful and am pleased with your results. One thing of which I would like you to be aware is that some of what was done was to suppress the messages. In other words, some of what was happening before is still happening, just silently.

> Since installing the new drivers I've had no issues whatsoever with drives on either box. I ran zpool scrubs continuously on the flaky box, replaced a disk with another one, and copied data about in an attempt to replicate the bus errors I had previously seen, to no avail. The other box has been similarly stable, as far as I can tell; I see no messages in the logs and the users haven't complained when I asked them.

"No issues whatsoever" -- wonderful words to hear!

> Thank you for the work you've put into improving the state of these drivers; I meant to email you earlier this week and mention the great strides they have made, but other things took precedence. That, to my mind, is the primary evolution these drivers have made: I don't have to worry about my HBAs any more.

I appreciate your taking the time to post and hope you have no further issues with the driver.

Thank you,
Lida

> Thanks!
> Will
Re: [zfs-discuss] scrub halts
The latest changes to the sata and marvell88sx modules have been put back to Solaris Nevada and should be available in the next build (build 84). Hopefully, those of you who use them will find the changes helpful.
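For anyone waiting on the fix, a quick way to confirm which Nevada build a machine is actually running (per the note above, snv_84 or later should carry these changes); both commands below are stock Solaris:

    # The Nevada build number shows up in the kernel version string
    uname -v               # e.g. "snv_84"
    head -1 /etc/release   # the first line names the build as well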
Re: [zfs-discuss] scrub halts
On Feb 12, 2008 4:45 AM, Lida Horn <[EMAIL PROTECTED]> wrote:
> The latest changes to the sata and marvell88sx modules have been put back to Solaris Nevada and should be available in the next build (build 84). Hopefully, those of you who use them will find the changes helpful.

I have indeed found it beneficial. I installed the new drivers on two machines, both of which were intermittently giving errors about device resets. One card did this so often that I believed the card was faulty and I would have to replace either the card or the motherboard.

Since installing the new drivers I've had no issues whatsoever with drives on either box. I ran zpool scrubs continuously on the flaky box, replaced a disk with another one, and copied data about in an attempt to replicate the bus errors I had previously seen, to no avail. The other box has been similarly stable, as far as I can tell; I see no messages in the logs and the users haven't complained when I asked them.

Thank you for the work you've put into improving the state of these drivers; I meant to email you earlier this week and mention the great strides they have made, but other things took precedence. That, to my mind, is the primary evolution these drivers have made: I don't have to worry about my HBAs any more.

Thanks!
Will
Re: [zfs-discuss] scrub halts
I now have improved sata and marvell88sx driver modules that deal with various error conditions in a much more solid way. Changes include reducing the number of required device resets, properly reporting media errors (rather than "no additional sense"), and clearing aborted packets more rapidly, so that after a hardware error progress is again made much more quickly. Further, the driver is much quieter (far fewer messages in /var/adm/messages).

If there is still interest, I can make those binaries available for testing prior to their availability in Solaris Nevada (OpenSolaris). These changes will be checked in soon, but the process always inserts a significant delay, so if anyone would like them, please e-mail me and I will make the binaries available via e-mail.

Regards,
Lida
Re: [zfs-discuss] scrub halts
I can confirm that the marvell88sx driver (or kernel 64a) regularly hangs the SATA card (SuperMicro 8-port) with the message about a port being reset. The hang is temporary but troublesome. It can be relieved by turning off NCQ in /etc/system with:

    set sata:sata_func_enable = 0x5
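For anyone trying this workaround, a minimal sketch of the full procedure. The 0x5 value is taken straight from this thread; I'm assuming it masks off the NCQ feature bit while leaving the rest of the sata framework defaults alone, so check it against your build before relying on it:

    # Append the tunable to /etc/system; it only takes effect on reboot
    echo 'set sata:sata_func_enable = 0x5' >> /etc/system
    reboot

    # After the reboot, confirm the value the kernel actually picked up
    echo 'sata`sata_func_enable/X' | mdb -k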
Re: [zfs-discuss] scrub halts
> I can confirm that the marvell88sx driver (or kernel 64a) regularly hangs the SATA card (SuperMicro 8-port) with the message about a port being reset. The hang is temporary but troublesome. It can be relieved by turning off NCQ in /etc/system with set sata:sata_func_enable = 0x5

Thanks for the info. I'll have to give this a try. BTW, I've verified that this happens on build 70 as well.

Gary
Re: [zfs-discuss] scrub halts
> I can confirm that the marvell88sx driver (or kernel 64a) regularly hangs the SATA card (SuperMicro 8-port) with the message about a port being reset. The hang is temporary but troublesome.

This could be bug 6553517, which was fixed in build 66.

Casper
Re: [zfs-discuss] scrub halts
On Tue, 14 Aug 2007, Richard Elling wrote:
> Rick Wager wrote:
>> We see similar problems on a SuperMicro with 5 500 GB Seagate sata drives. This is using the AHCI driver. We do not, however, see problems with the same hardware/drivers if we use 250GB drives.
> Duh. The error is from the disk :-)

A likely possibility is that the disk drives are simply not getting enough (cool) airflow and are over-heating during periods of high system activity that generate a lot of disk head movement; for example, during a zpool scrub. And the extra platters present in the larger disk drives would require even more cooling capacity - which would validate your observations.

Best to actually *measure* the effectiveness of the disk cooling design/installation. Recommendation: investigate the Fluke mini infrared thermometers - for example, the Fluke 62 at:

    http://www.testequipmentdepot.com/fluke/thermometers/62.htm

In some disk drive installations, it's possible for the infrared probe to see the disk HDA (Head Disk Assembly) without disturbing the drive.

PS: I use a much older Fluke 80T-IR in combination with a digital multimeter with millivolt resolution (a Fluke meter, of course!).

>> We sometimes see bad blocks reported (are these automatically remapped somehow so they are not used again?) and sometimes sata port resets.
> Depending on how the errors are reported, the driver may attempt a reset to clear. The drive may also automatically spare bad blocks.
>> Here is a sample of the log output. Any help understanding and/or resolving this issue greatly appreciated. I very much don't want to have freezes in production.
>>
>> Aug 14 11:20:28 chazz1 port 2: device reset
>> Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED],2/[EMAIL PROTECTED],0 (sd3):
>> Aug 14 11:20:28 chazz1   Error for Command: write   Error Level: Retryable
>> Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Requested Block: 530   Error Block: 530
>> Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Vendor: ATA   Serial Number:
>> Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Sense Key: No_Additional_Sense
>> Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
> This error was transient and retried. If it was a fatal error (still failed after retries) then you'll have another, different message describing the failed condition.
> -- richard

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133  Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
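If it helps to quantify how often the resets and retryable errors are actually happening (for example, to correlate them with scrub activity or ambient temperature), a rough sketch using only stock tools; the strings grepped for are taken from the log excerpt above, so adjust them for your box:

    # Count reset and retryable-write events in the current message log
    grep -c 'device reset' /var/adm/messages
    grep -c 'Error Level: Retryable' /var/adm/messages

    # Per-device soft/hard/transport error counters kept by the kernel;
    # look for the device that corresponds to sd3 in the messages above
    iostat -En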
Re: [zfs-discuss] scrub halts
Al,

That makes so much sense that I can't believe I missed it. One bay was the one giving me the problems. Switching drives didn't affect it. Switching cabling didn't affect it. Changing SATA controllers didn't affect it. However, reorienting the case on its side did! I'll be putting a larger fan into the disk-stack case.

Gary

Al Hopper wrote:
> A likely possibility is that the disk drives are simply not getting enough (cool) airflow and are over-heating during periods of high system activity that generate a lot of disk head movement; for example, during a zpool scrub. And the extra platters present in the larger disk drives would require even more cooling capacity - which would validate your observations. [...]
Re: [zfs-discuss] scrub halts
We see similar problems on a SuperMicro with 5 500 GB Seagate sata drives. This is using the AHCI driver. We do not, however, see problems with the same hardware/drivers if we use 250GB drives.

We sometimes see bad blocks reported (are these automatically remapped somehow so they are not used again?) and sometimes sata port resets.

Here is a sample of the log output. Any help understanding and/or resolving this issue greatly appreciated. I very much don't want to have freezes in production.

    Aug 14 11:20:28 chazz1 port 2: device reset
    Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED],2/[EMAIL PROTECTED],0 (sd3):
    Aug 14 11:20:28 chazz1   Error for Command: write   Error Level: Retryable
    Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Requested Block: 530   Error Block: 530
    Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Vendor: ATA   Serial Number:
    Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   Sense Key: No_Additional_Sense
    Aug 14 11:20:28 chazz1 scsi: [ID 107833 kern.notice]   ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
Re: [zfs-discuss] scrub halts
Rick Wager wrote:
> Thanks Richard! That's the way I read the errors, also; they seem to indicate bad blocks on the drives. The bad news is that when they occur, access to the zfs file system stops for quite a long time - seemingly from 30 seconds to a minute or longer.

They might be bad blocks, though usually we get more info than "no additional sense info". 30 seconds is a typical default retry timeout. The file system will seem to stop because it is ATA and can't handle multiple I/O operations concurrently.

> Do you have a recommendation for how to identify and map the bad blocks so they are not used again? Should I fill my disk with data in order to identify the bad blocks?

The format command has a number of media scan and repair options.

> Also, for what it's worth, I've been running a simple test on my system to copy a large number of files around a zpool in order to fill it up and verify the zpool is working reliably. Just using a simple shell script, with very bad results:
>
> - In one window the shell script has frozen for about an hour now; the cp command is just hung.
> - ls of the directories in the zfs file system just hangs.
> - In another window zpool status is also hung and never returns:
>
>     chazz1## zpool status                          [11:54:49]
>       pool: fmpool
>      state: ONLINE
>      scrub: scrub completed with 0 errors on Tue Aug 14 12:01:40 2007
>     ^C^C^C^C^C^C^C^Z

Typical reaction for a malfunctioning disk.

> - And this is the output from zpool iostat:
>
>     chazz1* zpool iostat 10 10                     [12:53:01]
>                    capacity     operations    bandwidth
>     pool         used  avail   read  write   read  write
>     ----------  -----  -----  -----  -----  -----  -----
>     fmpool       186G  2.08T    131    121  15.4M  13.3M
>     fmpool       186G  2.08T      0      0      0      0
>     fmpool       186G  2.08T      0      0      0      0
>     fmpool       186G  2.08T      0      0      0      0
>     fmpool       186G  2.08T      0      0      0      0
>     fmpool       186G  2.08T      0      0      0      0
>
> I think we'll have to reboot to clear this frozen condition. Any thoughts?

According to the ahci man page, the driver does not yet support NCQ, which would also be consistent with the observed behaviour. Do the disks work ok in other machines?
-- richard

> Thanks,
> Rick
>
> Richard Elling wrote:
>> Rick Wager wrote:
>>> We see similar problems on a SuperMicro with 5 500 GB Seagate sata drives. This is using the AHCI driver. We do not, however, see problems with the same hardware/drivers if we use 250GB drives. [...]
>
> --
> Rick Wager
> email: [EMAIL PROTECTED]
> 303-818-0576 (mobile)
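For the archives: the media scan Richard refers to lives under format(1M)'s analyze menu. A rough interactive session is sketched below from memory; the read test is non-destructive, but other options in the same menu overwrite data, so read the prompts carefully before confirming anything:

    format                    # then pick the suspect disk from the list
    format> analyze
    analyze> read             # non-destructive surface scan of the disk
    analyze> quit
    format> quit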
Re: [zfs-discuss] scrub halts
Gary Gendel wrote:
> Thanks for the information. I am using the marvell88sx driver on a vanilla Sunfire v20z server. This project has gone through many frustrating phases... [...] Now I'm left with the situation that I described.

I've had the same issue on my box. It often (somewhere between every 30 minutes and every 4 hours) resets the sata ports. It continues fine, but it does halt the machine for some time. Recently, after I gave up on debugging and took the machine into production, it has started to freeze randomly about once per day.
Re: [zfs-discuss] scrub halts
> 6564677 oracle datafiles corrupted on thumper

wow, must be a huuge database server! :D
Re: [zfs-discuss] scrub halts
Thanks for the information. I am using the marvell88sx driver on a vanilla Sunfire v20z server. This project has gone through many frustrating phases...

Originally I tried a Si3124 board with the box running a 5-to-1 Sil SATA multiplexer. The controller didn't understand the multiplexer, so I put in a second board and drove the drives directly. However, this didn't work well either and would lock up periodically. I added some published driver patches, which made things better, but I would still get periodic kernel panics because of a recursive mutex call.

So, I bought the Supermicro 8-channel SATA Marvell card. I tried the multiplexer again, but no luck, so I'm driving each drive separately again. Occasionally, I would have a system lockup and could only force the system to power down. I believe that this may have been due to a flaky SATA connection internal to the box. Now I'm left with the situation that I described.

Gary
[zfs-discuss] scrub halts
I've got a five-disk, 500 GB SATA RAID-Z stack running under build 64a. I have two problems that may or may not be interrelated.

1) zpool scrub stops. If I do a zpool status, it merrily continues for a while. I can't see any pattern in this behavior with repeated scrubs.

2) Bad blocks on one disk. This is repeatable, so I'm sending the disk back for replacement. (1) doesn't seem to correlate to the time I hit the bad blocks, so I don't think this is related. However... when it does hit those blocks, I not only get media sense read errors, but the sata port is dropped and reconnected. I think the driver probably does a port reset, but I figured I'd note it for discussion.

Is there a way to remap the bad blocks for zfs? There were only a small number (19) that it hit during the scrub.

I'd like to hear some general comments about these issues I'm having with zfs.

Thanks,
Gary
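For what it's worth, a crude way to tell a genuinely halted scrub from one that is merely slow; the pool name "tank" is a stand-in for whatever the pool is called, and only stock zpool commands are used:

    # Log scrub progress once a minute; if the completion percentage
    # stops moving while the pool shows zero reads, the scrub really
    # has stalled rather than just slowed down.
    while true; do
        date
        zpool status tank | grep 'scrub'
        zpool iostat tank 1 2 | tail -1
        sleep 60
    done

On the bad-block question, my (hedged) understanding is that a scrub on raidz rewrites any blocks it repairs from parity, and the rewrite gives the drive a chance to spare out the bad sectors itself; I don't know of a zfs-level bad-block map you can edit directly.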
Re: [zfs-discuss] scrub halts
Gary Gendel wrote:
> I've got a five-disk, 500 GB SATA RAID-Z stack running under build 64a. I have two problems that may or may not be interrelated.
>
> 1) zpool scrub stops. If I do a zpool status, it merrily continues for a while. I can't see any pattern in this behavior with repeated scrubs.
>
> 2) Bad blocks on one disk. [...]

Are you using the marvell88sx driver to attach your sata disks? If you are, then perhaps this putback from yesterday is what you are in need of:

    6564677 oracle datafiles corrupted on thumper
    http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6564677

James C. McPherson
--
Solaris kernel software engineer
Sun Microsystems
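If you're not sure which driver is actually bound to the controller, a quick check with stock tools; the driver names below are just the usual suspects from this thread, and the output format varies by build:

    # Show the device tree with bound driver names
    prtconf -D | egrep -i 'marvell|ahci|si3124'

    # Confirm which of those modules is loaded, with its version string
    modinfo | egrep -i 'marvell|ahci|si3124'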
Re: [zfs-discuss] scrub halts
James C. McPherson wrote:
> Gary Gendel wrote:
>> I've got a five-disk, 500 GB SATA RAID-Z stack running under build 64a. I have two problems that may or may not be interrelated. [...]
>
> Are you using the marvell88sx driver to attach your sata disks? If you are, then perhaps this putback from yesterday is what you are in need of:
>
>     6564677 oracle datafiles corrupted on thumper
>     http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6564677

This bug and related fix have nothing to do with the zfs scrub issue. Regardless, I would like to know if this is happening with the marvell88sx driver (and if so, on what hardware) or with some other driver and hardware.

Regards,
Lida Horn