Re: problems with sata disks (taskqueue timeout)
On Sun, 29 Mar 2009 11:01:53 +0200 Marc UBM Bocklet u...@u-boot-man.de wrote:

> On Tue, 20 Jan 2009 08:08:29 +0100 Marc UBM Bocklet u...@u-boot-man.de wrote:
>
> > On Tue, 20 Jan 2009 09:39:51 +1100 Andrew Snow and...@modulus.org wrote:
> >
> > > I think that if you use eSATA you probably need dedicated eSATA controller ports. eSATA standard specifies a higher voltage for the longer cable distances. Judging from the sporadic problem reports, Promise TX4 is probably not the best at signal purity to begin with so using it for eSATA pushes it over the edge.
> > >
> > > Hope that helps,
> >
> > Thanks for the fast answer! :-)
> >
> > Although my version of the TX4 has two dedicated e-sata ports, the other posts seem to indicate that it got something to do with the controller (maybe signal purity, like you said). I'll try upgrading next and will report back after that.
>
> A very late followup here: I upgraded to the latest stable, but things did not improve:
>
> Mar 29 10:57:29 hamstor kernel: ad10: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=1087300992
> Mar 29 10:57:34 hamstor kernel: ad10: FAILURE - SET_MULTI status=51<READY,DSC,ERROR> error=4<ABORTED>
> Mar 29 10:57:34 hamstor kernel: ad10: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=1087300992
> Mar 29 10:57:34 hamstor kernel: ad10: FAILURE - WRITE_DMA48 status=ff<BUSY,READY,DMA_READY,DSC,DRQ,CORRECTABLE,INDEX,ERROR> error=ff<ICRC,UNCORRECTABLE,MEDIA_CHANGED,NID_NOT_FOUND,MEDIA_CHANGE_REQEST,ABORTED,NO_MEDIA,ILLEGAL_LENGTH> LBA=1087300992
> Mar 29 10:57:34 hamstor root: ZFS: vdev I/O failure, zpool=gedaerm path=/dev/ad10 offset=556698042368 size=131072 error=5
> Mar 29 10:57:43 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Mar 29 10:57:47 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Mar 29 10:57:51 hamstor kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Mar 29 10:57:55 hamstor kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Mar 29 10:57:55 hamstor kernel: ad10: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=1087301248
> Mar 29 10:57:55 hamstor kernel: ad10: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=1087301248
> Mar 29 10:58:00 hamstor kernel: ad10: FAILURE - SET_MULTI status=51<READY,DSC,ERROR> error=4<ABORTED>
> Mar 29 10:58:00 hamstor kernel: ad10: FAILURE - WRITE_DMA48 timed out LBA=1087301248
> Mar 29 10:58:00 hamstor root: ZFS: vdev I/O failure, zpool=gedaerm path=/dev/ad10 offset=556698173440 size=131072 error=5
>
> Any further ideas anybody? :-)

Another update: upgrading to -current dating from April 25th 2009 seems to have fixed the problem. I've encountered no errors as of yet, and I've copied about 250GB in large chunks, something that was sure to provoke the errors with -stable.

FreeBSD xxx 8.0-CURRENT FreeBSD 8.0-CURRENT #1: Sat Apr 25 13:33:18 CEST 2009 xxx:/usr/obj/usr/src/sys/xxx amd64

Bye
Marc

--
And what rough beast, its hour come round at last,
Slouches towards Bethlehem to be born?
W.B. Yeats, The Second Coming
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
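The status= and error= values in the log lines above are bitmasks, and the flag names printed next to them can be reproduced from the log itself: the status=ff and error=ff entries list all eight names, apparently from the highest bit down. A minimal Python sketch under that assumption (the `decode` helper and the table names are hypothetical, not part of any FreeBSD tool):

```python
# Flag names copied from the status=ff / error=ff log lines, ordered from
# bit 7 (0x80) down to bit 0 (0x01). The ordering is an assumption inferred
# from the log; it is consistent with the status=51 and error=4 entries.
STATUS_FLAGS = ["BUSY", "READY", "DMA_READY", "DSC", "DRQ",
                "CORRECTABLE", "INDEX", "ERROR"]
ERROR_FLAGS = ["ICRC", "UNCORRECTABLE", "MEDIA_CHANGED", "NID_NOT_FOUND",
               "MEDIA_CHANGE_REQEST", "ABORTED", "NO_MEDIA", "ILLEGAL_LENGTH"]


def decode(value, flags):
    """Return the names of the bits set in an 8-bit register value."""
    return [name for bit, name in zip(range(7, -1, -1), flags)
            if value & (1 << bit)]


if __name__ == "__main__":
    print(decode(0x51, STATUS_FLAGS))  # ['READY', 'DSC', 'ERROR']
    print(decode(0x04, ERROR_FLAGS))   # ['ABORTED']
```

A value of ff with every flag set, as in the WRITE_DMA48 failure, is often a sign that the register read returned all ones rather than a meaningful status, which would fit a controller or link problem better than a genuine media error.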
Re: problems with sata disks (taskqueue timeout)
Marc UBM wrote:

> Hiho! :-)
>
> Occasionally, especially when uploading a large number of files, the (brand-new, tested) sata disks in my fileserver spit out some of these errors:
>
> [...]
>
> Can anybody tell me if upgrading to 7.2 or -current will help? I'm currently running 7.0-STABLE-200804
>
> [...]
>
> Next step I'll try is upgrading to RELENG_7 to see if that helps.
>
> Greetings,
> Marc

Cheers Marc.

My personal experience makes me think that this issue is controller/driver related. I have been using a SATA 300 TX4 controller since the days of 6.1-RELEASE on my fileserver (with 2 of 4 ports used), and I saw a lot of exactly the same errors in the logs. Sometimes they were harmless, but sometimes, as a result, one of the disks magically disconnected from the controller, and the only way to get it back and working was to power the PC down and up again. That mostly happened during heavy I/O, such as dumping filesystems.

The good thing is that since 7.0-RELEASE I have seen such errors maybe 2-3 times, and I haven't seen them at all for at least 6 months. That is probably because I rebuild my system about once a month to keep up with the stable branch, and something was corrected in the sources during that time.

So I also advise upgrading to RELENG_7; you will probably get rid of these errors.

Good luck!

--
Bartosz Stec
problems with sata disks (taskqueue timeout)
Hiho! :-)

Occasionally, especially when uploading a large number of files, the (brand-new, tested) sata disks in my fileserver spit out some of these errors:

---
Jan 19 19:51:14 hamstor kernel: ad10: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882778752
Jan 19 19:51:23 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jan 19 19:51:27 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jan 19 19:51:31 hamstor kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Jan 19 19:51:35 hamstor kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
Jan 19 19:51:35 hamstor kernel: ad10: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=882778752
Jan 19 19:51:35 hamstor kernel: ad10: FAILURE - WRITE_DMA48 status=ff<BUSY,READY,DMA_READY,DSC,DRQ,CORRECTABLE,INDEX,ERROR> error=ff<ICRC,UNCORRECTABLE,MEDIA_CHANGED,NID_NOT_FOUND,MEDIA_CHANGE_REQEST,ABORTED,NO_MEDIA,ILLEGAL_LENGTH> LBA=882778752
Jan 19 19:51:35 hamstor root: ZFS: vdev I/O failure, zpool=gedaerm path=/dev/ad10 offset=451982655488 size=131072 error=5
Jan 19 19:51:41 hamstor kernel: ad10: FAILURE - SET_MULTI status=51<READY,DSC,ERROR> error=4<ABORTED>
Jan 19 19:51:41 hamstor kernel: ad10: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=882779008
Jan 19 19:51:41 hamstor kernel: ad10: WARNING - WRITE_DMA48 UDMA ICRC error (retrying request) LBA=882779008
Jan 19 19:51:50 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jan 19 19:51:54 hamstor kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Jan 19 19:51:58 hamstor kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Jan 19 19:52:02 hamstor kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
Jan 19 19:52:02 hamstor kernel: ad10: FAILURE - WRITE_DMA48 timed out LBA=882779008
Jan 19 19:52:02 hamstor root: ZFS: vdev I/O failure, zpool=gedaerm path=/dev/ad10 offset=451982786560 size=131072 error=5
---

I've fiddled with the cables, which seemed to help, but I've been unable to completely eliminate the errors.

The disks are two Western Digital MyBooks Home Edition (1 TB per disk), connected to a Promise TX 4 SATA Controller:

atap...@pci0:1:6:0: class=0x018000 card=0x3d17105a chip=0x3d17105a rev=0x02 hdr=0x00
    vendor = 'Promise Technology Inc'
    device = 'PDC40718-GP SATA 300 TX4 Controller'
    class  = mass storage

They're connected via 50cm esata cables.

I've googled on the net and found some vague hints about problems with the Promise TX4, but nothing concrete. What I've found is http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting basically telling me these things happen, deal with it :-)

The problem is that I cannot reproduce these problems reliably. The only thing I notice is that they *seem* to happen more often if a lot of large files are copied in succession.

Can anybody tell me if upgrading to 7.2 or -current will help? I'm currently running 7.0-STABLE-200804

FreeBSD 7.0-STABLE-200804 #0: Wed Dec 10 15:29:03 CET 2008 *...@host:/usr/obj/usr/src/sys/GENERIC amd64

Next step I'll try is upgrading to RELENG_7 to see if that helps.

Greetings,
Marc
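The LBA values in the kernel messages and the offsets in the ZFS messages above can be cross-checked with a little arithmetic. A small Python sketch, assuming 512-byte sectors (the thread does not state the sector size); the numbers are copied from the Jan 19 log and `lba_to_bytes` is a hypothetical helper:

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors

# (failing LBA from the kernel line, offset and size from the ZFS line)
events = [
    (882778752, 451982655488, 131072),
    (882779008, 451982786560, 131072),
]


def lba_to_bytes(lba, sector_size=SECTOR_SIZE):
    """Byte position on the disk of a logical block address."""
    return lba * sector_size


if __name__ == "__main__":
    for lba, offset, size in events:
        position = lba_to_bytes(lba)
        print(f"LBA {lba} -> byte {position}, "
              f"{position - offset} bytes past the ZFS offset, "
              f"request covers {size // SECTOR_SIZE} sectors")
```

Under that assumption both failing LBAs sit 65536 bytes past the reported ZFS offset, and the two failures are 256 sectors (one 131072-byte request) apart, so the kernel and ZFS messages appear to describe the same pair of consecutive writes.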
Re: problems with sata disks (taskqueue timeout)
I think that if you use eSATA you probably need dedicated eSATA controller ports. The eSATA standard specifies a higher voltage for the longer cable distances. Judging from the sporadic problem reports, the Promise TX4 is probably not the best at signal purity to begin with, so using it for eSATA pushes it over the edge.

Hope that helps,

- Andrew
Re: problems with sata disks (taskqueue timeout)
> I've fiddled with the cables, which seemed to help, but I've been unable to completely eliminate the errors.
>
> The disks are two Western Digital MyBooks Home Edition (1 TB per disk), connected to a Promise TX 4 SATA Controller:
>
> atap...@pci0:1:6:0: class=0x018000 card=0x3d17105a chip=0x3d17105a rev=0x02 hdr=0x00
>     vendor = 'Promise Technology Inc'
>     device = 'PDC40718-GP SATA 300 TX4 Controller'
>     class  = mass storage

I have a similar setup: same card, two WD disks. On 6.3 it was affected by the problem you mention, but when I moved to 7.0, it disappeared.
Re: problems with sata disks (taskqueue timeout)
On Mon, 19 Jan 2009, Marc UBM wrote:

> Hiho! :-)
>
> Occasionally, especially when uploading a large number of files, the (brand-new, tested) sata disks in my fileserver spit out some of these errors:

I've found that those kinds of errors are very, very controller-dependent.

Case in point: a 4-disk raidz on an ASUS board with a VIA SATA controller. The drives were attached to a HighPoint RocketRAID controller, then the data was moved off and the drives attached to the VIA controller. As soon as the raidz was created and data was being copied back to the array: taskqueue errors. So, back to the HighPoint controller.

I swapped out the board for another ASUS, but this time with the Q35 / ICH9 controller. Not a single problem whatsoever.

> [...]
Re: problems with sata disks (taskqueue timeout)
On Tue, 20 Jan 2009 09:39:51 +1100 Andrew Snow and...@modulus.org wrote:

> I think that if you use eSATA you probably need dedicated eSATA controller ports. eSATA standard specifies a higher voltage for the longer cable distances. Judging from the sporadic problem reports, Promise TX4 is probably not the best at signal purity to begin with so using it for eSATA pushes it over the edge.
>
> Hope that helps,

Thanks for the fast answer! :-)

Although my version of the TX4 has two dedicated e-sata ports, the other posts seem to indicate that it has something to do with the controller (maybe signal purity, like you said). I'll try upgrading next and will report back after that.

Bye
Marc
Re: SETFEATURES SET TRANSFER MODE taskqueue timeout.. Error occuring constantly.. Please help!!
I have made some changes and provided the requested details. The issue is still occurring, so if it looks like it's going to be more trouble than it's worth, I will probably just replace 3 of the PATA IDE disks with a SATA disk and just throw the remaining PATA disk on the Nvidia ATA controller?

Thanks for your help thus far! :)

On Sun, Oct 19, 2008 at 8:25 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote:

> On Sun, Oct 19, 2008 at 03:32:29AM +1100, Kristian Rooke wrote:
> > Thanks for the quick response! Please see requested output below:
>
> Cool, thanks. One thing I forgot to ask for was vmstat -i output.

interrupt              total    rate
irq1: atkbd0               6       0
irq6: fdc0                 1       0
irq14: ata0             2060       2
irq16: atapci1           612       0
irq17: em0               810       0
cpu0: timer          1812646    1998
cpu1: timer          1812344    1998
Total                3628479    4000

> For now, let's break it down for ease of understanding:
>
> FreeBSD 7.0-RELEASE i386, built February 2008.
>
> atapci0: nVidia nForce MCP73 ATA133 controller -- IRQ 14
> atapci1: Silicon Image 0680 ATA133 controller -- IRQ 16
>
> ata0: attached to atapci0
> ata1: attached to atapci0
> ata2: attached to atapci1
> ata3: attached to atapci1
>
> ad0: Seagate ST380011A 3.06 at ata0-master PIO4
> ad4: Seagate ST3320620A 3.AAF at ata2-master PIO4
> ad5: Seagate ST3320620A 3.AAF at ata2-slave PIO4
> ad6: Seagate ST3750640A 3.AAE at ata3-master PIO4
> ad7: Seagate ST3320620A 3.AAD at ata3-slave PIO4
>
> ATA errors are reported for disks ad4, ad5, ad6, and ad7. ad0 appears to be error-free.
>
> First and foremost: there are known problems with Silicon Image controllers on all operating systems (Windows, Linux, and FreeBSD in particular), known for causing data loss and other sporadic issues. This is at least confirmed on their SATA controllers, and I've become quite the "pick something else" advocate when it comes to their stuff. However: I've no idea about their PATA controllers.

I was originally using a Promise PATA IDE controller, but that's when the issues first began, so I bought a cheap Silicon Image IDE controller to replace it. After reading your email I have replaced the SI card with the Promise controller. Below is the detail from dmesg:

atapci1: Promise PDC20270 UDMA100 controller port 0xcf00-0xcf07,0xce00-0xce03,0xcd00-0xcd07,0xcc00-0xcc03,0xcb00-0xcb0f mem 0xefbf-0xefbf irq 16 at device 5.0 on pci1

> Secondly, so far there isn't any evidence that the ad0 disk, which uses the nVidia controller, has any problem -- all the disks having problems are on the Silicon Image controller. That is a very key piece of information here. If when you're writing data to, say, the ad4 disk, and you start to see errors on all disks (ad4 through ad7), then what this probably means is the controller has locked up or is behaving badly. This adds further evidence that the Silicon Image controller may be at fault here.
>
> Thirdly, you said the system requires a hard reset to get things back in working order. Sometimes this can be induced by a power supply that isn't providing decent/proper voltages, or is being overloaded, particularly during heavy disk I/O (drawing more power in some cases). It might be good to check your voltages inside of your system BIOS, write them down, and type them in here. FreeBSD does not provide a decent set of tools for monitoring this stuff inside the OS (yet; I'm working on it, mainly for server boards. I do what I can...)

When error messages (same as pasted previously) begin being displayed in console, the system becomes unresponsive. I can no longer SSH to the device, and when I attempt to use it via console it simply continues to constantly scroll the disk error messages.

I am currently using an Anter 550w PSU. Below are the voltage details from the BIOS:

Vcore - 1.19V
Vcc12V - 12.30V
Vcc3.3V - 3.28V
Vcc5.0V - 5.04V

> But keep in mind that a controller locking up hard could also require a hard reset (pressing reset on the front of the PC) -- a soft reset (Ctrl-Alt-Del) would probably work, except much of the running kernel is spinning hard trying to deal with ATA problems.
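The BIOS readings above can be compared against the nominal rail voltages. A rough Python sketch, assuming the ±5% tolerance that the ATX specification allows on the +12 V, +5 V and +3.3 V rails (the thread itself gives no tolerance, and Vcore is skipped because its target depends on the CPU); `within_tolerance` is a hypothetical helper:

```python
TOLERANCE = 0.05  # assumption: +/-5% as allowed by the ATX specification

# nominal rail voltage -> BIOS reading reported in the message above
readings = {12.0: 12.30, 5.0: 5.04, 3.3: 3.28}


def within_tolerance(nominal, measured, tolerance=TOLERANCE):
    """True if measured deviates from nominal by at most the tolerance."""
    return abs(measured - nominal) <= nominal * tolerance


if __name__ == "__main__":
    for nominal, measured in readings.items():
        deviation = (measured - nominal) / nominal * 100
        verdict = "ok" if within_tolerance(nominal, measured) else "OUT OF RANGE"
        print(f"{nominal:>5.1f} V rail: {measured:.2f} V ({deviation:+.1f}%) {verdict}")
```

All three readings fall inside that window, although values read at idle in the BIOS say little about how the supply behaves under heavy disk I/O.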
> Fourthly, I see a "some output omitted" line in your original dmesg. Can you provide that output? It's important -- sometimes people have seen issues where their ATA controller shows problems, but it turns out to be an IRQ sharing or device compatibility problem with another device (e.g. their board was showing ATA errors, but at the exact same time, also showing NIC watchdog timeouts or other anomalies). They omitted the dmesg data thinking it had nothing to do with the problem, when in fact it helps determine if the issue is truly with one piece or the entire system.

The "some output omitted" was simply repeats of error messages I previously
SETFEATURES SET TRANSFER MODE taskqueue timeout.. Error occuring constantly.. Please help!!
Hi,

I have a PC/box with 5 disks in it that I am using as a fileserver. I recently upgraded some hardware and installed FreeBSD 7.0-RELEASE. Previously I had a RAID PATA IDE controller on the motherboard (I was not using the RAID functionality though), but when I upgraded I had to use a PCI IDE controller, due to the lack of PATA ports on the new motherboard.

Now when I am attempting to write files, or do anything more than just browse filesystems on the drives ad4-ad7, I get multiple occurrences of the errors below. After these errors occur the kernel panics and I need to perform a hard reset to get the server back up again.

Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SET_MULTI taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad6: FAILURE - WRITE_DMA timed out LBA=163323135
Sep 28 11:40:28 FileServer kernel: g_vfs_done():ad6s1[WRITE(offset=83621412864, length=16384)]error = 5

some output omitted

Sep 28 11:40:28 FileServer kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: WARNING - SET_MULTI taskqueue timeout - completing request directly
Sep 28 11:40:28 FileServer kernel: ad5: FAILURE - WRITE_DMA timed out LBA=287

I have read through some previous conversations, but I can't seem to find the answers I'm looking for. I've changed the IDE cables and the PATA controller and it is still not making any difference. I also added hw.ata.ata_dma=0 to /boot/loader.conf as recommended in a wiki I came across, but if anything it made the issue even worse.

Can someone please help?

Thanks,
Kristian
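Jeremy Chadwick's reply in this thread maps each disk to a channel and controller, and the pattern in the log above (every failing disk sits behind the same controller) can be made explicit by tallying messages per controller. A minimal Python sketch; the disk-to-controller mapping is copied from that reply, while the regular expression and the `tally` helper are hypothetical and assume each log entry sits on its own line:

```python
import re
from collections import Counter

# Disk -> controller, as listed in the reply (ata0/ata1 hang off atapci0,
# ata2/ata3 hang off atapci1).
CONTROLLER = {"ad0": "atapci0", "ad4": "atapci1", "ad5": "atapci1",
              "ad6": "atapci1", "ad7": "atapci1"}

# Matches e.g. "kernel: ad6: WARNING - SET_MULTI taskqueue timeout ..."
LINE = re.compile(r"kernel: (ad\d+): (WARNING|FAILURE|TIMEOUT) - ")


def tally(lines):
    """Count WARNING/FAILURE/TIMEOUT messages per controller."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match:
            counts[CONTROLLER.get(match.group(1), "unknown")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Sep 28 11:40:28 FileServer kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly",
        "Sep 28 11:40:28 FileServer kernel: ad7: WARNING - SET_MULTI taskqueue timeout - completing request directly",
        "Sep 28 11:40:28 FileServer kernel: ad5: FAILURE - WRITE_DMA timed out LBA=287",
    ]
    print(tally(sample))
```

Fed the full log, a tally that lands entirely on one controller while the other stays at zero points at the controller (or its driver) rather than at any individual disk, which is the reasoning given in the reply.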
Re: SETFEATURES SET TRANSFER MODE taskqueue timeout.. Error occuring constantly.. Please help!!
On Sat, Oct 18, 2008 at 07:00:42PM +1100, Kristian Rooke wrote:
> I've changed the IDE cables and the PATA controller and it is still not making any difference. I also added hw.ata.ata_dma=0 to /boot/loader.conf as recommended in a wiki I came across, but if anything it made the issue even worse. Can someone please help?

Tracking these problems down takes a lot of time. I hope you have the time.
:-) Can you please provide the following output:

    # dmesg
    # pciconf -lv
    # atacontrol list

Also, please install ports/sysutils/smartmontools (version 5.38 or newer), and provide output for the following commands:

    # smartctl -a /dev/ad4
    # smartctl -a /dev/ad5
    # smartctl -a /dev/ad6
    # smartctl -a /dev/ad7

Thanks.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
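The per-disk smartctl runs requested above could be gathered in one pass. This is only a sketch: the function name collect_smart and the SMARTCTL override variable are made up for illustration, and it assumes smartmontools is already installed and that the script is run as root.

```sh
#!/bin/sh
# Sketch: run "smartctl -a" for each named disk and label each report.
# SMARTCTL and collect_smart are hypothetical names, not part of smartmontools.
SMARTCTL="${SMARTCTL:-smartctl}"

collect_smart() {
    for disk in "$@"; do
        echo "=== /dev/${disk} ==="
        # smartctl can exit non-zero when the drive has logged errors; keep going.
        "$SMARTCTL" -a "/dev/${disk}" || true
    done
}

# Example (as root):
#   collect_smart ad4 ad5 ad6 ad7 > smart-report.txt
```

Keeping all four reports in one labelled file makes it easier to paste the whole lot into a reply.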
Re: SETFEATURES SET TRANSFER MODE taskqueue timeout.. Error occuring constantly.. Please help!!
revision number 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

===

On Sat, Oct 18, 2008 at 9:24 PM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
> Can you please provide the following output:
>
>     # dmesg
>     # pciconf -lv
>     # atacontrol list
>
> Also, please install ports/sysutils/smartmontools (version 5.38 or newer), and provide output for the following commands:
>
>     # smartctl -a /dev/ad4
>     # smartctl -a /dev/ad5
>     # smartctl -a /dev/ad6
>     # smartctl -a /dev/ad7
Re: SETFEATURES SET TRANSFER MODE taskqueue timeout.. Error occuring constantly.. Please help!!
On Sun, Oct 19, 2008 at 03:32:29AM +1100, Kristian Rooke wrote:
> Thanks for the quick response! Please see requested output below:

Cool, thanks. One thing I forgot to ask for was vmstat -i output.

For now, let's break it down for ease of understanding:

FreeBSD 7.0-RELEASE i386, built February 2008.

atapci0: nVidia nForce MCP73 ATA133 controller -- IRQ 14
atapci1: Silicon Image 0680 ATA133 controller -- IRQ 16

ata0: attached to atapci0
ata1: attached to atapci0
ata2: attached to atapci1
ata3: attached to atapci1

ad0: Seagate ST380011A 3.06 at ata0-master PIO4
ad4: Seagate ST3320620A 3.AAF at ata2-master PIO4
ad5: Seagate ST3320620A 3.AAF at ata2-slave PIO4
ad6: Seagate ST3750640A 3.AAE at ata3-master PIO4
ad7: Seagate ST3320620A 3.AAD at ata3-slave PIO4

ATA errors are reported for disks ad4, ad5, ad6, and ad7. ad0 appears to be error-free.

First and foremost: there are known problems with Silicon Image controllers on all operating systems (Windows, Linux, and FreeBSD in particular), known for causing data loss and other sporadic issues. This is at least confirmed on their SATA controllers, and I've become quite the "pick something else" advocate when it comes to their stuff. However: I've no idea about their PATA controllers.

Secondly, so far there isn't any evidence that the ad0 disk, which uses the nVidia controller, has any problem -- all the disks having problems are on the Silicon Image controller. That is a very key piece of information here. If, when you're writing data to, say, the ad4 disk, you start to see errors on all disks (ad4 through ad7), then what this probably means is the controller has locked up or is behaving badly. This adds further evidence that the Silicon Image controller may be at fault here.

Thirdly, you said the system requires a hard reset to get things back in working order.
Sometimes this can be induced by a power supply that isn't providing decent/proper voltages, or is being overloaded, particularly during heavy disk I/O (drawing more power in some cases). It might be good to check your voltages inside of your system BIOS, write them down, and type them in here. FreeBSD does not provide a decent set of tools for monitoring this stuff inside the OS (yet; I'm working on it, mainly for server boards. I do what I can...)

But keep in mind that a controller locking up hard could also require a hard reset (pressing reset on the front of the PC) -- a soft reset (Ctrl-Alt-Del) would probably work, except much of the running kernel is spinning hard trying to deal with ATA problems.

Fourthly, I see a "some output omitted" line in your original dmesg. Can you provide that output? It's important -- sometimes people have seen issues where their ATA controller shows problems, but it turns out to be an IRQ sharing or device compatibility problem with another device (e.g. their board was showing ATA errors, but at the exact same time, also showing NIC watchdog timeouts or other anomalies). They omitted the dmesg data thinking it had nothing to do with the problem, when in fact it helps determine if the issue is truly with one piece or the entire system.

Next, let's take a look at your SMART output, which tells a tale of something very very bad:

Disk ad4 has a good temperature, and no sign of bad blocks/sectors. The disk has been powered on for a total of 7799 hours. There was a CRC error detected when attempting to set specific capabilities on the device. The error occurred at LBA 0 on the disk, which is completely bizarre, but the SMART error log might just say LBA 0 to indicate no LBA was being accessed (e.g. the error was purely during the mode setting attempts).
However, the SMART error log wraps its timestamps at 49.710 days (about every 1193 hours), so it's going to be difficult to determine if the SMART error log entry for this disk was from long ago, or was fairly recent. Looking at other disks might help, so let's continue.

Disk ad5 has an excellent temperature, and no sign of bad blocks/sectors either. The disk has been powered on for a total of 11956 hours. No errors were found in the SMART log.

Disk ad6 has a good temperature, and no sign of bad blocks/sectors. No errors were found in the SMART log.

Disk ad7 has an excellent temperature, and no sign of bad blocks/sectors either. The disk has been powered on for a total of 12512 hours. However, much like disk ad4, this disk also witnessed a CRC error when attempting to either do a DMA read operation or when setting capabilities on the device. I'm prone to believe it's when setting capabilities, because LBA 0 is also seen here, which isn't a likely LBA. This error happened at the 6310 hour mark, which was about half of its lifetime ago.

All of this is somewhat of a mystery. Disk ad4 is on a completely different physical cable than disk ad7, so that *could* rule out cabling problems. The errors seen are only when setting device capabilities (making an educated guess, but I'm not 100% positive), not
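The SMART timestamp wrap interval mentioned in this message can be checked with a little arithmetic: 49.710 days works out to about 1193 hours, which is what a 32-bit millisecond counter gives before it rolls over. That the drive's error log really uses such a counter is an inference from the 49.710-day figure, not something stated in the SMART output, and the function names below are made up for illustration.

```sh
#!/bin/sh
# Sketch: time until a 32-bit millisecond counter wraps (2^32 ms).
# Assumption: the SMART error-log timestamp is such a counter.
wrap_hours() {
    awk 'BEGIN { printf "%.3f\n", 4294967296 / 1000 / 3600 }'
}

wrap_days() {
    awk 'BEGIN { printf "%.3f\n", 4294967296 / 1000 / 3600 / 24 }'
}
```

A disk with 7799 power-on hours would have wrapped several times under this assumption, which is why a single log timestamp cannot be placed in its lifetime with any confidence.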
Re: taskqueue timeout
Steve Bertrand wrote:
> I'm wondering if the problems described in the following link have been resolved:
>
> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2008-02/msg00211.html
>
> I've got four 500GB SATA disks in a ZFS raidz pool, and all four of them are experiencing the behavior.

Thanks to all who have provided patches off list. Unfortunately, none of them helped.

The only other box I have with four SATA ports on it is my actual workstation. The board is an ASUS P5GD1, and has an Intel 82801FR SATA controller. I despise the thought that if this works, I'll have to rebuild my workstation, but here's to sacrificing my Windows PC in the name of ruling out the problem.

In the meantime, can anyone provide any feedback on the board I mentioned in regards to FreeBSD?

Steve
Re: taskqueue timeout [SOLVED]
Steve Bertrand wrote:
> The only other box I have with four SATA ports on it is my actual workstation. The board is ASUS P5GD1, and has an Intel 82801FR SATA controller.

I transferred the SATA disks to the above board, loaded up the zpool, and I cannot reproduce the problem :)

Currently, for the last 15 minutes, I've been writing 80MB/s to the zpool with no problems.

Thanks all,

Steve
taskqueue timeout
Hi everyone,

I'm wondering if the problems described in the following link have been resolved:

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2008-02/msg00211.html

I've got four 500GB SATA disks in a ZFS raidz pool, and all four of them are experiencing the behavior.

The problem only happens with extreme disk activity. The box becomes unresponsive (cannot SSH etc). Keyboard input is displayed on the console, but the commands are not accepted.

Is there anything I can do to either figure this out, or work around it?

Steve
Re: taskqueue timeout
:Hi everyone,
:
:I'm wondering if the problems described in the following link have been
:resolved:
:
:http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2008-02/msg00211.html
:
:I've got four 500GB SATA disks in a ZFS raidz pool, and all four of them
:are experiencing the behavior.
:
:The problem only happens with extreme disk activity. The box becomes
:unresponsive (can not SSH etc). Keyboard input is displayed on the
:console, but the commands are not accepted.
:
:Is there anything I can do to either figure this out, or work around it?
:
:Steve

If you are getting DMA timeouts, go to this URL:

    http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

Then I would suggest going into /usr/src/sys/dev/ata (I think, on FreeBSD), locate all instances where request->timeout is set to 5, and change them all to 10.

    cd /usr/src/sys/dev/ata
    fgrep 'request->timeout' *.c
    ... change all assignments of 5 to 10 ...

Try that first. If it helps then it is a known issue. Basically a combination of the on-disk write cache and possible ECC corrections, remappings, or excessive remapped sectors can cause the drive to take much longer than normal to complete a request. The default 5-second timeout is insufficient.

If it does help, post confirmation to prod the FBsd developers to change the timeouts.

--

If you are NOT getting DMA timeouts then the ZFS lockups may be due to buffer/memory deadlocks. ZFS has knobs for adjusting its memory footprint size. Lowering the footprint ought to solve (most of) those issues. It's actually somewhat of a hard issue to solve. Filesystems like UFS aren't complex enough to require the sort of dynamic memory allocations deep in the filesystem that ZFS and HAMMER need to do.

-Matt
Matthew Dillon [EMAIL PROTECTED]
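The timeout edit suggested in this message could be scripted roughly as follows. It is a sketch only: it assumes the assignments appear literally as `request->timeout = 5;` in the .c files (the message does not show the source lines, and the arrow is assumed to have been lost in the archived text), and bump_timeouts is a made-up helper name. Run it against a copy of the tree first.

```sh
#!/bin/sh
# Sketch: rewrite "request->timeout = OLD;" to "request->timeout = NEW;"
# in every .c file of a directory. bump_timeouts is a hypothetical helper.
bump_timeouts() {
    dir="$1"; old="$2"; new="$3"
    for f in "$dir"/*.c; do
        [ -f "$f" ] || continue
        # Portable in-place edit: write to a temp file, then move it back.
        sed "s/request->timeout = ${old};/request->timeout = ${new};/" "$f" > "$f.tmp" &&
            mv "$f.tmp" "$f"
    done
}

# Example:
#   bump_timeouts /usr/src/sys/dev/ata 5 10
#   fgrep 'request->timeout' /usr/src/sys/dev/ata/*.c   # confirm the change
```

The kernel still has to be rebuilt and installed afterwards for the new value to take effect.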
Re: taskqueue timeout
Matthew Dillon wrote:
> If you are getting DMA timeouts, go to this URL:

Yes, I am.

> http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

I fall under the category of ATA/SATA DMA timeout issues.

> Then I would suggest going into /usr/src/sys/dev/ata (I think, on FreeBSD), locate all instances where request->timeout is set to 5, and change them all to 10.
>
>     cd /usr/src/sys/dev/ata
>     fgrep 'request->timeout' *.c
>     ... change all assignments of 5 to 10 ...
>
> Try that first. If it helps then it is a known issue. Basically a combination of the on-disk write cache and possible ECC corrections, remappings, or excessive remapped sectors can cause the drive to take much longer than normal to complete a request. The default 5-second timeout is insufficient.
>
> If it does help, post confirmation to prod the FBsd developers to change the timeouts.

I've just reproduced the problem, and will try hacking the code now to see if the problem goes away.

Since the box won't take input, I can't tell the disk usage at the time it dies. However, it seems to appear while running an Amanda backup, when my network throughput hits ~90 Mbps @ ~5 kpps.

I'll post back with the results of increasing the timeout.

Steve
Re: taskqueue timeout
Matthew Dillon wrote:
> If you are getting DMA timeouts, go to this URL:
>
> http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting
>
> Then I would suggest going into /usr/src/sys/dev/ata (I think, on FreeBSD), locate all instances where request->timeout is set to 5, and change them all to 10.
>
>     cd /usr/src/sys/dev/ata
>     fgrep 'request->timeout' *.c
>     ... change all assignments of 5 to 10 ...

Changing 5 to 10 in all cases and rebuilding the kernel does not fix the problem. I'm going to install the patch that allows the values to be changed via sysctl and raise it to 15.

This problem happens across all four disks. Does anyone else have any suggestions on what I can check?

Steve
Re: taskqueue timeout
Steve Bertrand wrote:
> Matthew Dillon wrote:
>> If you are getting DMA timeouts, go to this URL:
>>
>> http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting
>>
>> Then I would suggest going into /usr/src/sys/dev/ata (I think, on FreeBSD), locate all instances where request->timeout is set to 5, and change them all to 10.
>>
>>     cd /usr/src/sys/dev/ata
>>     fgrep 'request->timeout' *.c
>>     ... change all assignments of 5 to 10 ...
>
> Changing 5 to 10 in all cases and rebuilding the kernel does not fix the problem.

Went from 10-15, and it took quite a bit longer into the backup before the problem cropped back up.

Here is what I was seeing at the time it failed. Where netstat and zpool iostat drop off is where I start seeing the errors occur:

# top
last pid: 1069;  load averages: 0.09, 0.17, 0.10  up 0+00:08:31  19:22:39
53 processes: 1 running, 52 sleeping
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 28M Active, 3644K Inact, 301M Wired, 76K Cache, 1634M Free
Swap:

# netstat -w 1 -h
      4.8K     0       11M       3.5K     0      5.4M     0
      4.5K     0       10M       3.3K     0      5.1M     0
      4.9K     0       11M       3.6K     0      5.5M     0
      4.8K     0       11M       3.5K     0      5.4M     0
      4.3K     0      9.5M       3.1K     0      4.8M     0
      5.1K     0       11M       3.7K     0      5.7M     0
      5.0K     0       11M       3.6K     0      5.6M     0
      5.3K     0       12M       3.9K     0      6.0M     0
      4.8K     0       11M       3.5K     0      5.4M     0
      4.7K     0       10M       3.4K     0      5.2M     0
      4.8K     0       11M       3.5K     0      5.4M     0
      4.6K     0       10M       3.4K     0      5.2M     0
      4.1K     0      9.1M       3.0K     0      4.6M     0
      5.3K     0       12M       3.9K     0      6.0M     0
      5.2K     0       12M       3.8K     0      5.8M     0
      4.3K     0      9.5M       3.1K     0      4.8M     0
      4.3K     0      9.6M       3.2K     0      4.9M     0
      5.4K     0       12M       4.0K     0      6.1M     0
      4.8K     0       11M       3.5K     0      5.4M     0
      2.4K     0      5.1M       1.7K     0      2.5M     0
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         2     0        120          2     0        316     0
         3     0        180          4     0       1.0K     0
         3     0        180          2     0        316     0
         3     0        180          3     0        658     0
         5     0       1.6K          5     0        942     0
         3     0        254          4     0        840     0
         3     0        180          2     0        316     0

# zpool iostat 1
storage     6.40G  1.81T      0    296      0  37.0M
storage     6.43G  1.81T      0    188      0  14.5M
storage     6.43G  1.81T      0      0      0      0
storage     6.43G  1.81T      0      0      0      0
storage     6.43G  1.81T      0      0      0      0
storage     6.43G  1.81T      0     47      0  5.99M
storage     6.46G  1.81T      0    218      0  18.0M
storage     6.46G  1.81T      0      0      0      0
storage     6.46G  1.81T      0      0      0      0
storage     6.46G  1.81T      9      0   192K      0
storage     6.46G  1.81T      0     59      0  7.39M
storage     6.49G  1.81T      1    250  3.42K  14.9M
storage     6.49G  1.81T      0      0      0      0
storage     6.49G  1.81T      0      0      0      0
storage     6.49G  1.81T      0      0      0      0
storage     6.49G  1.81T      0    141      0  17.5M
storage     6.52G  1.81T      0     74      0   232K
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0    151      0  18.8M
storage     6.52G  1.81T      0    114      0  8.07M
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0      0      0      0
storage     6.52G  1.81T      0      0      0      0

Don't know if this will help anyone or not.

Steve
Re: taskqueue timeout
:Went from 10-15, and it took quite a bit longer into the backup before
:the problem cropped back up.

Try 30 or longer. See if you can make the problem go away entirely. Then fall back to 5 and see if the problem resumes at its earlier pace.

--

It could be temperature related. The drives are being exercised a lot, they could very well be overheating. To find out, add more airflow (a big house fan would do the trick).

--

It could be that errors are accumulating on the drives, but it seems unlikely that four drives would exhibit the same problem.

--

Also make sure the power supply can handle four drives. Most power supplies that come with consumer boxes can't under full load if you also have a mid or high-end graphics card installed. Power supplies that come with OEM slap-together enclosures are not usually much better.

Specifically, look at the +5V and +12V amperage maximums on the power supply, then check the disk labels to see what they draw, then multiply by 2. e.g. if your power supply can do [EMAIL PROTECTED] and you have four drives each taking [EMAIL PROTECTED] (and typically ~half that at 5V), that's 4x2x2 = [EMAIL PROTECTED] and you would probably be ok.

To test, remove two of the four drives, reformat the ZFS to use just 2, and see if the problem reoccurs with just two drives.

-Matt
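The power-supply rule of thumb in this message (drive count times per-drive current, doubled for headroom) is easy to put in a few lines. The figures in the example are placeholders that mirror the 4x2x2 in the message, reading the middle 2 as amps per drive on the +12V rail; that reading is an interpretation, since the actual ratings were mangled by the archive. The function names are hypothetical.

```sh
#!/bin/sh
# Sketch of the headroom check: drives * amps per drive * safety factor.
# Whole amps only; all names and numbers here are illustrative.
required_amps() {
    drives="$1"; amps_each="$2"; factor="${3:-2}"
    echo $(( $drives * $amps_each * $factor ))
}

# Succeeds (exit 0) when the rail rating covers the requirement.
rail_ok() {
    rail_amps="$1"; shift
    [ "$rail_amps" -ge "$(required_amps "$@")" ]
}

# Example, with a made-up 18 A rail rating:
#   rail_ok 18 4 2 && echo "probably ok"
```

The same check would be repeated for the +5V rail with that rail's own per-drive figure taken from the disk label.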
Re: taskqueue timeout
Don't want to give conflicting advice, and would suggest you certainly try the 30 sec thing first. I'm already on 10 myself but haven't pushed further.

In my own case I've not had any issue with ZFS in particular since I applied the ZFS zil/prefetch disable loader.conf tunables 10 hours ago. I am observing this now.

For the record:

What ATA chipset/motherboard and model of disk have you got?
Have you seen any SMART errors (real or otherwise)?
What do your 'zpool status' counters look like?

--
Alex

On Tue, 2008-07-15 at 12:55 -0700, Matthew Dillon wrote:
> :Went from 10-15, and it took quite a bit longer into the backup before
> :the problem cropped back up.
>
> Try 30 or longer. See if you can make the problem go away entirely. then fall back to 5 and see if the problem resumes at its earlier pace.
Re: taskqueue timeout
Matthew Dillon wrote:
> Try that first. If it helps then it is a known issue. Basically a combination of the on-disk write cache and possible ECC corrections, remappings, or excessive remapped sectors can cause the drive to take much longer than normal to complete a request. The default 5-second timeout is insufficient.

From Western Digital's line of enterprise drives:

"RAID-specific time-limited error recovery (TLER) - Pioneered by WD, this feature prevents drive fallout caused by the extended hard drive error-recovery processes common to desktop drives."

Western Digital's information sheet on TLER states that they found most RAID controllers will wait 8 seconds for a disk to respond before dropping it from the RAID set. Consequently they changed their enterprise drives to try reading a bad sector for only 7 seconds before returning an error.

Therefore I think the FreeBSD timeout should also be set to 8 seconds instead of 5 seconds. Desktop-targeted drives can fail to respond for over 10 seconds, up to minutes, so it's not worth setting the FreeBSD timeout any higher.

More info:
http://www.wdc.com/en/library/sata/2579-001098.pdf
http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery

- Andrew
Re: taskqueue timeout
Matthew Dillon wrote:
> :Went from 10-15, and it took quite a bit longer into the backup before
> :the problem cropped back up.

Jumping right into it; there is another post after this one, but I'm going to try to reply inline:

> Try 30 or longer. See if you can make the problem go away entirely. then fall back to 5 and see if the problem resumes at its earlier pace.

I'm sure 30 will either push the issue out further, or into non-existence, but are there any developers here who can say what this timer does? I.e., how does changing this timer affect the performance of the disk subsystem (aside from allowing it to work, of course)?

After I'm done responding to this message, I'll be testing with the sysctl set to 30.

> It could be temperature related. The drives are being exercised a lot, they could very well be overheating. To find out add more airflow (a big house fan would do the trick).

Temperature is a good thought, but currently, my physical situation is this:

- 2U chassis
- multiple fans in the case
- in my lab (which is essentially beside my desk)
- the case has no lid
- it is 64 degrees with A/C and circulating fans in this area
- hard drives are separated relatively well inside the case

> It could be that errors are accumulating on the drives, but it seems unlikely that four drives would exhibit the same problem.

That's what I'm thinking. All four drives are exhibiting the same errors... or, for all intents and purposes, the machine is coughing up the same errors for all the drives.

> Also make sure the power supply can handle four drives. Most power supplies that come with consumer boxes can't under full load if you also have a mid or high-end graphics card installed. Power supplies that come with OEM slap-together enclosures are not usually much better.

I currently have a 550W PSU in the 2U chassis, which again, is sitting open. I have more hardware, running in worse conditions with lower-wattage PSUs, that doesn't exhibit this behavior. I need to determine whether this problem is SATA, ZFS, the motherboard or code.

> Specifically, look at the +5V and +12V amperage maximums on the power supply, then check the disk labels to see what they draw, then multiply by 2. e.g. if your power supply can do [EMAIL PROTECTED] and you have four drives each taking [EMAIL PROTECTED] (and typically ~half that at 5V), thats 4x2x2 = [EMAIL PROTECTED] and you would probably be ok.

I'm well within specs, even after V/A tests with the meter. The power supply is providing ample wattage to each device.

> To test, remove two of the four drives, reformat the ZFS to use just 2, and see if the problem reoccurs with just two drives.

... I knew that was going to come up... my response is that I worked so hard to get this system with ZFS all configured *exactly* how I wanted it.

To test, I'm going to flip to 30 as per Matthew's recommendation, and see how far that takes me. At this time, I'm only testing by backing up one machine on the network. If it fails, I'll clock the time, and then 'reformat' with two drives.

Is there a technical reason this may work better with only two drives?

Is there anyone interested to the point where remote login would be helpful?

Steve
Re: taskqueue timeout
On Tue, Jul 15, 2008 at 10:29:28PM -0400, Steve Bertrand wrote:
> Is there anyone interested to the point where remote login would be helpful?

I believe my FreeBSD Wiki page documents what to do if your problem is easily reproducible: contact Scott Long, who has offered to help track down the source of these problems.

I'll reply to the other part of your mail in a bit.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: taskqueue timeout
Alex Trull wrote:
> Don't want to give conflicting advice, and would suggest you certainly try the 30 sec thing first. I'm already on 10 myself but haven't pushed further.

What were you doing, and what did you notice when the problem started? As much as it seems silly, I'm mostly interested in what your network was doing at the time things went sour.

> In my own case I've not had any issue with zfs in particular since I applied the ZFS zil/prefetch disable loader.conf tunables 10 hours ago.

I am observing this now. For some reason, and with no explanation or science behind it, I don't think this is a ZFS problem, and I'm trying to defend this thought to my peers until I prove otherwise.

I have to be a bit careful about how I adjust loader properties, given that I'm loading from USB, and mounting root from a ZFS zpool hard disk. Like my GELI systems, tweaking things can be a bit touchy unless I put a little more planning into it.

> For the record .. What ata chipset/motherboard and model of disk have you got ?

I'm not a hardware person per se, but I'm advised to post that the motherboard is:

- XFX nForce 610i with GeForce 7050

If there is more hardware info I can provide, let me know specifically what I should be looking for.

> Have you seen any smart errors (real or otherwise) ? What do your 'zpool status' counters look like ?

zpool status is always clean. There are no errors otherwise, even if the box is up for multiple hours straight. The problem occurs only if I throw work at it.

Steve
Re: taskqueue timeout
Jeremy Chadwick wrote:
> On Tue, Jul 15, 2008 at 10:29:28PM -0400, Steve Bertrand wrote:
>> Is there anyone interested to the point where remote login would be helpful?
>
> I believe my FreeBSD Wiki page documents what to do if your problem is easily reproducible: contact Scott Long, who has offered to help track down the source of these problems.

Changing to a 30 second timeout made no difference whatsoever. The problem occurred at about the same time during the single [...]

I'm at a standstill. I'm willing to help provide any information necessary to fix this issue, or provide remote access to the box in question.

scottl@ has been Cc:'d.

Thanks all,

Steve
Re: taskqueue timeout
:...
: and see if the problem reoccurs with just two drives.
:
:... I knew that was going to come up... my response is I worked so hard
:to get this system with ZFS all configured *exactly* how I wanted it.
:
:To test, I'm going to flip to 30 as per Matthews recommendation, and see
:how far that takes me. At this time, I'm only testing by backing up one
:machine on the network. If it fails, I'll clock the time, and then
:'reformat' with two drives.
:
:Is there a technical reason this may work better with only two drives?
:
:Is there anyone interested to the point where remote login would be helpful?
:
:Steve

This issue is vexing a lot of people.

Setting the timeout to 30 will not affect performance, but it will cause a 30 second delay in recovery when (if) the problem occurs. i.e. when the disk stalls it will just sit there doing nothing for 30 seconds, then it will print the timeout message and try to recover.

It occurs to me that it might be beneficial to actually measure the disk's response time to each request, and then graph it over a period of time. Maybe seeing the issue visually will give some clue as to the actual cause.

-Matt
Matthew Dillon
[EMAIL PROTECTED]
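The measurement proposed in this message (time each disk request, then look at the series over time) could be approximated from userland along the following lines. This is a rough sketch under stated assumptions, not a tool anyone in the thread used: the function names, the 128 KiB block size and the threshold are all hypothetical, and timing reads through a regular file may mostly measure the cache rather than the disk, so a raw device node is the more meaningful target.

```python
import os
import time


def measure_read_latencies(path, block_size=128 * 1024, count=64):
    """Read up to `count` consecutive blocks from `path`, timing each read.

    Returns a list of (offset, seconds) tuples, one per completed read.
    """
    samples = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for i in range(count):
            offset = i * block_size
            start = time.perf_counter()
            data = os.pread(fd, block_size, offset)
            elapsed = time.perf_counter() - start
            if not data:  # end of file or device reached
                break
            samples.append((offset, elapsed))
    finally:
        os.close(fd)
    return samples


def slow_requests(samples, threshold_s=1.0):
    """Return the samples whose latency exceeded `threshold_s` seconds."""
    return [(offset, t) for offset, t in samples if t > threshold_s]
```

Writing the (offset, seconds) pairs to a file and plotting them would give the graph described above; a stall of the kind discussed in this thread should stand out as an isolated spike of several seconds.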
Re: taskqueue timeout
Andrew Snow wrote:
> From Western Digital's line of enterprise drives: RAID-specific time-limited error recovery (TLER) - Pioneered by WD, this feature prevents drive fallout caused by the extended hard drive error-recovery processes common to desktop drives.
>
> Therefore I think the FreeBSD timeout should also be set to 8 seconds instead of 5 seconds. Desktop-targeted drives will not respond for over 10 seconds, up to minutes, so it's not worth setting the FreeBSD timeout any higher.

Interesting you say this.

To reiterate, I have /boot on a USB thumb drive, and the system is mounted from / on a raidz pool called /storage via loader.conf.

The four drives in question (per the packaging) are:

- Western Digital Caviar SE16 500GB - 7200, 16MB, SATA-300, OEM

Per the packaging on the rest of the hardware:

# mobo
- XFX 610i, 7050 GeForce (I *never* use graphics on my FreeBSD boxen, I *only* know/have CLI with no 'windows')

# memory
- 2 GB Corsair XMS2 Twin2X 6400C4 memory

# cpu
- Intel Pentium DC E2200 2.20GHz OEM - 2.20 GHz, 1MB Cache, 800MHz FSB, Allendale, Dual Core, OEM, Socket 775, Processor

# swap
- I don't run any, but can/will add in an IDE/ATA 7200 200GB in the event this problem may be related to ZFS/RAM issues.

Steve
Re: taskqueue timeout
Matthew Dillon wrote:
> This issue is vexing a lot of people.

Heh... I can appreciate this. I would like someone to inform me that this can't be guaranteed to be a ZFS problem... if I can get confirmation that others have this issue aside from ZFS, I would feel content.

> Setting the timeout to 30 will not affect performance, but it will cause a 30 second delay in recovery when (if) the problem occurs. i.e. when the disk stalls it will just sit there doing nothing for 30 seconds, then it will print the timeout message and try to recover.

If I have the timeout at >= 30 and the issue still occurs, the problem must be elsewhere.

> It occurs to me that it might be beneficial to actually measure the disk's response time to each request, and then graph it over a period of time. Maybe seeing the issue visually will give some clue as to the actual cause.

I am interested in following through with this, but can't do it on my own. I'm willing to dedicate the box and bandwidth to anyone who can legitimately test this as you state, i.e., I need either guidance or assistance. This box is ready for the taking.

Beyond this box, I can provide legitimate parties other network resources to produce a consistent flow of data to ensure the ability to easily reproduce the issue locally, on demand.

Steve
Re: any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
On Mon, May 12, 2008 at 02:42:40PM +0900, dikshie wrote:
> I just phoned the computer store and we checked the BIOS together. It seems the store put the disk in IDE mode, NOT AHCI mode. After changing to AHCI, FreeBSD can now detect SATA 300 (for WDC) and SATA 150 (for DVD-R).

Great, there you go. :-)

> BUT sometimes I am still getting:
>
> ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
> ad4: warning - setfeatures enable rcache taskqueue timeout - completing request directly
> ad4: warning - setfeatures enable wcache taskqueue timeout - completing request directly
> ad4: timeout - flushcache retrying (0 retries left)
> geom_journal: flush cache of ad4s1d: error=5
> ad4: timeout - write_dma retrying (1 retry left) LBA=4139103

1) Have you checked SMART statistics of the drive, or run SMART tests? Install ports/sysutils/smartmontools and use smartctl -a /dev/ad4, and provide the output.

2) Is the error always on ad4? If so, is the error always at LBA 4139103, or around there (give or take a few thousand addressing blocks)? If so, the ad4 disk may be going bad.

Otherwise, I would say this is probably the issue I've documented on my Common Issues page.

--
Jeremy Chadwick
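Question 2 above (do the failures always land at or near the same LBA?) is easy to check mechanically once a few log lines have been collected. The following is a small sketch, not part of any FreeBSD tool: the helper names are hypothetical, and the 5000-block window is just one reading of "give or take a few thousand addressing blocks".

```python
import re

# Matches the "LBA=<number>" field printed by the ata(4) driver messages in this thread.
LBA_RE = re.compile(r"LBA=(\d+)")


def lbas_from_log(lines):
    """Extract every LBA value, in order, from an iterable of log lines."""
    return [int(m.group(1)) for line in lines for m in LBA_RE.finditer(line)]


def clustered(lbas, window=5000):
    """True if all reported LBAs fall within `window` blocks of each other."""
    return bool(lbas) and max(lbas) - min(lbas) <= window


log = [
    "ad4: timeout - flushcache retrying (0 retries left)",
    "ad4: timeout - write_dma retrying (1 retry left) LBA=4139103",
]
print(lbas_from_log(log), clustered(lbas_from_log(log)))
```

If the LBAs cluster, a failing region on the disk becomes the more likely explanation; if they are scattered across the disk, that points away from a localized media defect.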
Re: any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
On Mon, May 12, 2008 at 3:00 PM, Jeremy Chadwick [EMAIL PROTECTED] wrote: 1) Have you checked SMART statistics of the drive, or run SMART tests? Install ports/sysutils/smartmontools and use smartctl -a /dev/ad4, and provide the output. 2) Is the error always on ad4? If so, is the error always at LBA 4139103, or around there (give or take a few thousand addressing blocks)? If so, the ad4 disk may be going bad. Otherwise, I would say this is probably the issue I've documented on my Common Issues page. dhcp-143-221# smartctl -a /dev/ad4 smartctl version 5.38 [amd64-portbld-freebsd7.0] Copyright (C) 2002-8 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Model Family: Western Digital Caviar Second Generation Serial ATA family Device Model: WDC WD3200AAKS-00VYA0 Serial Number:WD-WCARW2314765 Firmware Version: 12.01B02 User Capacity:320,072,933,376 bytes Device is:In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is:Mon May 12 15:27:24 2008 JST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (8760) seconds. Offline data collection capabilities:(0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. 
SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 104) minutes. Conveyance self-test routine recommended polling time:( 5) minutes. SCT capabilities: (0x303f) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 200 200 051Pre-fail Always - 0 3 Spin_Up_Time0x0003 155 155 021Pre-fail Always - 5233 4 Start_Stop_Count0x0032 100 100 000Old_age Always - 11 5 Reallocated_Sector_Ct 0x0033 200 200 140Pre-fail Always - 0 7 Seek_Error_Rate 0x000e 200 200 051Old_age Always - 0 9 Power_On_Hours 0x0032 099 099 000Old_age Always - 940 10 Spin_Retry_Count0x0012 100 253 051Old_age Always - 0 11 Calibration_Retry_Count 0x0012 100 253 051Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000Old_age Always - 11 192 Power-Off_Retract_Count 0x0032 200 200 000Old_age Always - 8 193 Load_Cycle_Count0x0032 200 200 000Old_age Always - 11 194 Temperature_Celsius 0x0022 114 108 000Old_age Always - 33 196 Reallocated_Event_Count 0x0032 200 200 000Old_age Always - 0 197 Current_Pending_Sector 0x0012 200 200 000Old_age Always - 0 198 Offline_Uncorrectable 0x0010 200 200 000Old_age Offline - 0 199 UDMA_CRC_Error_Count0x003e 200 200 000Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 051Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure
any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
Hi, I got : ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly ad4: warning - setfeatures enable rcache taskqueue timeout - completing request directly ad4: warning - setfeatures enable wcache taskqueue timeout - completing request directly ad4: timeout - flushcache retrying (0 retries left) geom_journal: flush cache of ad4s1d: error=5 ad4: timeout - write_dma retrying (1 retry left) LBA=4139103 --- I read: http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues and strange output from dmesg: ad4: 305245MB WDC WD3200AAKS-00VYA0 12.01B02 at ata2-master UDMA33 ^^^ any available patches ? best regards, -dikshie- Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.0-STABLE #19: Sat May 10 15:41:00 JST 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/BARU Timecounter i8254 frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz (2333.34-MHz K8-class CPU) Origin = GenuineIntel Id = 0x6fb Stepping = 11 Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE Features2=0xe3fdSSE3,RSVD2,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM AMD Features=0x20100800SYSCALL,NX,LM AMD Features2=0x1LAHF Cores per package: 2 usable memory = 2000646144 (1907 MB) avail memory = 1926553600 (1837 MB) ACPI APIC Table: Nvidia NVDAACPI FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ACPI Warning (tbfadt-0505): Optional field Pm2ControlBlock has zero address or length:0 0/1 [20070320] ioapic0: Changing APIC ID to 4 ioapic0 Version 1.1 irqs 0-23 on motherboard kbd1 at kbdmux0 cryptosoft0: software crypto on motherboard acpi0: Nvidia NVDAACPI on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) 
acpi0: reservation of 10, 7fdf (3) failed acpi0: reservation of 0, a (3) failed Timecounter ACPI-safe frequency 3579545 Hz quality 850 acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0 acpi_hpet0: High Precision Event Timer iomem 0xfeff-0xfeff03ff on acpi0 Timecounter HPET frequency 2500 Hz quality 900 cpu0: ACPI CPU on acpi0 est0: Enhanced SpeedStep Frequency Control on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 72a072a0600072a device_attach: est0 attach returned 6 p4tcc0: CPU Frequency Thermal Control on cpu0 cpu1: ACPI CPU on acpi0 est1: Enhanced SpeedStep Frequency Control on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 72a072a0600072a device_attach: est1 attach returned 6 p4tcc1: CPU Frequency Thermal Control on cpu1 acpi_button0: Power Button on acpi0 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0 pci0: ACPI PCI bus on pcib0 pci0: memory, RAM at device 0.1 (no driver attached) pci0: memory, RAM at device 1.0 (no driver attached) pci0: memory, RAM at device 1.1 (no driver attached) pci0: memory, RAM at device 1.2 (no driver attached) pci0: memory, RAM at device 1.3 (no driver attached) pci0: memory, RAM at device 1.4 (no driver attached) pci0: memory, RAM at device 1.5 (no driver attached) pci0: memory, RAM at device 1.6 (no driver attached) pci0: memory, RAM at device 2.0 (no driver attached) isab0: PCI-ISA bridge at device 3.0 on pci0 isa0: ISA bus on isab0 pci0: serial bus, SMBus at device 3.1 (no driver attached) pci0: memory, RAM at device 3.2 (no driver attached) pci0: memory, RAM at device 3.4 (no driver attached) ohci0: OHCI (generic) USB controller mem 0xe000-0xefff irq 21 at device 4.0 on pci0 ohci0: [GIANT-LOCKED] ohci0: [ITHREAD] usb0: OHCI version 1.0, legacy support usb0: SMM does not respond, resetting usb0: OHCI (generic) USB controller on ohci0 usb0: USB revision 1.0 uhub0: nVidia OHCI root hub, class 9/0, 
rev 1.00/1.00, addr 1 on usb0 uhub0: 10 ports with 10 removable, self powered ehci0: EHCI (generic) USB 2.0 controller mem 0xefffe000-0xefffe0ff irq 22 at device 4.1 on pci0 ehci0: [GIANT-LOCKED] ehci0: [ITHREAD] usb1: EHCI version 1.0 usb1: companion controller, 10 ports each: usb0 usb1: EHCI (generic) USB 2.0 controller on ehci0 usb1: USB revision 2.0 uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 on usb1 uhub1: 10 ports with 10 removable, self powered umass0: Generic USB2.0-CRW, class 0/0, rev 2.00/11.22, addr 2 on uhub1 atapci0: nVidia nForce MCP73 UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 8.0 on pci0 ata0: ATA channel 0 on atapci0 ata0: [ITHREAD] ata1: ATA channel 1 on atapci0 ata1: [ITHREAD] pci0: multimedia
Re: any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
On Mon, May 12, 2008 at 01:17:16PM +0900, dikshie wrote:
> and strange output from dmesg:
> ad4: 305245MB WDC WD3200AAKS-00VYA0 12.01B02 at ata2-master UDMA33
>                                     ^^^
> any available patches ?

What's strange about this?

--
Jeremy Chadwick
Re: any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
dikshie wrote:
> atapci1: nVidia ATA controller port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xf700-0xf70f mem 0xefff8000-0xefff9fff irq 20 at device 14.0 on pci0

It seems your controller is detected as generic ATA. Can you show `pciconf -l` output from your system?

--
WBR, Andrey V. Elsukov
Re: any available patches for sata: ad4: warning - setfeatures set transfer mode taskqueue timeout - completing request directly
On Mon, May 12, 2008 at 02:14:37PM +0900, dikshie wrote:
> On Mon, May 12, 2008 at 2:11 PM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
>> On Mon, May 12, 2008 at 01:17:16PM +0900, dikshie wrote:
>>> and strange output from dmesg:
>>> ad4: 305245MB WDC WD3200AAKS-00VYA0 12.01B02 at ata2-master UDMA33
>>>                                     ^^^
>>> any available patches ?
>>
>> What's strange about this?
>
> I mean the UDMA33: it should be SATA300. Do I have to upgrade the BIOS?

Your carets are pointing to the drive firmware revision, which is why I was confused. :-)

Yes, it should say either SATA150 or SATA300 (more on that in a moment), but based on your dmesg output, it appears your SATA controller does not have an attached driver, and thus is operating generically. Andrey recommended showing pciconf -lv output; please do.

Your drive is *probably* operating in SATA150/300 mode, despite UDMA33 being printed, however. Assuming you know your disk can push 33MB/sec, you could try some simple read I/O and use gstat to watch the speed. dd if=/dev/ad4 of=/dev/null bs=1m should suffice.

Regarding SATA150 vs. SATA300: your drive has a physical jumper, labelled OPT1 in the below photo, which limits the drive to SATA150 capability. You can remove this jumper and get SATA300. This doesn't explain the UDMA33 issue, though.

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1400p_created=1134597011

Also, please don't remove the mailing list from the CC; others need to know what information you've provided, and future users may find this thread useful.

--
Jeremy Chadwick
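The read test suggested above (a plain sequential read while watching the transfer rate) can also be done with a few lines of script when dd and gstat are not convenient. This is a hedged sketch only: the function names and the 256 MiB read cap are hypothetical choices, and 33 MB/s is simply the nominal UDMA33 ceiling used as the comparison point. A sustained rate clearly above it would suggest the link is not really running at UDMA33, whatever dmesg prints.

```python
import time

UDMA33_MB_S = 33.0  # nominal UDMA33 transfer ceiling in MB/s


def sequential_read_rate(path, block_size=1024 * 1024, max_bytes=256 * 1024 * 1024):
    """Read `path` sequentially in `block_size` chunks, up to `max_bytes`.

    Returns (bytes_read, megabytes_per_second).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(block_size, max_bytes - total))
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return total, total / elapsed / 1e6


def exceeds_udma33(rate_mb_s):
    """True if the measured rate is above the nominal UDMA33 ceiling."""
    return rate_mb_s > UDMA33_MB_S
```

On the system in this thread the path would be the raw device (for example /dev/ad4, read as root); reading a regular file instead may report cache speed rather than disk speed.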
Promise PDC20378 - SETFEATURES SET TRANSFER MODE taskqueue timeout
Everyone,

I just recently updated my primary server to the latest FreeBSD RELENG_6 release last weekend and have started receiving the following errors every day, requiring me to power off the computer (the console is hung and Ctrl-Alt-Del doesn't work):

Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad6: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=1129375
Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Oct 18 23:15:02 saturn kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=1129375

For some strange reason the above error last night didn't cause the typical hang, but this evening it happened again @ ~9:30pm (17:30) with a hard failure, resulting in a power-cycle to get the server running again.

I originally thought one of the drives was going bad, so I replaced the existing 200GB Maxtor PATA with two 500GB WD SATA. Therefore, that rules out cables and drives and returns me to the motherboard (Promise Controller) or FreeBSD.
The following is the output from a pciconf -lv [EMAIL PROTECTED]:0:0: class=0x06 card=0x80f61043 chip=0x25788086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82875P/E7210 DRAM Controller / Host-Hub Interface' class = bridge subclass = HOST-PCI [EMAIL PROTECTED]:1:0: class=0x060400 card=0x chip=0x25798086 rev=0x02 hdr=0x01 vendor = 'Intel Corporation' device = '82875P PCI-to-AGP Bridge' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:29:0:class=0x0c0300 card=0x80a61043 chip=0x24d28086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) USB UHCI Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:29:1:class=0x0c0300 card=0x80a61043 chip=0x24d48086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) USB UHCI Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:29:2:class=0x0c0300 card=0x80a61043 chip=0x24d78086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) USB UHCI Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:29:3:class=0x0c0300 card=0x80a61043 chip=0x24de8086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) USB UHCI Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:29:7:class=0x0c0320 card=0x80a61043 chip=0x24dd8086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) USB 2.0 EHCI Controller' class = serial bus subclass = USB [EMAIL PROTECTED]:30:0:class=0x060400 card=0x chip=0x244e8086 rev=0xc2 hdr=0x01 vendor = 'Intel Corporation' device = '82801BA/CA/DB/DBL/EB/ER/FB (ICH2/3/4/4/5/5/6), 6300ESB Hub Interface to PCI Bridge' class = bridge subclass = PCI-PCI [EMAIL PROTECTED]:31:0:class=0x060100 card=0x chip=0x24d08086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) LPC Interface Bridge' class = bridge subclass = PCI-ISA [EMAIL PROTECTED]:31:1: class=0x01018a card=0x80a61043 chip=0x24db8086 rev=0x02 hdr=0x00 
vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) EIDE Controller' class = mass storage subclass = ATA [EMAIL PROTECTED]:31:3:class=0x0c0500 card=0x80a61043 chip=0x24d38086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' device = '82801EB/ER (ICH5/ICH5R) SMBus Controller' class = serial bus subclass = SMBus [EMAIL PROTECTED]:31:5:class=0x040100 card=0x80f31043 chip=0x24d58086 rev=0x02 hdr
Re: Promise PDC20378 - SETFEATURES SET TRANSFER MODE taskqueue timeout
Jeff Doolittle wrote:
> Everyone,
>
> I just recently updated my primary server to the latest FreeBSD RELENG_6 release last weekend and have started receiving the following errors every day requiring me to power off the computer (the console is hung and Ctrl-Alt-Del don't work):
>
> Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad6: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=1129375
> Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Oct 18 23:15:02 saturn kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=1129375

[...]

I have had the same problem many times, and only a mainboard replacement solved it. The last time I saw these errors (1 week ago) was in a dying Asus RS-120 which had been running 6.2-RELEASE for about 6 months. So the problem is not related to 6.2-RELEASE, but to hardware.

Miroslav Lachman
Re: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - on FreeBSD 6-STABLE
Have you tried it with a livecd or something?

+++ Peter van Heusden [freebsd] [24/03/06 09:51 +0200]:
> Hi
>
> After my previous email about the SETFEATURES SET TRANSFER MODE timeout on (msgid [EMAIL PROTECTED] , 17 March 14:18 GMT + 2 on freebsd-stable), I installed FreeBSD 6.1 BETA 4 and upgraded to a 6-STABLE kernel, running the box in 'safe' mode to do so. I now, however, get a slightly different error message:
>
> ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: FAILURE - WRITE_DMA timed out LBA=32804495
>
> (The address after LBA is not always the same)
>
> This is with ad4 as a Seagate ST320423A on a Promise PDC20262 UDMA66 controller.
>
> Any suggestions?
>
> Thanks,
> Peter

--
b1tt3r -- You know, like sugar?
Sam Stein
Computer TeXnician/Programmer
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - on FreeBSD 6-STABLE
Hi

After my previous email about the SETFEATURES SET TRANSFER MODE timeout on (msgid [EMAIL PROTECTED] , 17 March 14:18 GMT + 2 on freebsd-stable), I installed FreeBSD 6.1 BETA 4 and upgraded to a 6-STABLE kernel, running the box in 'safe' mode to do so. I now, however, get a slightly different error message:

ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: FAILURE - WRITE_DMA timed out LBA=32804495

(The address after LBA is not always the same)

This is with ad4 as a Seagate ST320423A on a Promise PDC20262 UDMA66 controller.

Any suggestions?

Thanks,
Peter