Re: zfs i/o hangs on 9-PRERELEASE
Hi all,

Just wanted to report back that I found time to do more diagnostics. ZFS and FreeBSD are not to blame. Neither ever reported any I/O errors, and scrubs always came up clean, because the failing disk had trouble reading but would eventually return accurate data. It was sort of like having a RAIDZ with one disk that sometimes had 30s latency! SeaTools confirmed this disk was failing, and after replacing it all issues went away.

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
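A slow-but-correct disk like the one described above never shows up as an I/O error, so the only symptom is latency. The following is a minimal sketch of how such an outlier could be spotted, assuming per-disk read latencies (in milliseconds) have already been collected somehow, for example by sampling gstat(8). The device names, the sample values, and the `slow_disks` helper are all hypothetical and only illustrate the idea.

```python
from statistics import median


def slow_disks(latencies_ms, factor=10.0):
    """Return the disks whose worst sample exceeds `factor` times the
    typical latency, where "typical" is the median of the per-disk medians."""
    per_disk_median = {disk: median(vals) for disk, vals in latencies_ms.items()}
    baseline = median(per_disk_median.values())
    return sorted(
        disk for disk, vals in latencies_ms.items() if max(vals) > factor * baseline
    )


# Hypothetical samples: ada1 usually looks healthy but occasionally stalls for 30s.
samples = {
    "ada0": [8, 9, 12, 10],
    "ada1": [9, 11, 10, 30000],
    "ada2": [7, 10, 9, 11],
}

print(slow_disks(samples))
```

Comparing each disk's worst sample against the pool-wide median, rather than against its own average, keeps a single huge stall from hiding inside an otherwise normal-looking mean.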
Re: zfs i/o hangs on 9-PRERELEASE
Pawel Jakub Dawidek wrote:
> On Fri, Nov 25, 2011 at 01:20:01PM -0600, Mark Felder wrote:
> > 13:14:32 nas:~ > uname -a
> > FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 r227971M:
> > Fri Nov 25 10:07:48 CST 2011
> > r...@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64
> >
> > This seemed to start happening sometime after RC1. I tried 8-STABLE and
> > it's happening there too right now. I think whatever caused this was
> > MFC'd. I've also reproduced this on completely different hardware
> > running a single disk ZFS pool.
> >
> > I'm getting this output in dmesg after these hangs I keep seeing.
>
> Mark, those backtraces are not related to ZFS, but to PF. Not sure if
> they are at all related to your hangs. Most cases where ZFS I/O seems to
> hang are hardware problems, where I/O requests are not completed.
>

He recently posted that his hangs went away when he stopped using NFS.
NFS does use uma_zalloc(), and there are several places in pfioctl() where
uma_zalloc(...M_WAITOK...) is called (hidden under pool_get()) while a
mutex (the PF_LOCK() one) is held. I've emailed bz@ about this.

I'm also not sure whether they could be related to his hangs, but it seems
that if uma_zalloc() decides to sleep with the mutex held, something may
break, and a broken uma_zalloc() would impact NFS.

rick

> 'procstat -kk -a' output might be useful once the hang happens.
>
> > uma_zalloc_arg: zone "pfrktable" with the following non-sleepable locks held:
> > exclusive sleep mutex pf task mtx (pf task mtx) r = 0 (0x8199af20) locked @
> > /tank/svn/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1589
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > kdb_backtrace() at kdb_backtrace+0x37
> > _witness_debugger() at _witness_debugger+0x2e
> > witness_warn() at witness_warn+0x2c4
> > uma_zalloc_arg() at uma_zalloc_arg+0x335
> > pfr_create_ktable() at pfr_create_ktable+0xd8
> > pfr_ina_define() at pfr_ina_define+0x12b
> > pfioctl() at pfioctl+0x1c5a
> > devfs_ioctl_f() at devfs_ioctl_f+0x7a
> > kern_ioctl() at kern_ioctl+0xcd
> > sys_ioctl() at sys_ioctl+0xfd
> > amd64_syscall() at amd64_syscall+0x3ac
> > Xfast_syscall() at Xfast_syscall+0xf7
> > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800da711c, rsp =
> > 0x7fff9d28, rbp = 0x7fffa1f0 ---
>
> --
> Pawel Jakub Dawidek http://www.wheelsystems.com
> FreeBSD committer http://www.FreeBSD.org
> Am I Evil? Yes, I Am! http://yomoli.com
Re: zfs i/o hangs on 9-PRERELEASE
On Fri, Nov 25, 2011 at 01:20:01PM -0600, Mark Felder wrote:
> 13:14:32 nas:~ > uname -a
> FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 r227971M:
> Fri Nov 25 10:07:48 CST 2011
> r...@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64
>
> This seemed to start happening sometime after RC1. I tried 8-STABLE and
> it's happening there too right now. I think whatever caused this was
> MFC'd. I've also reproduced this on completely different hardware
> running a single disk ZFS pool.
>
> I'm getting this output in dmesg after these hangs I keep seeing.

Mark, those backtraces are not related to ZFS, but to PF. Not sure if
they are at all related to your hangs. Most cases where ZFS I/O seems to
hang are hardware problems, where I/O requests are not completed.

'procstat -kk -a' output might be useful once the hang happens.

> uma_zalloc_arg: zone "pfrktable" with the following non-sleepable locks held:
> exclusive sleep mutex pf task mtx (pf task mtx) r = 0 (0x8199af20) locked @
> /tank/svn/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1589
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> _witness_debugger() at _witness_debugger+0x2e
> witness_warn() at witness_warn+0x2c4
> uma_zalloc_arg() at uma_zalloc_arg+0x335
> pfr_create_ktable() at pfr_create_ktable+0xd8
> pfr_ina_define() at pfr_ina_define+0x12b
> pfioctl() at pfioctl+0x1c5a
> devfs_ioctl_f() at devfs_ioctl_f+0x7a
> kern_ioctl() at kern_ioctl+0xcd
> sys_ioctl() at sys_ioctl+0xfd
> amd64_syscall() at amd64_syscall+0x3ac
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800da711c, rsp =
> 0x7fff9d28, rbp = 0x7fffa1f0 ---

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com
Re: zfs i/o hangs on 9-PRERELEASE
On 28.11.2011 3:21, Mark Felder wrote:
> After many hours of testing, reproducing, and testing again I've
> finally been able to narrow down what the real issue is and it's not
> ZFS as I suspected. After completely turning off all NFS functionality
> and serving my files over Samba I haven't had a single issue. It seems
> there is something going on with the new NFS code (I serve out over
> v4, but reproduced it last week with v3) and my media player box, a
> Popcorn Hour A-200 which is running Linux. If I can cobble some
> hardware together and place it between so I can do some tcpdumps I
> will provide that data so perhaps someone can understand what's going
> on. If this is due to a badly behaving client this is potentially a
> DoS on the server.
>
> Regards,
>
> Mark

Hi Mark,

judging by the output you have posted, this seems to be a pf problem.
Could you try the same scenario with pf(4) disabled?

If you are not able to reproduce this hang with pf(4) disabled, it would
be very nice to have a PR submitted.

--
Martin Matuska
FreeBSD committer
http://blog.vx.sk
Re: zfs i/o hangs on 9-PRERELEASE
After many hours of testing, reproducing, and testing again I've finally been able to narrow down what the real issue is and it's not ZFS as I suspected. After completely turning off all NFS functionality and serving my files over Samba I haven't had a single issue.

It seems there is something going on with the new NFS code (I serve out over v4, but reproduced it last week with v3) and my media player box, a Popcorn Hour A-200 which is running Linux. If I can cobble some hardware together and place it between so I can do some tcpdumps I will provide that data so perhaps someone can understand what's going on. If this is due to a badly behaving client this is potentially a DoS on the server.

Regards,

Mark
Re: zfs i/o hangs on 9-PRERELEASE
On Sat, Nov 26, 2011 at 04:47:35PM -0600, Mark Felder wrote:
> It appears that I'm mistaken about those messages then. However, this does
> happen on both my AMD x6 and Intel Atom machines with different hard drives,
> controllers, etc. I feel it would be unlikely to be hardware.
>
> Unfortunately the procstat command is probably of no use because I can't
> interact with the console or ssh for the periods of time when it is hanging
> (sometimes in excess of a minute). Zpool scrubs come up clean and I never see
> any errors reported. I've been running this hardware for 2 years and v28 for
> quite some time. It doesn't seem like it started happening until I upgraded
> to a build past RC1. I don't know where to find RC1 media and I don't know
> the svn revision of RC1 so I haven't tried.

The kernel backtrace you provided indicates a problem in pf(4), not ZFS.
What piece am I missing?

--
| Jeremy Chadwick                    jdc at parodius.com |
| Parodius Networking          http://www.parodius.com/ |
| UNIX Systems Administrator      Mountain View, CA, US |
| Making life hard for others since 1977.  PGP 4BD6C0CB |
Re: zfs i/o hangs on 9-PRERELEASE
It appears that I'm mistaken about those messages then. However, this does happen on both my AMD x6 and Intel Atom machines with different hard drives, controllers, etc. I feel it would be unlikely to be hardware.

Unfortunately the procstat command is probably of no use because I can't interact with the console or ssh for the periods of time when it is hanging (sometimes in excess of a minute). Zpool scrubs come up clean and I never see any errors reported. I've been running this hardware for 2 years and v28 for quite some time. It doesn't seem like it started happening until I upgraded to a build past RC1. I don't know where to find RC1 media and I don't know the svn revision of RC1 so I haven't tried.

Regards,

Mark
Re: zfs i/o hangs on 9-PRERELEASE
on 25/11/2011 21:20 Mark Felder said the following:
> 13:14:32 nas:~ > uname -a
> FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 r227971M: Fri Nov
> 25 10:07:48 CST 2011 r...@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64
>
> This seemed to start happening sometime after RC1. I tried 8-STABLE and it's
> happening there too right now. I think whatever caused this was MFC'd. I've
> also reproduced this on completely different hardware running a single disk
> ZFS pool.
>
> I'm getting this output in dmesg after these hangs I keep seeing.
>
> uma_zalloc_arg: zone "pfrktable" with the following non-sleepable locks held:
> exclusive sleep mutex pf task mtx (pf task mtx) r = 0 (0x8199af20)
> locked @ /tank/svn/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1589
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> _witness_debugger() at _witness_debugger+0x2e
> witness_warn() at witness_warn+0x2c4
> uma_zalloc_arg() at uma_zalloc_arg+0x335
> pfr_create_ktable() at pfr_create_ktable+0xd8
> pfr_ina_define() at pfr_ina_define+0x12b
> pfioctl() at pfioctl+0x1c5a
> devfs_ioctl_f() at devfs_ioctl_f+0x7a
> kern_ioctl() at kern_ioctl+0xcd
> sys_ioctl() at sys_ioctl+0xfd
> amd64_syscall() at amd64_syscall+0x3ac
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800da711c, rsp =
> 0x7fff9d28, rbp = 0x7fffa1f0 ---

Please note that all these messages are about pf.

--
Andriy Gapon
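The point that the backtrace concerns pf rather than ZFS can be read straight off the frame names. As a rough sketch of that reasoning, the snippet below pulls the function names out of a KDB-style backtrace and maps them to a subsystem by name prefix. The prefix table and the `subsystems` helper are assumptions made purely for illustration; real kernel symbols do not follow a strict naming scheme, so treat the result as a hint, not a diagnosis.

```python
import re

# Matches frames such as "pfioctl() at pfioctl+0x1c5a" and captures the name.
FRAME_RE = re.compile(r"(\w+)\(\) at \w+\+0x[0-9a-f]+")

# Hypothetical prefix-to-subsystem table, for illustration only.
PREFIXES = [("pf", "pf"), ("zfs_", "zfs"), ("zio_", "zfs"), ("nfs", "nfs")]


def subsystems(trace):
    """Return the subsystems whose name prefixes appear in the backtrace,
    in order of first appearance and without duplicates."""
    found = []
    for func in FRAME_RE.findall(trace):
        for prefix, name in PREFIXES:
            if func.startswith(prefix) and name not in found:
                found.append(name)
    return found


# Frames copied from the backtrace quoted in this thread.
trace = """\
uma_zalloc_arg() at uma_zalloc_arg+0x335
pfr_create_ktable() at pfr_create_ktable+0xd8
pfr_ina_define() at pfr_ina_define+0x12b
pfioctl() at pfioctl+0x1c5a
devfs_ioctl_f() at devfs_ioctl_f+0x7a
kern_ioctl() at kern_ioctl+0xcd
"""

print(subsystems(trace))
```

For the frames above, only the pf prefix matches, and no ZFS or NFS frame appears, which is consistent with the replies in this thread.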
Re: zfs i/o hangs on 9-PRERELEASE
On 25.11.2011 13:39, Freddie Cash wrote:
> There's a lot of uma_* stuff in there. Just curious, what's the
> following sysctl set to:
>
> vfs.zfs.zio.use_uma
>
> Back in the 8.x days, it was recommended to set it to 0 due to bugs:
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-June/057162.html
>
> No idea if this is still the case or not, but you may want to try
> toggling that sysctl and see if it makes a difference.

Confirmed mine is set to 0.
Re: zfs i/o hangs on 9-PRERELEASE
On Fri, Nov 25, 2011 at 11:20 AM, Mark Felder wrote:
> 13:14:32 nas:~ > uname -a
> FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 r227971M:
> Fri Nov 25 10:07:48 CST 2011
> r...@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64
>
> This seemed to start happening sometime after RC1. I tried 8-STABLE and
> it's happening there too right now. I think whatever caused this was MFC'd.
> I've also reproduced this on completely different hardware running a single
> disk ZFS pool.
>
> I'm getting this output in dmesg after these hangs I keep seeing.
>
> uma_zalloc_arg: zone "pfrktable" with the following non-sleepable locks
> held:
>

There's a lot of uma_* stuff in there. Just curious, what's the following sysctl set to:

vfs.zfs.zio.use_uma

Back in the 8.x days, it was recommended to set it to 0 due to bugs:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-June/057162.html

No idea if this is still the case or not, but you may want to try toggling that sysctl and see if it makes a difference.

--
Freddie Cash
fjwc...@gmail.com