Re: ZFS panic with concurrent recv and read-heavy workload
On Fri, Jun 03, 2011 at 03:03:56AM -0400, Nathaniel W Filardo wrote:
> I just got this on another machine; no heavy workload was needed, just
> booting and starting some jails. Of interest, perhaps, both this and the
> machine triggering the below panic are SMP V240s with 1.5GHz CPUs (though I
> will confess that the machine in the original report may have had bad RAM).
> I have run a UP 1.2GHz V240 for months and never seen this panic.
>
> This time the kernel is
> FreeBSD 9.0-CURRENT #9: Fri Jun 3 02:32:13 EDT 2011
> csup'd immediately before building. The full panic this time is
>
> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> l2arc_feed_thread() at l2arc_feed_thread+0xeac
> fork_exit() at fork_exit+0x9c
> fork_trampoline() at fork_trampoline+0x8
> SC Alert: SC Request to send Break to host.
> KDB: enter: Line break on console
> [ thread pid 27 tid 100121 ]
> Stopped at kdb_enter+0x80: ta %xcc, 1
> db reset
> ttiimmeeoouutt sshhuuiinngg ddoowwnn CCPPUUss..
>
> Half of the memory in this machine is new (well, came with the machine) and
> half is from the aforementioned UP V240 which seemed to work fine (I was
> attempting an upgrade when this happened); none of it (or indeed any of the
> hardware save the disk controller and disks) is common between this and the
> machine reporting below.
>
> Thoughts? Any help would be greatly appreciated.
>
> Thanks.
> --nwf;
>
> On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
> > [...]
> > panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
> > cpuid = 1
> > KDB: stack backtrace:
> > panic() at panic+0x1c8
> > _sx_assert() at _sx_assert+0xc4
> > _sx_xunlock() at _sx_xunlock+0x98
> > arc_evict() at arc_evict+0x614
> > arc_get_data_buf() at arc_get_data_buf+0x360
> > arc_buf_alloc() at arc_buf_alloc+0x94
> > dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> > dmu_write() at dmu_write+0xec
> > dmu_recv_stream() at dmu_recv_stream+0x8a8
> > zfs_ioc_recv() at zfs_ioc_recv+0x354
> > zfsdev_ioctl() at zfsdev_ioctl+0xe0
> > devfs_ioctl_f() at devfs_ioctl_f+0xe8
> > kern_ioctl() at kern_ioctl+0x294
> > ioctl() at ioctl+0x198
> > syscallenter() at syscallenter+0x270
> > syscall() at syscall+0x74
> > -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> > userland() at 0x40e72cc8
> > user trace: trap %o7=0x40c13e24
> > pc 0x40e72cc8, sp 0x7fd4641
> > pc 0x40c158f4, sp 0x7fd4721
> > pc 0x40c1e878, sp 0x7fd47f1
> > pc 0x40c1ce54, sp 0x7fd8b01
> > pc 0x40c1dbe0, sp 0x7fd9431
> > pc 0x40c1f718, sp 0x7fdd741
> > pc 0x10731c, sp 0x7fdd831
> > pc 0x10c90c, sp 0x7fdd8f1
> > pc 0x103ef0, sp 0x7fde1d1
> > pc 0x4021aff4, sp 0x7fde291
> > done
> > [...]

Apparently this is a locking issue in the ARC code; the ZFS people should be
able to help you.

Marius

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: ZFS panic with concurrent recv and read-heavy workload
I just got this on another machine; no heavy workload was needed, just
booting and starting some jails. Of interest, perhaps, both this and the
machine triggering the below panic are SMP V240s with 1.5GHz CPUs (though I
will confess that the machine in the original report may have had bad RAM).
I have run a UP 1.2GHz V240 for months and never seen this panic.

This time the kernel is
FreeBSD 9.0-CURRENT #9: Fri Jun 3 02:32:13 EDT 2011
csup'd immediately before building. The full panic this time is

panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
cpuid = 1
KDB: stack backtrace:
panic() at panic+0x1c8
_sx_assert() at _sx_assert+0xc4
_sx_xunlock() at _sx_xunlock+0x98
l2arc_feed_thread() at l2arc_feed_thread+0xeac
fork_exit() at fork_exit+0x9c
fork_trampoline() at fork_trampoline+0x8
SC Alert: SC Request to send Break to host.
KDB: enter: Line break on console
[ thread pid 27 tid 100121 ]
Stopped at kdb_enter+0x80: ta %xcc, 1
db reset
ttiimmeeoouutt sshhuuiinngg ddoowwnn CCPPUUss..

Half of the memory in this machine is new (well, came with the machine) and
half is from the aforementioned UP V240 which seemed to work fine (I was
attempting an upgrade when this happened); none of it (or indeed any of the
hardware save the disk controller and disks) is common between this and the
machine reporting below.

Thoughts? Any help would be greatly appreciated.

Thanks.
--nwf;

On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
> [...]
> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> arc_evict() at arc_evict+0x614
> arc_get_data_buf() at arc_get_data_buf+0x360
> arc_buf_alloc() at arc_buf_alloc+0x94
> dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> dmu_write() at dmu_write+0xec
> dmu_recv_stream() at dmu_recv_stream+0x8a8
> zfs_ioc_recv() at zfs_ioc_recv+0x354
> zfsdev_ioctl() at zfsdev_ioctl+0xe0
> devfs_ioctl_f() at devfs_ioctl_f+0xe8
> kern_ioctl() at kern_ioctl+0x294
> ioctl() at ioctl+0x198
> syscallenter() at syscallenter+0x270
> syscall() at syscall+0x74
> -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> userland() at 0x40e72cc8
> user trace: trap %o7=0x40c13e24
> pc 0x40e72cc8, sp 0x7fd4641
> pc 0x40c158f4, sp 0x7fd4721
> pc 0x40c1e878, sp 0x7fd47f1
> pc 0x40c1ce54, sp 0x7fd8b01
> pc 0x40c1dbe0, sp 0x7fd9431
> pc 0x40c1f718, sp 0x7fdd741
> pc 0x10731c, sp 0x7fdd831
> pc 0x10c90c, sp 0x7fdd8f1
> pc 0x103ef0, sp 0x7fde1d1
> pc 0x4021aff4, sp 0x7fde291
> done
> [...]
ZFS panic with concurrent recv and read-heavy workload
When racing two workloads, one doing

    zfs recv -v -d testpool

and the other

    find /testpool -type f -print0 | xargs -0 sha1

I can (seemingly reliably) trigger this panic:

panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
cpuid = 1
KDB: stack backtrace:
panic() at panic+0x1c8
_sx_assert() at _sx_assert+0xc4
_sx_xunlock() at _sx_xunlock+0x98
arc_evict() at arc_evict+0x614
arc_get_data_buf() at arc_get_data_buf+0x360
arc_buf_alloc() at arc_buf_alloc+0x94
dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
dmu_write() at dmu_write+0xec
dmu_recv_stream() at dmu_recv_stream+0x8a8
zfs_ioc_recv() at zfs_ioc_recv+0x354
zfsdev_ioctl() at zfsdev_ioctl+0xe0
devfs_ioctl_f() at devfs_ioctl_f+0xe8
kern_ioctl() at kern_ioctl+0x294
ioctl() at ioctl+0x198
syscallenter() at syscallenter+0x270
syscall() at syscall+0x74
-- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
userland() at 0x40e72cc8
user trace: trap %o7=0x40c13e24
pc 0x40e72cc8, sp 0x7fd4641
pc 0x40c158f4, sp 0x7fd4721
pc 0x40c1e878, sp 0x7fd47f1
pc 0x40c1ce54, sp 0x7fd8b01
pc 0x40c1dbe0, sp 0x7fd9431
pc 0x40c1f718, sp 0x7fdd741
pc 0x10731c, sp 0x7fdd831
pc 0x10c90c, sp 0x7fdd8f1
pc 0x103ef0, sp 0x7fde1d1
pc 0x4021aff4, sp 0x7fde291
done

The machine is a freshly installed and built sparc64 2-way SMP, running
today's -CURRENT with
http://people.freebsd.org/~mm/patches/zfs/zfs_ioctl_compat_bugfix.patch
applied. Of note, it has only 1G of RAM in it, so kmem_max = 512M.

Thoughts? More information? Thanks in advance.

--nwf;
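
For anyone wanting to try the race described in the original report, here is a rough sketch of how the two workloads might be started side by side. The two commands and the pool name testpool are taken from the report; everything else is an assumption made for illustration: the report does not say where the receive stream comes from, so STREAM is a hypothetical file holding `zfs send` output, and the DRY_RUN switch and the race function name are invented here. By default the script only prints the commands, since running them for real may panic the machine.

```sh
#!/bin/sh
# Sketch only: start the two workloads from the report concurrently.
# POOL is from the report; STREAM and DRY_RUN are hypothetical additions.
POOL="${POOL:-testpool}"
STREAM="${STREAM:-/var/tmp/stream.zfs}"  # hypothetical file with `zfs send` output
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print the commands, do not run them

race() {
    if [ "$DRY_RUN" = 1 ]; then
        # Show what would be run without touching the pool.
        echo "zfs recv -v -d $POOL < $STREAM"
        echo "find /$POOL -type f -print0 | xargs -0 sha1"
        return 0
    fi

    # Workload 1: receive a stream into the pool.
    zfs recv -v -d "$POOL" < "$STREAM" &
    recv_pid=$!

    # Workload 2: read-heavy pass that checksums every file in the pool.
    find "/$POOL" -type f -print0 | xargs -0 sha1 > /dev/null &
    read_pid=$!

    # Wait for both; on an affected machine the panic would occur before this returns.
    wait "$recv_pid"
    wait "$read_pid"
}

race
```

Setting DRY_RUN=0 runs both commands in the background and waits for them. Only do that on a test pool, on a machine whose console you can reach.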