Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-04 Thread Nick Rogers
On Mon, Mar 4, 2019 at 5:29 PM Andriy Gapon wrote:

> On 04/03/2019 22:35, Nick Rogers wrote:
> > v_lock = {lock_object = {lo_name =
> > 0x8144af45 "zfs", lo_flags = 117112840, lo_data = 0, lo_witness =
> > 0x0}, lk_lock = 18446744073709551605, lk_exslpfail = 0, lk_timo = 51,
> > lk_pri = 96}
>
> Hmm, lk_lock looks bogus.
> 18446744073709551605 == 0xfffffffffffffff5 and it's LK_SHARE |
> LK_EXCLUSIVE_WAITERS with 0xfffffffffffffff shared owners.
> Perhaps, this is a result of LK_SHARERS_LOCK(-1).
>
> Is your kernel compiled with INVARIANTS and INVARIANT_SUPPORT?
> I suspect that the vnode was accessed (unlocked?) through a stale pointer
> after
> it was recycled.
>

I don't believe so - it's basically amd64 GENERIC with a reduced set of
modules and ZFS compiled statically into the kernel.


> --
> Andriy Gapon
>


Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-04 Thread Andriy Gapon
On 04/03/2019 22:35, Nick Rogers wrote:
> v_lock = {lock_object = {lo_name =
> 0x8144af45 "zfs", lo_flags = 117112840, lo_data = 0, lo_witness =
> 0x0}, lk_lock = 18446744073709551605, lk_exslpfail = 0, lk_timo = 51,
> lk_pri = 96}

Hmm, lk_lock looks bogus.
18446744073709551605 == 0xfffffffffffffff5 and it's LK_SHARE |
LK_EXCLUSIVE_WAITERS with 0xfffffffffffffff shared owners.
Perhaps, this is a result of LK_SHARERS_LOCK(-1).
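
A minimal userland sketch of that decoding (the bit layout is copied by hand
from sys/sys/lockmgr.h, so treat the constants as assumptions and check them
against your tree):

/*
 * Decode an lk_lock word the way lockmgr does: low bits are flags,
 * the sharer count sits above them.
 */
#include <inttypes.h>
#include <stdio.h>

#define LK_SHARE		0x01ULL	/* lock is held in shared mode */
#define LK_SHARED_WAITERS	0x02ULL	/* threads queued for a shared lock */
#define LK_EXCLUSIVE_WAITERS	0x04ULL	/* threads queued for the exclusive lock */
#define LK_FLAGMASK		0x0fULL	/* all flag bits below the sharer count */
#define LK_SHARERS_SHIFT	4

int
main(void)
{
	uint64_t lk = 18446744073709551605ULL;	/* == 0xfffffffffffffff5 */

	printf("lk_lock              = %#" PRIx64 "\n", lk);
	printf("LK_SHARE             = %d\n", (lk & LK_SHARE) != 0);
	printf("LK_SHARED_WAITERS    = %d\n", (lk & LK_SHARED_WAITERS) != 0);
	printf("LK_EXCLUSIVE_WAITERS = %d\n", (lk & LK_EXCLUSIVE_WAITERS) != 0);
	printf("shared owners        = %#" PRIx64 "\n",
	    (lk & ~LK_FLAGMASK) >> LK_SHARERS_SHIFT);
	return (0);
}

That prints a sharer count of 0xfffffffffffffff, i.e. the count field is all
ones, which is exactly what an underflow to LK_SHARERS_LOCK(-1) would look like.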

Is your kernel compiled with INVARIANTS and INVARIANT_SUPPORT?
I suspect that the vnode was accessed (unlocked?) through a stale pointer after
it was recycled.
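
Purely to illustrate what those options buy (this is not code from the tree,
the function and messages are made up): with INVARIANTS and INVARIANT_SUPPORT
compiled in, VNASSERT()/KASSERT() style checks in the vnode paths turn a use of
a stale or recycled vnode into an immediate, attributable panic instead of a
silently corrupted lk_lock:

/*
 * Illustration only -- not actual code from the tree.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/vnode.h>

static void
example_vnode_sanity(struct vnode *vp)
{
	VNASSERT(vp->v_usecount > 0, vp,
	    ("vnode %p used without a reference", vp));
	VNASSERT(vp->v_type != VBAD, vp,
	    ("vnode %p has already been recycled", vp));
}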

-- 
Andriy Gapon


Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-04 Thread Nick Rogers
On Sat, Mar 2, 2019 at 12:48 PM Andriy Gapon wrote:

> On 01/03/2019 17:00, Nick Rogers wrote:
> > 36704 101146 perl-   mi_switch+0xe1
> > sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_xlock_hard+0x19c
> VOP_LOCK1_APV+0x7e
> > _vn_lock+0x40 zfs_znode_alloc+0x434 zfs_mknode+0xa9d
> > zfs_freebsd_create+0x512 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
> > kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101
>
> I suspect that this thread is a root cause of the problem.
> In this place, the vnode should be freshly created and not visible to
> anything
> but the current thread.  So, vn_lock() should always immediately succeed.
> I
> cannot understand how the vnode lock could be held by another thread.
>

It happened again. I tried to get a backtrace from the offending thread and
one of the others waiting for it. At the moment I have access to this
particular system in its bad state and can leave it like this for as long
as possible, so let me know if there's something else useful I can get out
of the debugger.

courtland# procstat -kka | grep zfs
0 100140 kernel  zfsvfs  mi_switch+0xe1
sleepq_wait+0x2c _sleep+0x237 taskqueue_thread_loop+0xf1 fork_exit+0x83
fork_trampoline+0xe
0 100424 kernel  zfs_vn_rele_taskq   mi_switch+0xe1
sleepq_wait+0x2c _sleep+0x237 taskqueue_thread_loop+0xf1 fork_exit+0x83
fork_trampoline+0xe
   23 100119 zfskern arc_reclaim_thread  mi_switch+0xe1
sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a arc_reclaim_thread+0x146
fork_exit+0x83 fork_trampoline+0xe
   23 100120 zfskern arc_dnlc_evicts_thr mi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 arc_dnlc_evicts_thread+0x16f fork_exit+0x83
fork_trampoline+0xe
   23 100122 zfskern dbuf_evict_thread   mi_switch+0xe1
sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a dbuf_evict_thread+0x1c8
fork_exit+0x83 fork_trampoline+0xe
   23 100139 zfskern l2arc_feed_thread   mi_switch+0xe1
sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a l2arc_feed_thread+0x219
fork_exit+0x83 fork_trampoline+0xe
   23 100405 zfskern trim zroot  mi_switch+0xe1
sleepq_timedwait+0x2f _cv_timedwait_sbt+0x17a trim_thread+0x11f
fork_exit+0x83 fork_trampoline+0xe
   23 100441 zfskern txg_thread_entermi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 txg_quiesce+0x21b txg_quiesce_thread+0x11b
fork_exit+0x83 fork_trampoline+0xe
   23 100442 zfskern txg_thread_entermi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 txg_sync_thread+0x13b fork_exit+0x83
fork_trampoline+0xe
   23 100443 zfskern solthread 0xfff mi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0xcc fork_exit+0x83
fork_trampoline+0xe
   23 100444 zfskern solthread 0xfff mi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 zthr_procedure+0xcc fork_exit+0x83
fork_trampoline+0xe
 7476 100751 postgres-   mi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 dmu_tx_wait+0x2eb dmu_tx_assign+0x48
zfs_freebsd_create+0x4c8 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101
 7480 100527 postgres-   mi_switch+0xe1
sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_xlock_hard+0x19c VOP_LOCK1_APV+0x7e
_vn_lock+0x40 zfs_znode_alloc+0x434 zfs_mknode+0xa9d
zfs_freebsd_create+0x512 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101
46101 100471 postgres-   mi_switch+0xe1
sleepq_wait+0x2c _cv_wait+0x152 dmu_tx_wait+0x2eb dmu_tx_assign+0x48
zfs_freebsd_create+0x4c8 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101
52625 100488 perl-   mi_switch+0xe1
sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_slock_hard+0x2c5 VOP_LOCK1_APV+0x7e
_vn_lock+0x40 zfs_root+0x6d lookup+0x933 namei+0x44b kern_statat+0x77
sys_fstatat+0x2f amd64_syscall+0x369 fast_syscall_common+0x101
52675 100643 csh -   mi_switch+0xe1
sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_slock_hard+0x2c5 VOP_LOCK1_APV+0x7e
_vn_lock+0x40 zfs_root+0x6d lookup+0x933 namei+0x44b kern_statat+0x77
sys_fstatat+0x2f amd64_syscall+0x369 fast_syscall_common+0x101
52826 100562 ls  -   mi_switch+0xe1
sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_slock_hard+0x2c5 VOP_LOCK1_APV+0x7e
_vn_lock+0x40 zfs_root+0x6d lookup+0x933 namei+0x44b kern_statat+0x77
sys_fstatat+0x2f amd64_syscall+0x369 fast_syscall_common+0x101
52889 100641 bash-   mi_switch+0xe1
sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_slock_hard+0x2c5 VOP_LOCK1_APV+0x7e
_vn_lock+0x40 zfs_root+0x6d lookup+0x933 namei+0x44b kern_statat+0x77
sys_fstatat+0x2f amd64_syscall+0x369 fast_syscall_common+0x101
courtland# kgdb
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU 

Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-04 Thread Nick Rogers
Thanks for the insight. It does appear that in every instance of this
problem there is one thread stuck in zfs_znode_alloc. Unfortunately
it's always a different application (e.g., perl, sh, postgres). I will post
more information in the bug.

On Sat, Mar 2, 2019 at 12:48 PM Andriy Gapon wrote:

> On 01/03/2019 17:00, Nick Rogers wrote:
> > 36704 101146 perl-   mi_switch+0xe1
> > sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_xlock_hard+0x19c
> VOP_LOCK1_APV+0x7e
> > _vn_lock+0x40 zfs_znode_alloc+0x434 zfs_mknode+0xa9d
> > zfs_freebsd_create+0x512 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
> > kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101
>
> I suspect that this thread is a root cause of the problem.
> In this place, the vnode should be freshly created and not visible to
> anything
> but the current thread.  So, vn_lock() should always immediately succeed.
> I
> cannot understand how the vnode lock could be held by another thread.
>
> --
> Andriy Gapon
>


Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-04 Thread Nick Rogers
On Sat, Mar 2, 2019 at 5:27 PM Peter Avalos via freebsd-stable <freebsd-stable@freebsd.org> wrote:

>
> > On Mar 1, 2019, at 7:00 AM, Nick Rogers wrote:
> >
> > I am hoping someone can help me figure out if this is a legitimate bug,
> or
> > something already fixed in 12-STABLE. I wish I could reproduce it
> reliably
> > to try against STABLE, but there doesn't appear to be any related ZFS
> fixes
> > not in RELEASE. Thanks.
> >
>
> I have also experienced this problem, but I haven’t been able to
> troubleshoot it at all.
>

I've opened a bug report, so if you have any more information about how it
is affecting you, it may be helpful to share it here.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236220


>
> Peter


Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-02 Thread Peter Avalos via freebsd-stable

> On Mar 1, 2019, at 7:00 AM, Nick Rogers wrote:
> 
> I am hoping someone can help me figure out if this is a legitimate bug, or
> something already fixed in 12-STABLE. I wish I could reproduce it reliably
> to try against STABLE, but there doesn't appear to be any related ZFS fixes
> not in RELEASE. Thanks.
> 

I have also experienced this problem, but I haven’t been able to troubleshoot 
it at all.

Peter


Re: 12.0-RELEASE zfs/vnode deadlock issue

2019-03-02 Thread Andriy Gapon
On 01/03/2019 17:00, Nick Rogers wrote:
> 36704 101146 perl-   mi_switch+0xe1
> sleepq_wait+0x2c sleeplk+0x1c5 lockmgr_xlock_hard+0x19c VOP_LOCK1_APV+0x7e
> _vn_lock+0x40 zfs_znode_alloc+0x434 zfs_mknode+0xa9d
> zfs_freebsd_create+0x512 VOP_CREATE_APV+0x78 vn_open_cred+0x2c9
> kern_openat+0x20c amd64_syscall+0x369 fast_syscall_common+0x101

I suspect that this thread is a root cause of the problem.
In this place, the vnode should be freshly created and not visible to anything
but the current thread.  So, vn_lock() should always immediately succeed.  I
cannot understand how the vnode lock could be held by another thread.
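
For reference, the idiom being described, locking a vnode that nothing else can
see yet, looks roughly like the sketch below. This is the generic VFS pattern,
not the actual zfs_znode_alloc() code, and the names are illustrative:

/*
 * Sketch of the generic "fresh vnode" pattern, not the real ZFS code.
 * Until the vnode is attached to the mount and hashed, only this thread
 * holds a pointer to it, so the vn_lock() should never have to sleep.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/lockmgr.h>
#include <sys/mount.h>
#include <sys/vnode.h>

static int
example_alloc_vnode(struct mount *mp, struct vop_vector *vops,
    struct vnode **vpp)
{
	struct vnode *vp;
	int error;

	error = getnewvnode("example", mp, vops, &vp);
	if (error != 0)
		return (error);
	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);	/* expected to succeed at once */
	error = insmntque(vp, mp);	/* attach to mount; cleans up vp on failure */
	if (error != 0)
		return (error);
	/* only after this point can other threads reach the vnode */
	*vpp = vp;
	return (0);
}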

-- 
Andriy Gapon


12.0-RELEASE zfs/vnode deadlock issue

2019-03-01 Thread Nick Rogers
Recently a number of my production 12.0 systems have experienced what I can
only gather is a ZFS deadlock related to vnodes. It seems similar to the
relatively recent FreeBSD-EN-18:18.zfs (ZFS vnode reclaim deadlock)
problem. Previously the same systems were running 11.1-RELEASE without
problem.

Threads are always stuck with the stack around
vn_lock->zfs_root->lookup->namei. When the system is in this state, a
simple `ls /` or `ls /tmp` always hangs, but other datasets seem
unaffected. I have a fairly straightforward ZFS root setup on a single pool
with one SSD. The workload is a ruby/rails/nginx/postgresql backed web
application combined with some data warehousing and other periodic tasks.

Sometimes I can SSH in remotely; other times that fails because the user's
shell fails to load, though I can still run commands via `ssh ... command`.
Sometimes the system is not accessible remotely at all, or it eventually
becomes inaccessible if left long enough in this state. I always have to
physically reboot the device because the shutdown procedure fails. The
network stack (e.g. ping) seems to work completely fine whilst this is
going on, until you try to interact with or spawn a process that hits the
deadlock.

As with previous similar ZFS deadlock issues, increasing kern.maxvnodes seems
to make the system last longer, by up to a few weeks, but it is still a
band-aid. However, I have yet to see vnode usage actually get close to the
maximum.
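
A small userland sketch for keeping an eye on that, assuming the usual sysctl
names kern.maxvnodes and vfs.numvnodes:

/*
 * Print current vnode usage against the configured maximum.  Both values
 * are read into unsigned longs, which is wide enough whichever integer
 * type the kernel exports them as.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <err.h>
#include <stdio.h>

static unsigned long
read_ul_sysctl(const char *name)
{
	unsigned long val = 0;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) == -1)
		err(1, "sysctlbyname(%s)", name);
	return (val);
}

int
main(void)
{
	unsigned long max = read_ul_sysctl("kern.maxvnodes");
	unsigned long cur = read_ul_sysctl("vfs.numvnodes");

	printf("vnodes: %lu / %lu (%.1f%%)\n", cur, max,
	    max ? 100.0 * cur / max : 0.0);
	return (0);
}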

I haven't had any luck reproducing this reliably, but eventually it happens
after a few days or a few weeks... I managed to connect to a system in this
state and grab a procstat and get (hopefully) something useful out of kgdb.
I will note that although I was able to install debug symbols, I couldn't
manage to get the source files onto it for kgdb purposes before I lost SSH
access.

I am hoping someone can help me figure out whether this is a legitimate bug or
something already fixed in 12-STABLE. I wish I could reproduce it reliably
to try against STABLE, but there don't appear to be any related ZFS fixes in
STABLE that aren't already in RELEASE. Thanks.

Below is an abbreviated procstat and what I was able to get out of kgdb for
an affected thread. Note that the thread backtrace is from a simple `ls`
command. The procstat dump below is truncated because my last attempt to
send this was rejected by this list for being too long - so a number of
sh/cron processes and some zfs threads in a hung state were removed.

ld# kgdb
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from
/usr/lib/debug//boot/kernel/kernel.debug...done.
done.
sched_switch (td=0xf8002452a000, newtd=0xf80003625580,
flags=)
at /usr/src/sys/kern/sched_ule.c:2112
2112 /usr/src/sys/kern/sched_ule.c: No such file or directory.
(kgdb) tid 102023
(kgdb) bt
#0  sched_switch (td=0xf801a83dc580, newtd=0xf80003550580,
flags=)
at /usr/src/sys/kern/sched_ule.c:2112
#1  0x80d0e0a1 in mi_switch (flags=, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:439
#2  0x80d5c80c in sleepq_wait (wchan=,
pri=)
at /usr/src/sys/kern/subr_sleepqueue.c:692
#3  0x80cd9105 in sleeplk (lk=0xf800247307e8, flags=, ilk=,
wmesg=, pri=, timo=51, queue=1) at
/usr/src/sys/kern/kern_lock.c:300
#4  0x80cd7f85 in lockmgr_slock_hard (lk=,
flags=, ilk=,
file=, line=0, lwa=) at
/usr/src/sys/kern/kern_lock.c:646
#5  0x813acc5e in VOP_LOCK1_APV (vop=,
a=0xfe00f89dd450) at vnode_if.c:2087
#6  0x80de2820 in VOP_LOCK1 (vp=0xf80024730780, flags=2105344,
file=0x814d4f74
"/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c",
line=2074)
at ./vnode_if.h:859
#7  _vn_lock (vp=0xf80024730780, flags=2105344,
file=0x814d4f74
"/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c",
line=2074)
at /usr/src/sys/kern/vfs_vnops.c:1533
#8  0x8049f68d in zfs_root (vfsp=, flags=2105344,
vpp=0xfe00f89dd558)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:2074
#9  0x80dc5d43 in lookup (ndp=0xfe00f89dd780) at
/usr/src/sys/kern/vfs_lookup.c:961
#10 0x80dc4f9b in namei (ndp=0xfe00f89dd780) at
/usr/src/sys/kern/vfs_lookup.c:444
#11 0x80ddc637 in