Re: deadlkres() panic
2010/7/1 Bryan Venteicher bry...@daemoninthecloset.org: On a recent -current, I got the following panic from deadlkres: Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680 Tracing pid 0 tid 100058 td 0xff00024bf7a0 kdb_enter() at kdb_enter+0x3d panic() at panic+0x176 sleepq_type() at sleepq_type+0x56 deadlkres() at deadlkres+0x224 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 --- (Hand transcribed, doadump() hung) deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not a sleepqueue (ie, td-td_wchan == NULL). I don't think this is an invalid state for thread to be in: After adding itself to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig(). sleepq_catch_signals() determines there is a signal pending so it removes the thread from the sleepq via sleepq_resume_thread(). Returning to sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is unable to cancel the timeout because it is already firing (likely waiting on thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch(). deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() !TD_ON_SLEEPQ(). The attached patch takes care of the panic for me. I think that your analysis and patch are both fine and are committed, along with a small cleanup, as r209761. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: deadlkres() panic
On Wed, Jul 7, 2010 at 14:01, Attilio Rao atti...@freebsd.org wrote: 2010/7/1 Bryan Venteicher bry...@daemoninthecloset.org: On a recent -current, I got the following panic from deadlkres: Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680 Tracing pid 0 tid 100058 td 0xff00024bf7a0 kdb_enter() at kdb_enter+0x3d panic() at panic+0x176 sleepq_type() at sleepq_type+0x56 deadlkres() at deadlkres+0x224 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 --- (Hand transcribed, doadump() hung) deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not a sleepqueue (ie, td-td_wchan == NULL). I don't think this is an invalid state for thread to be in: After adding itself to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig(). sleepq_catch_signals() determines there is a signal pending so it removes the thread from the sleepq via sleepq_resume_thread(). Returning to sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is unable to cancel the timeout because it is already firing (likely waiting on thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch(). deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() !TD_ON_SLEEPQ(). The attached patch takes care of the panic for me. I think that your analysis and patch are both fine and are committed, along with a small cleanup, as r209761. Thank you both, I guess a had that panic a few days ago. Updating right now. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
deadlkres() panic
On a recent -current, I got the following panic from deadlkres: Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680 Tracing pid 0 tid 100058 td 0xff00024bf7a0 kdb_enter() at kdb_enter+0x3d panic() at panic+0x176 sleepq_type() at sleepq_type+0x56 deadlkres() at deadlkres+0x224 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 --- (Hand transcribed, doadump() hung) deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not a sleepqueue (ie, td-td_wchan == NULL). I don't think this is an invalid state for thread to be in: After adding itself to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig(). sleepq_catch_signals() determines there is a signal pending so it removes the thread from the sleepq via sleepq_resume_thread(). Returning to sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is unable to cancel the timeout because it is already firing (likely waiting on thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch(). deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() !TD_ON_SLEEPQ(). The attached patch takes care of the panic for me.--- /usr/src-nfs/sys/kern/kern_clock.c 2010-06-30 03:38:25.0 -0500 +++ kern_clock.c 2010-07-01 02:19:39.048697991 -0500 @@ -232,7 +232,8 @@ panic(%s: possible deadlock detected for %p, blocked for %d ticks\n, __func__, td, tticks); } -} else if (TD_IS_SLEEPING(td)) { +} else if (TD_IS_SLEEPING(td) +TD_ON_SLEEPQ(td)) { /* Handle ticks wrap-up. */ if (ticks td-td_blktick) { ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org