Re: deadlkres() panic

2010-07-07 Thread Attilio Rao
2010/7/1 Bryan Venteicher bry...@daemoninthecloset.org:
 On a recent -current, I got the following panic from deadlkres:

 Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680

 Tracing pid 0 tid 100058 td 0xff00024bf7a0
 kdb_enter() at kdb_enter+0x3d
 panic() at panic+0x176
 sleepq_type() at sleepq_type+0x56
 deadlkres() at deadlkres+0x224
 fork_exit() at fork_exit+0x12a
 fork_trampoline() at fork_trampoline+0xe
 --- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 ---
 (Hand transcribed, doadump() hung)

 deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not on a
 sleepqueue (i.e., td->td_wchan == NULL).

 I don't think this is an invalid state for a thread to be in: After adding itself
 to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig().
 sleepq_catch_signals() determines there is a signal pending so it removes the
 thread from the sleepq via sleepq_resume_thread(). Returning to
 sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is
 unable to cancel the timeout because it is already firing (likely waiting on
 thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch().
 deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() &&
 !TD_ON_SLEEPQ().

 The attached patch takes care of the panic for me.

I think that your analysis and patch are both fine and are committed,
along with a small cleanup, as r209761.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: deadlkres() panic

2010-07-07 Thread Marius Nünnerich
On Wed, Jul 7, 2010 at 14:01, Attilio Rao atti...@freebsd.org wrote:
 I think that your analysis and patch are both fine and are committed,
 along with a small cleanup, as r209761.

Thank you both, I guess I had that panic a few days ago. Updating right now.


deadlkres() panic

2010-07-01 Thread Bryan Venteicher
On a recent -current, I got the following panic from deadlkres:

Assertion wchan != NULL failed at /usr/src-nfs/sys/kern/subr_sleepqueue.c:680

Tracing pid 0 tid 100058 td 0xff00024bf7a0
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x176
sleepq_type() at sleepq_type+0x56
deadlkres() at deadlkres+0x224
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff8074976d30, rbp = 0 ---
(Hand transcribed, doadump() hung)

deadlkres() came across a TD_IS_SLEEPING()'ing thread that was not on a
sleepqueue (i.e., td->td_wchan == NULL).

I don't think this is an invalid state for a thread to be in: After adding itself
to a sleepq and setting a timeout, the thread calls sleepq_timedwait_sig().
sleepq_catch_signals() determines there is a signal pending so it removes the
thread from the sleepq via sleepq_resume_thread(). Returning to
sleepq_timedwait_sig(), in the call to sleepq_check_timeout(), the thread is
unable to cancel the timeout because it is already firing (likely waiting on
thread_lock()). So the thread calls TD_SET_SLEEPING() followed by mi_switch().
deadlkres() then picks up thread_lock(), finding td is TD_IS_SLEEPING() &&
!TD_ON_SLEEPQ().
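
The state described above can be modelled in userland. What follows is only a minimal sketch, not kernel code: struct model_thread and every model_* function are hypothetical names invented for illustration, and the sketch assumes, following the report's own equivalence, that "on a sleepqueue" means a non-NULL wait channel. It shows why a check on the sleeping flag alone selects a thread that a wchan != NULL assertion would then reject, while a check that also requires the thread to be on a sleepqueue skips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model of the two pieces of thread state involved. */
struct model_thread {
	bool  sleeping;	/* stands in for TD_IS_SLEEPING(td) */
	void *wchan;	/* stands in for td->td_wchan; NULL when off the sleepqueue */
};

/* Stands in for TD_ON_SLEEPQ(td): assumed true iff a wait channel is set. */
static bool
model_on_sleepq(const struct model_thread *td)
{
	return (td->wchan != NULL);
}

/* Unguarded selection: picks any sleeping thread, even with a NULL wchan. */
static bool
model_inspect_unguarded(const struct model_thread *td)
{
	return (td->sleeping);
}

/* Guarded selection: also requires the thread to be on a sleepqueue. */
static bool
model_inspect_guarded(const struct model_thread *td)
{
	return (td->sleeping && model_on_sleepq(td));
}

/* The state from the report: marked sleeping, already off the sleepqueue. */
static struct model_thread
model_racing_thread(void)
{
	struct model_thread td = { .sleeping = true, .wchan = NULL };

	return (td);
}
```

For the thread returned by model_racing_thread(), model_inspect_unguarded() returns true and model_inspect_guarded() returns false, so only the unguarded form would go on to look up a sleepqueue for a NULL wait channel.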

The attached patch takes care of the panic for me.

--- /usr/src-nfs/sys/kern/kern_clock.c	2010-06-30 03:38:25.0 -0500
+++ kern_clock.c	2010-07-01 02:19:39.048697991 -0500
@@ -232,7 +232,8 @@
 	panic("%s: possible deadlock detected for %p, blocked for %d ticks\n",
 		__func__, td, tticks);
 	}
-} else if (TD_IS_SLEEPING(td)) {
+} else if (TD_IS_SLEEPING(td) &&
+TD_ON_SLEEPQ(td)) {
 
 	/* Handle ticks wrap-up. */
 	if (ticks < td->td_blktick) {
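
The "ticks wrap-up" comparison in the context lines above can be illustrated in isolation. This is a rough sketch only: model_blocked_too_long() and its threshold parameter are hypothetical and are not taken from kern_clock.c. It assumes the intent is to skip a thread whose recorded block time is later than the current tick count (the counter has wrapped), and otherwise to compare the elapsed ticks with a limit.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a thread has been blocked for longer
 * than `threshold` ticks.  `ticks` is the current tick count and `blktick`
 * the tick count recorded when the thread blocked.
 */
static bool
model_blocked_too_long(int ticks, int blktick, int threshold)
{
	/* Counter wrapped since the thread blocked: skip rather than misjudge. */
	if (ticks < blktick)
		return (false);

	return (ticks - blktick > threshold);
}
```

With a threshold of 10, a thread blocked at tick 100 is reported at tick 200, is not reported at tick 105, and is skipped entirely if the current tick count reads 5.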