Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread O. Hartmann
On 12/20/11 16:55, Robert Watson wrote:
> On Tue, 20 Dec 2011, Attilio Rao wrote:
> 
>> As we are here, however, I have a question for Robert here: do you
>> think we should support the _ddb() variant of options even in the case
>> DDB is not enabled in the kernel?
> 
> It's possible that _ddb() should be spelled _unlocked(), or perhaps
> _debug(), but neither really suggests what the name should actually
> imply: using it is safe only in a marginal (debugging) sense, and not in
> a production code sense. One might also reasonable call them
> stack_foo_dontusethis().
> 
> The _ddb() variants are used in at least two not strictly DDB cases:
> redzone support, and Solaris memory allocation.  And, I guess, the
> current lock debugging case that we're talking about now, but I'm not
> sure if those debugging features specifically require DDB in the kernel
> themselves?
> 
> Robert


Sorry, I have to come back to the topic of my hijacked thread ;-)

I realized that the problem occured when I enabled, just for fun, some
features in the kernel config file, in particular were these

device  ipmi
device  smbios
device  vdp
device  tpm

My box in question does not have any IPMI facility, nor does it have a
TPM chip or vdp. So I commented out vdp and tpm and the problem seems to
be gone.

I didn't find any manpage for smbios, the NOTES in amd64 says, it is for
DMI informations, but dmidecide works also without explicitely enabling
smbios in the kernel.

If someone wish that I have to perform another try with those modules
enabled, let me know.

oh



signature.asc
Description: OpenPGP digital signature


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread mdf
On Tue, Dec 20, 2011 at 6:32 AM, John Baldwin  wrote:
> On Tuesday, December 20, 2011 9:22:48 am m...@freebsd.org wrote:
>> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin  wrote:
>> > On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
>> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  
>> >> wrote:
>> >> > On Sun, 18 Dec 2011 01:09:00 +0100
>> >> > "O. Hartmann"  wrote:
>> >> >
>> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> >> panic: sleeping thread
>> >> >> cpuid = 0
>> >> >>
>> >> >> PID 16 is always USB on my box.
>> >> >
>> >> > You really need to give us a backtrace when you quote panics. It is
>> >> > impossible to make any sense of the above panic message without more
>> >> > context.
>> >>
>> >> In the case of this panic, the stack of the thread which panics is
>> >> useless; it's someone trying to propagate priority that discovered it.
>> >>  A backtrace on tid 100033 would be useful.
>> >>
>> >> With WITNESS enabled, it's possible to have this panic display the
>> >> stack of the incorrectly sleeping thread at the time it acquired the
>> >> lock, as well, but this code isn't in CURRENT or any release.  I have
>> >> a patch at $WORK I can dig up on Monday.
>> >
>> > Huh?  The stock kernel dumps a stack trace of the offending thread if you 
>> > have
>> > DDB enabled:
>> >
>> >                /*
>> >                 * If the thread is asleep, then we are probably about
>> >                 * to deadlock.  To make debugging this easier, just
>> >                 * panic and tell the user which thread misbehaved so
>> >                 * they can hopefully get a stack trace from the truly
>> >                 * misbehaving thread.
>> >                 */
>> >                if (TD_IS_SLEEPING(td)) {
>> >                        printf(
>> >                "Sleeping thread (tid %d, pid %d) owns a non-sleepable 
>> > lock\n",
>> >                            td->td_tid, td->td_proc->p_pid);
>> > #ifdef DDB
>> >                        db_trace_thread(td, -1);
>> > #endif
>> >                        panic("sleeping thread");
>> >                }
>>
>> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
>> added code to print *which* lock it holds (using WITNESS data).  I do
>> recall that this panic alone was often not sufficient to debug the
>> problem.
>
> I think the db_trace_thread() has been around for a while (since 5 or 6),
> but it is true that we don't tell you which lock is held even with this.
> That might be a useful thing to output before the panic.


This patch isn't quite right since I had to hand-edit it.  There's a
small chance I can commit this in the near future, but of someone else
wants to take it, feel free.  Style isn't yet fixed up to be FreeBSD
standard either.


--- /data/sb/bsd.git/sys/kern/subr_turnstile.c  2011-12-12
10:23:12.542196632 -0800
+++ kern/subr_turnstile.c   2011-12-09 10:59:29.882643558 -0800
@@ -165,10 +165,43 @@
 static voidturnstile_dtor(void *mem, int size, void *arg);
 #endif
 static int turnstile_init(void *mem, int size, int flags);
 static voidturnstile_fini(void *mem, int size);

+#ifdef INVARIANTS
+static void
+sleeping_thread_owns_a_nonsleepable_lock(struct thread *td)
+{
+   printf("Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
+   td->td_tid, td->td_proc->p_pid);
+#ifdef DDB
+   db_trace_thread(td, -1);
+#endif
+#ifdef WITNESS
+   struct lock_list_entry *lock_list, *lle;
+   int i;
+
+   lock_list = td->td_sleeplocks;
+   if (lock_list == NULL || lock_list->ll_count == 0) {
+   printf("Thread does not appear to hold any mutexes!\n");
+   return;
+   }
+
+   for (lle = lock_list; lle != NULL; lle = lle->ll_next) {
+   for (i = lle->ll_count - 1; i >= 0; i--) {
+   struct lock_instance *li = &lle->ll_children[i];
+
+   printf("Lock %s acquired at %s:%d\n",
+   li->li_lock->lo_name, li->li_file, li->li_line);
+   }
+   }
+#endif /* WITNESS */
+}
+#else
+#define sleeping_thread_owns_a_nonsleepable_lock(td) do { } while (0)
+#endif /* INVARIANTS */
+
 /*
  * Walks the chain of turnstiles and their owners to propagate the priority
  * of the thread being blocked to all the threads holding locks that have to
  * release their locks before this thread can run again.
  */
@@ -210,19 +243,31 @@
 * If the thread is asleep, then we are probably about
 * to deadlock.  To make debugging this easier, just
 * panic and tell the user which thread misbehaved so
 * they can hopefully get a stack trace from the truly
 * misbehaving thread.
 */
if (TD_IS_SLEEPING(td)) {
-   printf(
-   "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
-   td->

Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread Robert Watson

On Tue, 20 Dec 2011, Attilio Rao wrote:

As we are here, however, I have a question for Robert here: do you think we 
should support the _ddb() variant of options even in the case DDB is not 
enabled in the kernel?


It's possible that _ddb() should be spelled _unlocked(), or perhaps _debug(), 
but neither really suggests what the name should actually imply: using it is 
safe only in a marginal (debugging) sense, and not in a production code sense. 
One might also reasonable call them stack_foo_dontusethis().


The _ddb() variants are used in at least two not strictly DDB cases: redzone 
support, and Solaris memory allocation.  And, I guess, the current lock 
debugging case that we're talking about now, but I'm not sure if those 
debugging features specifically require DDB in the kernel themselves?


Robert
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread Attilio Rao
2011/12/20 John Baldwin :
> On Tuesday, December 20, 2011 9:20:09 am Attilio Rao wrote:
>> 2011/12/20 John Baldwin :
>> > On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
>> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  
>> >> wrote:
>> >> > On Sun, 18 Dec 2011 01:09:00 +0100
>> >> > "O. Hartmann"  wrote:
>> >> >
>> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> >> panic: sleeping thread
>> >> >> cpuid = 0
>> >> >>
>> >> >> PID 16 is always USB on my box.
>> >> >
>> >> > You really need to give us a backtrace when you quote panics. It is
>> >> > impossible to make any sense of the above panic message without more
>> >> > context.
>> >>
>> >> In the case of this panic, the stack of the thread which panics is
>> >> useless; it's someone trying to propagate priority that discovered it.
>> >>  A backtrace on tid 100033 would be useful.
>> >>
>> >> With WITNESS enabled, it's possible to have this panic display the
>> >> stack of the incorrectly sleeping thread at the time it acquired the
>> >> lock, as well, but this code isn't in CURRENT or any release.  I have
>> >> a patch at $WORK I can dig up on Monday.
>> >
>> > Huh?  The stock kernel dumps a stack trace of the offending thread if you 
>> > have
>> > DDB enabled:
>> >
>> >                /*
>> >                 * If the thread is asleep, then we are probably about
>> >                 * to deadlock.  To make debugging this easier, just
>> >                 * panic and tell the user which thread misbehaved so
>> >                 * they can hopefully get a stack trace from the truly
>> >                 * misbehaving thread.
>> >                 */
>> >                if (TD_IS_SLEEPING(td)) {
>> >                        printf(
>> >                "Sleeping thread (tid %d, pid %d) owns a non-sleepable 
>> > lock\n",
>> >                            td->td_tid, td->td_proc->p_pid);
>> > #ifdef DDB
>> >                        db_trace_thread(td, -1);
>> > #endif
>> >                        panic("sleeping thread");
>> >                }
>> >
>> > It may be that we can make use of the STACK API here instead to output this
>> > trace even when DDB isn't enabled.  The patch below tries to do that
>> > (untested).  It does some odd thigns though since it is effectively running
>> > from a panic context already, so it uses a statically allocated 'struct 
>> > stack'
>> > rather than using stack_create() and uses stack_print_ddb() since it is
>> > holding spin locks and can't possibly grab an sx lock:
>> >
>> > Index: subr_turnstile.c
>> > ===
>> > --- subr_turnstile.c    (revision 228534)
>> > +++ subr_turnstile.c    (working copy)
>> > @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>> >  #include 
>> >  #include 
>> >  #include 
>> > +#include 
>> >  #include 
>> >  #include 
>> >
>> > @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
>> >  static void
>> >  propagate_priority(struct thread *td)
>> >  {
>> > +       static struct stack st;
>> >        struct turnstile *ts;
>> >        int pri;
>> >
>> > @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
>> >                        printf(
>> >                "Sleeping thread (tid %d, pid %d) owns a non-sleepable 
>> > lock\n",
>> >                            td->td_tid, td->td_proc->p_pid);
>> > -#ifdef DDB
>> > -                       db_trace_thread(td, -1);
>> > +#ifdef STACK
>> > +                       stack_zero(&st);
>> > +                       stack_save_td(&st, td);
>> > +                       stack_print_ddb(&st);
>> >  #endif
>> >                        panic("sleeping thread");
>> >                }
>> >
>> > --
>>
>> I'm not sure it is a wise idea to trimm the DDB part, because it is
>> much more common than STACK enabled. Note that stack(9) is working if
>> you define DDB too, so I'd say to do that for both.
>> Also, I don't think you need the stack_zero() prior to set it.
>
> Err, STACK is enabled in GENERIC in release kernels but DDB is not, so I think
> STACK is the more common one.  As far as stack_zero(), I was just being 
> paranoid.

And what is the point for not having
#ifdef STACK
as
#if defined(STACK) || defined(DDB)

?

>> As we are here, however, I have a question for Robert here: do you
>> think we should support the _ddb() variant of options even in the case
>> DDB is not enabled in the kernel?
>> Probabilly the way it is nowadays is easier to have stack(9) both
>> defined for DDB and STACK, but in general I wouldn't advertise that.
>
> The _ddb variants are always enabled by my reading.  They just use different
> entry points into the linker that don't use locking.

My question is different: why we define them anyway even when DDB is
not enabled?

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinf

Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread John Baldwin
On Tuesday, December 20, 2011 9:20:09 am Attilio Rao wrote:
> 2011/12/20 John Baldwin :
> > On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
> >> > On Sun, 18 Dec 2011 01:09:00 +0100
> >> > "O. Hartmann"  wrote:
> >> >
> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
> >> >> panic: sleeping thread
> >> >> cpuid = 0
> >> >>
> >> >> PID 16 is always USB on my box.
> >> >
> >> > You really need to give us a backtrace when you quote panics. It is
> >> > impossible to make any sense of the above panic message without more
> >> > context.
> >>
> >> In the case of this panic, the stack of the thread which panics is
> >> useless; it's someone trying to propagate priority that discovered it.
> >>  A backtrace on tid 100033 would be useful.
> >>
> >> With WITNESS enabled, it's possible to have this panic display the
> >> stack of the incorrectly sleeping thread at the time it acquired the
> >> lock, as well, but this code isn't in CURRENT or any release.  I have
> >> a patch at $WORK I can dig up on Monday.
> >
> > Huh?  The stock kernel dumps a stack trace of the offending thread if you 
> > have
> > DDB enabled:
> >
> >/*
> > * If the thread is asleep, then we are probably about
> > * to deadlock.  To make debugging this easier, just
> > * panic and tell the user which thread misbehaved so
> > * they can hopefully get a stack trace from the truly
> > * misbehaving thread.
> > */
> >if (TD_IS_SLEEPING(td)) {
> >printf(
> >"Sleeping thread (tid %d, pid %d) owns a non-sleepable 
> > lock\n",
> >td->td_tid, td->td_proc->p_pid);
> > #ifdef DDB
> >db_trace_thread(td, -1);
> > #endif
> >panic("sleeping thread");
> >}
> >
> > It may be that we can make use of the STACK API here instead to output this
> > trace even when DDB isn't enabled.  The patch below tries to do that
> > (untested).  It does some odd thigns though since it is effectively running
> > from a panic context already, so it uses a statically allocated 'struct 
> > stack'
> > rather than using stack_create() and uses stack_print_ddb() since it is
> > holding spin locks and can't possibly grab an sx lock:
> >
> > Index: subr_turnstile.c
> > ===
> > --- subr_turnstile.c(revision 228534)
> > +++ subr_turnstile.c(working copy)
> > @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >
> > @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
> >  static void
> >  propagate_priority(struct thread *td)
> >  {
> > +   static struct stack st;
> >struct turnstile *ts;
> >int pri;
> >
> > @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
> >printf(
> >"Sleeping thread (tid %d, pid %d) owns a non-sleepable 
> > lock\n",
> >td->td_tid, td->td_proc->p_pid);
> > -#ifdef DDB
> > -   db_trace_thread(td, -1);
> > +#ifdef STACK
> > +   stack_zero(&st);
> > +   stack_save_td(&st, td);
> > +   stack_print_ddb(&st);
> >  #endif
> >panic("sleeping thread");
> >}
> >
> > --
> 
> I'm not sure it is a wise idea to trimm the DDB part, because it is
> much more common than STACK enabled. Note that stack(9) is working if
> you define DDB too, so I'd say to do that for both.
> Also, I don't think you need the stack_zero() prior to set it.

Err, STACK is enabled in GENERIC in release kernels but DDB is not, so I think
STACK is the more common one.  As far as stack_zero(), I was just being 
paranoid.

> As we are here, however, I have a question for Robert here: do you
> think we should support the _ddb() variant of options even in the case
> DDB is not enabled in the kernel?
> Probabilly the way it is nowadays is easier to have stack(9) both
> defined for DDB and STACK, but in general I wouldn't advertise that.

The _ddb variants are always enabled by my reading.  They just use different
entry points into the linker that don't use locking.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread John Baldwin
On Tuesday, December 20, 2011 9:22:48 am m...@freebsd.org wrote:
> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin  wrote:
> > On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
> >> > On Sun, 18 Dec 2011 01:09:00 +0100
> >> > "O. Hartmann"  wrote:
> >> >
> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
> >> >> panic: sleeping thread
> >> >> cpuid = 0
> >> >>
> >> >> PID 16 is always USB on my box.
> >> >
> >> > You really need to give us a backtrace when you quote panics. It is
> >> > impossible to make any sense of the above panic message without more
> >> > context.
> >>
> >> In the case of this panic, the stack of the thread which panics is
> >> useless; it's someone trying to propagate priority that discovered it.
> >>  A backtrace on tid 100033 would be useful.
> >>
> >> With WITNESS enabled, it's possible to have this panic display the
> >> stack of the incorrectly sleeping thread at the time it acquired the
> >> lock, as well, but this code isn't in CURRENT or any release.  I have
> >> a patch at $WORK I can dig up on Monday.
> >
> > Huh?  The stock kernel dumps a stack trace of the offending thread if you 
> > have
> > DDB enabled:
> >
> >/*
> > * If the thread is asleep, then we are probably about
> > * to deadlock.  To make debugging this easier, just
> > * panic and tell the user which thread misbehaved so
> > * they can hopefully get a stack trace from the truly
> > * misbehaving thread.
> > */
> >if (TD_IS_SLEEPING(td)) {
> >printf(
> >"Sleeping thread (tid %d, pid %d) owns a non-sleepable 
> > lock\n",
> >td->td_tid, td->td_proc->p_pid);
> > #ifdef DDB
> >db_trace_thread(td, -1);
> > #endif
> >panic("sleeping thread");
> >}
> 
> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
> added code to print *which* lock it holds (using WITNESS data).  I do
> recall that this panic alone was often not sufficient to debug the
> problem.

I think the db_trace_thread() has been around for a while (since 5 or 6),
but it is true that we don't tell you which lock is held even with this.
That might be a useful thing to output before the panic.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread mdf
On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin  wrote:
> On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
>> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
>> > On Sun, 18 Dec 2011 01:09:00 +0100
>> > "O. Hartmann"  wrote:
>> >
>> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> panic: sleeping thread
>> >> cpuid = 0
>> >>
>> >> PID 16 is always USB on my box.
>> >
>> > You really need to give us a backtrace when you quote panics. It is
>> > impossible to make any sense of the above panic message without more
>> > context.
>>
>> In the case of this panic, the stack of the thread which panics is
>> useless; it's someone trying to propagate priority that discovered it.
>>  A backtrace on tid 100033 would be useful.
>>
>> With WITNESS enabled, it's possible to have this panic display the
>> stack of the incorrectly sleeping thread at the time it acquired the
>> lock, as well, but this code isn't in CURRENT or any release.  I have
>> a patch at $WORK I can dig up on Monday.
>
> Huh?  The stock kernel dumps a stack trace of the offending thread if you have
> DDB enabled:
>
>                /*
>                 * If the thread is asleep, then we are probably about
>                 * to deadlock.  To make debugging this easier, just
>                 * panic and tell the user which thread misbehaved so
>                 * they can hopefully get a stack trace from the truly
>                 * misbehaving thread.
>                 */
>                if (TD_IS_SLEEPING(td)) {
>                        printf(
>                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                            td->td_tid, td->td_proc->p_pid);
> #ifdef DDB
>                        db_trace_thread(td, -1);
> #endif
>                        panic("sleeping thread");
>                }

Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
added code to print *which* lock it holds (using WITNESS data).  I do
recall that this panic alone was often not sufficient to debug the
problem.

Thanks,
matthew


> It may be that we can make use of the STACK API here instead to output this
> trace even when DDB isn't enabled.  The patch below tries to do that
> (untested).  It does some odd thigns though since it is effectively running
> from a panic context already, so it uses a statically allocated 'struct stack'
> rather than using stack_create() and uses stack_print_ddb() since it is
> holding spin locks and can't possibly grab an sx lock:
>
> Index: subr_turnstile.c
> ===
> --- subr_turnstile.c    (revision 228534)
> +++ subr_turnstile.c    (working copy)
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
>  static void
>  propagate_priority(struct thread *td)
>  {
> +       static struct stack st;
>        struct turnstile *ts;
>        int pri;
>
> @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
>                        printf(
>                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                            td->td_tid, td->td_proc->p_pid);
> -#ifdef DDB
> -                       db_trace_thread(td, -1);
> +#ifdef STACK
> +                       stack_zero(&st);
> +                       stack_save_td(&st, td);
> +                       stack_print_ddb(&st);
>  #endif
>                        panic("sleeping thread");
>                }
>
> --
> John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread Attilio Rao
2011/12/20 John Baldwin :
> On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
>> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
>> > On Sun, 18 Dec 2011 01:09:00 +0100
>> > "O. Hartmann"  wrote:
>> >
>> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> panic: sleeping thread
>> >> cpuid = 0
>> >>
>> >> PID 16 is always USB on my box.
>> >
>> > You really need to give us a backtrace when you quote panics. It is
>> > impossible to make any sense of the above panic message without more
>> > context.
>>
>> In the case of this panic, the stack of the thread which panics is
>> useless; it's someone trying to propagate priority that discovered it.
>>  A backtrace on tid 100033 would be useful.
>>
>> With WITNESS enabled, it's possible to have this panic display the
>> stack of the incorrectly sleeping thread at the time it acquired the
>> lock, as well, but this code isn't in CURRENT or any release.  I have
>> a patch at $WORK I can dig up on Monday.
>
> Huh?  The stock kernel dumps a stack trace of the offending thread if you have
> DDB enabled:
>
>                /*
>                 * If the thread is asleep, then we are probably about
>                 * to deadlock.  To make debugging this easier, just
>                 * panic and tell the user which thread misbehaved so
>                 * they can hopefully get a stack trace from the truly
>                 * misbehaving thread.
>                 */
>                if (TD_IS_SLEEPING(td)) {
>                        printf(
>                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                            td->td_tid, td->td_proc->p_pid);
> #ifdef DDB
>                        db_trace_thread(td, -1);
> #endif
>                        panic("sleeping thread");
>                }
>
> It may be that we can make use of the STACK API here instead to output this
> trace even when DDB isn't enabled.  The patch below tries to do that
> (untested).  It does some odd thigns though since it is effectively running
> from a panic context already, so it uses a statically allocated 'struct stack'
> rather than using stack_create() and uses stack_print_ddb() since it is
> holding spin locks and can't possibly grab an sx lock:
>
> Index: subr_turnstile.c
> ===
> --- subr_turnstile.c    (revision 228534)
> +++ subr_turnstile.c    (working copy)
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
>  static void
>  propagate_priority(struct thread *td)
>  {
> +       static struct stack st;
>        struct turnstile *ts;
>        int pri;
>
> @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
>                        printf(
>                "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
>                            td->td_tid, td->td_proc->p_pid);
> -#ifdef DDB
> -                       db_trace_thread(td, -1);
> +#ifdef STACK
> +                       stack_zero(&st);
> +                       stack_save_td(&st, td);
> +                       stack_print_ddb(&st);
>  #endif
>                        panic("sleeping thread");
>                }
>
> --

I'm not sure it is a wise idea to trimm the DDB part, because it is
much more common than STACK enabled. Note that stack(9) is working if
you define DDB too, so I'd say to do that for both.
Also, I don't think you need the stack_zero() prior to set it.

As we are here, however, I have a question for Robert here: do you
think we should support the _ddb() variant of options even in the case
DDB is not enabled in the kernel?
Probabilly the way it is nowadays is easier to have stack(9) both
defined for DDB and STACK, but in general I wouldn't advertise that.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread Andriy Gapon
on 20/12/2011 15:52 John Baldwin said the following:
> -#ifdef DDB
> - db_trace_thread(td, -1);
> +#ifdef STACK
> + stack_zero(&st);
> + stack_save_td(&st, td);
> + stack_print_ddb(&st);
>  #endif

This leads to an idea - what about an umbrella stack_print() that would call the
best stack printing routine, if any, based on their availabilities (STACK, DDB,
etc?).

P.S. Sorry to hijack the thread.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread John Baldwin
On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
> > On Sun, 18 Dec 2011 01:09:00 +0100
> > "O. Hartmann"  wrote:
> >
> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
> >> panic: sleeping thread
> >> cpuid = 0
> >>
> >> PID 16 is always USB on my box.
> >
> > You really need to give us a backtrace when you quote panics. It is
> > impossible to make any sense of the above panic message without more
> > context.
> 
> In the case of this panic, the stack of the thread which panics is
> useless; it's someone trying to propagate priority that discovered it.
>  A backtrace on tid 100033 would be useful.
> 
> With WITNESS enabled, it's possible to have this panic display the
> stack of the incorrectly sleeping thread at the time it acquired the
> lock, as well, but this code isn't in CURRENT or any release.  I have
> a patch at $WORK I can dig up on Monday.

Huh?  The stock kernel dumps a stack trace of the offending thread if you have 
DDB enabled:

/*
 * If the thread is asleep, then we are probably about
 * to deadlock.  To make debugging this easier, just
 * panic and tell the user which thread misbehaved so
 * they can hopefully get a stack trace from the truly
 * misbehaving thread.
 */
if (TD_IS_SLEEPING(td)) {
printf(
"Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
td->td_tid, td->td_proc->p_pid);
#ifdef DDB
db_trace_thread(td, -1);
#endif
panic("sleeping thread");
}

It may be that we can make use of the STACK API here instead to output this
trace even when DDB isn't enabled.  The patch below tries to do that 
(untested).  It does some odd thigns though since it is effectively running
from a panic context already, so it uses a statically allocated 'struct stack'
rather than using stack_create() and uses stack_print_ddb() since it is 
holding spin locks and can't possibly grab an sx lock:

Index: subr_turnstile.c
===
--- subr_turnstile.c(revision 228534)
+++ subr_turnstile.c(working copy)
@@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
 static void
 propagate_priority(struct thread *td)
 {
+   static struct stack st;
struct turnstile *ts;
int pri;
 
@@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
printf(
"Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
td->td_tid, td->td_proc->p_pid);
-#ifdef DDB
-   db_trace_thread(td, -1);
+#ifdef STACK
+   stack_zero(&st);
+   stack_save_td(&st, td);
+   stack_print_ddb(&st);
 #endif
panic("sleeping thread");
}

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-18 Thread O. Hartmann
On 12/18/11 02:45, Alexander Kabaev wrote:
> On Sun, 18 Dec 2011 01:09:00 +0100
> "O. Hartmann"  wrote:
> 
>> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> panic: sleeping thread
>> cpuid = 0
>>
>> PID 16 is always USB on my box.
> 
> You really need to give us a backtrace when you quote panics. It is
> impossible to make any sense of the above panic message without more
> context.
> 

Sorry,
will do this immediately when I'm back in the lab.

Regards,
Oliver



signature.asc
Description: OpenPGP digital signature


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-17 Thread mdf
On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev  wrote:
> On Sun, 18 Dec 2011 01:09:00 +0100
> "O. Hartmann"  wrote:
>
>> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> panic: sleeping thread
>> cpuid = 0
>>
>> PID 16 is always USB on my box.
>
> You really need to give us a backtrace when you quote panics. It is
> impossible to make any sense of the above panic message without more
> context.

In the case of this panic, the stack of the thread which panics is
useless; it's someone trying to propagate priority that discovered it.
 A backtrace on tid 100033 would be useful.

With WITNESS enabled, it's possible to have this panic display the
stack of the incorrectly sleeping thread at the time it acquired the
lock, as well, but this code isn't in CURRENT or any release.  I have
a patch at $WORK I can dig up on Monday.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-17 Thread Alexander Kabaev
On Sun, 18 Dec 2011 01:09:00 +0100
"O. Hartmann"  wrote:

> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
> panic: sleeping thread
> cpuid = 0
> 
> PID 16 is always USB on my box.

You really need to give us a backtrace when you quote panics. It is
impossible to make any sense of the above panic message without more
context.

-- 
Alexander Kabaev


signature.asc
Description: PGP signature