On Wed, 12 Feb 2014, Stefani Seibold wrote:
> > > > >> Okay, the debugging info in your dmesg log indicates the cause of the
> > > > >> problem. It looks like the bug is related to commit 88ed9fd50e57
> > > > >> (usb/hcd: remove unnecessary local_irq_save) by Michael Opdenacker.
For the benefit of people who haven't seen the log, here is the
important part:
[ 3.431781] [ INFO: inconsistent lock state ]
[ 3.431784] 3.13.2 #4 Not tainted
[ 3.431786] ---------------------------------
[ 3.431788] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[ 3.431792] swapper/3/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 3.431794] (&(&ehci->lock)->rlock){?.-...}, at: [<c04c9ad1>] ehci_hrtimer_func+0x21/0xc0
[ 3.431805] {HARDIRQ-ON-W} state was registered at:
[ 3.431807] [<c0183750>] __lock_acquire+0x590/0x1bb0
[ 3.431813] [<c01852fe>] lock_acquire+0x7e/0x110
[ 3.431818] [<c06bef4f>] _raw_spin_lock+0x3f/0x50
[ 3.431833] [<c04d0ce7>] ehci_irq+0x27/0x3d0
[ 3.431835] [<c04b2521>] usb_hcd_irq+0x21/0x30
[ 3.431839] [<c018f596>] irq_forced_thread_fn+0x26/0x50
[ 3.431842] [<c018f38e>] irq_thread+0xfe/0x130
[ 3.431844] [<c015b5bb>] kthread+0x9b/0xb0
This says that ehci->lock was acquired by ehci_irq() with interrupts
enabled. Then later on, ehci_hrtimer_func() acquired the same lock
with interrupts disabled. This caused the lockdep violation (and it
eventually caused the system to hang).
The thing is, ehci_irq() is a non-threaded IRQ handler. It's _never_
supposed to run with interrupts enabled. As far as I can see, this
happened because irq_forced_thread_fn() did not disable interrupts
before calling the handler routine.
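
To make the lockdep complaint concrete, here is a rough userspace sketch of the rule being violated. It is a toy model under stated assumptions, not the kernel's lockdep code: toy_lock, toy_acquire() and toy_ehci_scenario() are names invented for illustration. The rule it models is that a lock which is ever taken in hard-IRQ context must never be taken with interrupts enabled; otherwise the interrupt can arrive on the CPU that already holds the lock and spin on it forever.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
struct toy_lock {
	bool used_in_hardirq;    /* acquired at least once in hard-IRQ context */
	bool used_with_irqs_on;  /* acquired at least once with IRQs enabled */
};

/*
 * Record one acquisition of the lock.
 * Returns false once the recorded usage history is inconsistent, i.e. the
 * lock has been seen both in hard-IRQ context and with IRQs enabled.
 */
static bool toy_acquire(struct toy_lock *lock, bool in_hardirq,
			bool irqs_enabled)
{
	if (in_hardirq)
		lock->used_in_hardirq = true;
	if (irqs_enabled)
		lock->used_with_irqs_on = true;
	return !(lock->used_in_hardirq && lock->used_with_irqs_on);
}

/*
 * Replay the two acquisitions from the log.
 * thread_disables_irqs models whether the forced IRQ thread disables
 * interrupts before calling the handler.
 */
static bool toy_ehci_scenario(bool thread_disables_irqs)
{
	struct toy_lock ehci_lock = { false, false };
	bool ok;

	/* ehci_irq() run from the forced IRQ thread: not hard-IRQ context. */
	ok = toy_acquire(&ehci_lock, false, !thread_disables_irqs);

	/* ehci_hrtimer_func() run from the timer interrupt: hard-IRQ context. */
	ok = toy_acquire(&ehci_lock, true, false) && ok;

	return ok;
}
```

In this model toy_ehci_scenario(false) reports an inconsistency, which corresponds to the {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} report above, while toy_ehci_scenario(true) does not.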
> > > > >> (Note: As far as I can tell, the commit itself is okay, but it exposes
> > > > >> a bug somewhere else in the kernel.)
> > > > >>
> > > > >> If you revert that commit from 3.13, does it fix the problem?
> > > > >>
> > > > > Reverting commit 88ed9fd50e57 solves the problem. Thank you so
> > > > > much.
> > > > Oops, I'll try to reproduce and investigate. Thanks for the
> > > > investigations!!!
> > > >
> > >
> > > I think the problem may have to do with the threadirqs kernel
> > > parameter.
> >
> > I don't think threaded irqs work very well with USB, can you try turning
> > that off and seeing if the issue goes away?
There's no reason in principle why the USB stack shouldn't work with
threaded IRQs, as far as I know.
> I have used threaded irqs for more than two years without any problem.
> It works with OHCI, UHCI, EHCI and XHCI.
>
> This was the first time that a problem occurred.
I have no idea what might have changed between 3.12 and 3.13 to cause
this problem. Maybe Thomas can figure it out.
> And yes, the issue goes away when threaded irqs are not used (with and
> without the patch).
Thomas, there must be some reason why the patch below is wrong, but I
don't know enough about the IRQ subsystem to tell what's really going
on. Can you explain it?
Alan Stern
Index: usb-3.14/kernel/irq/manage.c
===================================================================
--- usb-3.14.orig/kernel/irq/manage.c
+++ usb-3.14/kernel/irq/manage.c
@@ -777,9 +777,12 @@ static irqreturn_t
 irq_forced_thread_fn(struct irq_desc *desc, struct irqaction *action)
 {
 	irqreturn_t ret;
+	unsigned long flags;
 
 	local_bh_disable();
+	local_irq_save(flags);
 	ret = action->thread_fn(action->irq, action->dev_id);
+	local_irq_restore(flags);
 	irq_finalize_oneshot(desc, action);
 	local_bh_enable();
 	return ret;