On Thu, Dec 18, 2014 at 02:40:48PM -0500, Alan Stern wrote:
> On Thu, 18 Dec 2014, Russell King - ARM Linux wrote:
>
> > While unplugging a Logitek Keyboard/mouse micro-receiver, I got the
> > lockdep splat below.
> >
> > However, I don't fully understand this splat - I see nothing in
> > flush_work() nor process_one_work() making use of "intf->reset_ws" -
> > which seems to be a USB thing. I guess lockdep is being re-used to
> > validate work stuff, and "lock" is just plain misleading.
>
> That sounds right. intf->reset_ws is a work_struct for a reset
> request.
>
> > usb 2-1.1: USB disconnect, device number 3
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 3.18.0+ #1459 Not tainted
> > ---------------------------------------------
> > kworker/0:1/2758 is trying to acquire lock:
> > ((&intf->reset_ws)){+.+.+.}, at: [<c003ba90>] flush_work+0x0/0x264
> >
> > but task is already holding lock:
> > ((&intf->reset_ws)){+.+.+.}, at: [<c003ca40>] process_one_work+0x130/0x4b4
>
> I think what happened is that we called cancel_work_sync() for the
> reset_ws embedded in one intf structure while running in the workqueue
> routine for the reset_ws embedded in a different intf structure.
>
> Assuming this interpretation is correct, I don't see how we can prevent
> lockdep from making this mistake. Here's the problem:
>
> An interface driver can queue a request for a reset.
>
> A reset can cause interface drivers to be unbound from their
> interfaces.
>
> If an interface driver is unbound, any pending reset request
> that it queued has to be cancelled. Otherwise the workqueue
> would most likely end up carrying out the reset at a later time
> when nobody wants it.
>
> (The code contains explicit protection for the case where the interface
> being unbound is the one whose reset request is currently in progress.)
>
> Now perhaps we don't need that last step. We could allow a queued
> reset to take place even after the driver that requested it is gone.
> Most of the time these reset requests occur in response to I/O errors
> when a USB device is unplugged -- as happened in this example -- in
> which case it doesn't matter what we do.
>
> So even though it's not entirely pleasant, my inclination is to solve
> the problem by getting rid of usb_cancel_queued_reset() in
> drivers/usb/core/driver.c.
>
> Comments, anybody?
That seems reasonable to me, unbinding when a reset is happening is
going to be a rare condition, but if we get rid of it, and we try to
queue a reset for a device that is gone, we will just fail the reset,
right? If all should be fine, I have no objection to removing it.
thanks,
greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html