Re: Lockdep problem: v3.18+ (with yesterday's Linus tip - 6f51ee709e4c)

Greg Kroah-Hartman Thu, 18 Dec 2014 17:07:52 -0800

On Thu, Dec 18, 2014 at 02:40:48PM -0500, Alan Stern wrote:
> On Thu, 18 Dec 2014, Russell King - ARM Linux wrote:
> 
> > While unplugging a Logitek Keyboard/mouse micro-receiver, I got the
> > lockdep splat below.
> > 
> > However, I don't fully understand this splat - I see nothing in
> > flush_work() nor process_one_work() making use of "intf->reset_ws" -
> > which seems to be a USB thing.  I guess lockdep is being re-used to
> > validate work stuff, and "lock" is just plain misleading.
> 
> That sounds right.  intf->reset_ws is a work_struct for a reset 
> request.
> 
> > usb 2-1.1: USB disconnect, device number 3
> > 
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 3.18.0+ #1459 Not tainted
> > ---------------------------------------------
> > kworker/0:1/2758 is trying to acquire lock:
> >  ((&intf->reset_ws)){+.+.+.}, at: [<c003ba90>] flush_work+0x0/0x264
> > 
> > but task is already holding lock:
> >  ((&intf->reset_ws)){+.+.+.}, at: [<c003ca40>] process_one_work+0x130/0x4b4
> 
> I think what happened is that we called cancel_work_sync() for the 
> reset_ws embedded in one intf structure while running in the workqueue 
> routine for the reset_ws embedded in a different intf structure.
> 
> Assuming this interpretation is correct, I don't see how we can prevent 
> lockdep from making this mistake.  Here's the problem:
> 
>       An interface driver can queue a request for a reset.
> 
>       A reset can cause interface drivers to be unbound from their
>       interfaces.
> 
>       If an interface driver is unbound, any pending reset request 
>       that it queued has to be cancelled.  Otherwise the workqueue
>       would most likely end up carrying out the reset at a later time 
>       when nobody wants it.
> 
> (The code contains explicit protection for the case where the interface 
> being unbound is the one whose reset request is currently in progress.)
> 
> Now perhaps we don't need that last step.  We could allow a queued
> reset to take place even after the driver that requested it is gone.  
> Most of the time these reset requests occur in response to I/O errors
> when a USB device is unplugged -- as happened in this example -- in
> which case it doesn't matter what we do.
> 
> So even though it's not entirely pleasant, my inclination is to solve
> the problem by getting rid of usb_cancel_queued_reset() in
> drivers/usb/core/driver.c.
> 
> Comments, anybody?


That seems reasonable to me, unbinding when a reset is happening is
going to be a rare condition, but if we get rid of it, and we try to
queue a reset for a device that is gone, we will just fail the reset,
right?  If all should be fine, I have no objection to removing it.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Lockdep problem: v3.18+ (with yesterday's Linus tip - 6f51ee709e4c)

Reply via email to