On Fri, 6 Feb 2004, Stephen Hemminger wrote:

> Alan Stern wrote:
> 
> >I should have been more explicit.  I meant: Can you provide an explanation
> >of what caused your deadlock, more detailed than just "the list management
> >was crap"?  Can you point to exactly what in the code went wrong, and can
> >you provide a minimal patch to fix it, remaining in the context of the
> >current crappy lists and locks?
> >  
> >
> Sorry, I wasn't really expecting this version to get accepted as is.  
> Just start some exchange over
> what needs to be fixed.  I never got a "smoking gun" crash, that pointed 
> to exactly a race between
> two exact lines, but what I saw was urb private structures that were in 
> the remove list but had
> already been freed.  The problem with juggling three lists (urb's, 
> remove, and complete) with
> three seperate locks is that an dequeue operation can happen during the 
> middle of one of
> the steps of the uhci_irq processing.

Yeah, the current locking scheme is a mess.  But it's supposed to be 
correctly-functioning mess, and I'm really curious to know where the 
problem lies.  Without that, I would never be quite certain that changing 
the locking had solved the underlying error.

Did you mention that the oops could easily be recreated?  And did you
ever try out my suggestion of adding printk's to trace all the urbp
transitions among the various lists?

> How about something like:
>     - #ifdef  out the /proc debug  stuff, so other changes can break it 
> causing grief.

I don't have your patch with me here so I can't check the details.  Is it
possible to remove the part of the /proc stuff that prints the
complete_list while leaving the rest as is?  Would that be compatible with
your changes?  In other words, can you make a minimal change to 
uhci-debug.c so that most of it will continue to work while you make the 
other changes?

>     - put big lock in uhci and leave other spurflous locks in.

Okay.

>     - move complete_list out of  uhci

Again I need to check the details, but that doesn't seem to be necessary.  
It's (almost) true that the complete_list is only used within the dynamic
scope of uhci_irq(), so it doesn't need to be locked.  Since it's used in
a few subroutines, though, you would have to pass the list header as an
explicit argument if it's not in uhci, right?  You might as well just
leave the list header where it is; it doesn't take up much space.

I said "(almost)" above because complete_list is also used within the 
scope of uhci_stop().  To be totally correct you would have to create a 
list header in that routine as well.  However one of the changes I want to 
make is to move the calls to uhci_free_pending_qhs() etc. out of 
uhci_irq() and uhci_stop() into a common list-handling subroutine.

Another change I'd like to make is to move most of the interrupt
processing into some sort of bottom half.  When that's done the
complete_list may no longer be needed; URBs could be given back as soon as
the driver knew they are ready.

>     - remove spurflous locks one at a time.

That will work.  Take care to preserve the correct semantics.

Alan Stern



-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to