I'll send out a revised patch later, thanks! It's also good this code got a more careful read. It seems like some things are not as obvious as I might like...
That patch will merge those list corruption fixes I sent, the "else" you verified was needed (ugh!!!), and some of what you include here. It won't add new BUG_ON calls (WARN at worst) or a duplicate ED state (see next).
>> ... add a new state >> ED_DESCHEDULED, which is treated exactly like ED_IDLE, except >> that in this state, the HC may still be referring to the ED in >> question. Thus, if
David.B> Sounds exactly like ED_UNLINK -- except maybe that it's not David.B> been put onto ed_rm_list (with ED_DEQUEUE set). Why add David.B> another state? ...
The current OHCI relies on the internals of the dma_pool() implementation. If the implementation changed to, say, modify the memory that is free or, heaven forbid, return the memory to the kernel, you'd get _extremely_ difficult to track down race conditions.
The current implementation _does_ poison memory on free, if slab poisoning is enabled. (That's why I asked if you were using it.)
And that's been quite handy for reporting list corruption bugs, from races or otherwise. But those list corruption bugs hit a blind spot in that code: it doesn't check for modify-after-free. Which is why those bugs were able to hide for so long!
It'd be good if you said _how_ you think it relies on such internals. Some of your debug diagnostics wrongly claimed allocation of "new" EDs when they were just being re-used. That'd make intentional/safe re-use look like a bug or race.
Even so, the code isn't race-free, like I explained yesterday:
- ed_alloc() clears the ED to 0 via memset() - the order in which memset() clears memory is undefined (various from platform to platform etc)
There's a wmb() before any ED is handed off to the OHCI silicon; that forces a defined order. Top of ed_schedule(). First use, or Nth re-use; no matter.
- thus you might get a case where hwTailP is 0 but hwHeadP is non-zero, which would cause the HC to happily start dereferencing the descriptor
If you assume a bug where the ED is freed but still in use, that's hardly the only thing that'd go wrong!! You can't use such a potential bug to prove something else is broken.
- Dave
------------------------------------------------------- This SF.Net email is sponsored by: IBM Linux Tutorials Free Linux tutorial presented by Daniel Robbins, President and CEO of GenToo technologies. Learn everything from fundamentals to system administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel