Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-13 Thread Alan Stern
On Fri, 13 Jul 2007, Mike Nuss wrote: > I have the fix in place now, and it's working great. Is there anyone in > particular I should send it to for review? Which version of the kernel > should I diff against? The maintainer for ohci-hcd is Dave Brownell; I believe he is listed in the MAINTAINER

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-13 Thread Mike Nuss
Alan Stern wrote: > On Fri, 6 Jul 2007, Mike Nuss wrote: > >>> If all else fails and you can't get timely interrupts, you can always >>> fall back on a kernel timer. Scan through the data structures once a >>> second or thereabouts. As an optimization, allow the timer to run only >>> while th

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread David Brownell
On Friday 06 July 2007, Mike Nuss wrote: > > I'm very interested in the source of the existing quirk though. Do we > know who wrote that code, and where the insight came from? Did s/he > discover it experimentally or is there mention of the issue in the ZF > data book somewhere? I forget; check t

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Mike Nuss
David Brownell wrote: > On Thursday 05 July 2007, Mike Nuss wrote: >> I'm at a loss. It really looks like the HC just "skipped" a step for no >> reason. > > I forget ... did you already try removing that special case > at the top of the IRQ handler, where it checks the low bit > of the hcca->done_

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread David Brownell
On Friday 06 July 2007, Alan Stern wrote: > On Fri, 6 Jul 2007, Mike Nuss wrote: > > > All the other endpoints continue to function normally. Since NextED > > happens to be null in the cases I've seen, that means that it's on the > > end of its list and the other endpoints must be on other lists,

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread David Brownell
On Thursday 05 July 2007, Mike Nuss wrote: > Mike Nuss wrote: > > David Brownell wrote: > >> Hmm. Here's a theory. The way that the current code unlinks > >> an ED is to set the SKIP bit *AND* remove the ED from the relevant > >> part of the schedule. > >> > >> Maybe ... the hardware gets confuse

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Fri, 6 Jul 2007, Mike Nuss wrote: > > If all else fails and you can't get timely interrupts, you can always > > fall back on a kernel timer. Scan through the data structures once a > > second or thereabouts. As an optimization, allow the timer to run only > > while there are active URBs. >
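
The timer fallback suggested above can be modelled in user space. This is only a sketch under stated assumptions: the struct and function names (watchdog, wd_tick and so on) are hypothetical and do not come from ohci-hcd, and the "scan" is reduced to a counter. It shows just the optimization mentioned: the timer stays armed only while URBs are active.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a polling fallback; none of these names
 * come from ohci-hcd. */
struct watchdog {
    unsigned active_urbs;   /* URBs currently queued to the controller */
    bool armed;             /* is the periodic timer running? */
    unsigned scans;         /* how many times the fallback scan has run */
};

/* Submitting work arms the timer if it was not already running. */
static void wd_urb_submitted(struct watchdog *wd)
{
    wd->active_urbs++;
    wd->armed = true;
}

/* Completing the last active URB lets the timer lapse. */
static void wd_urb_completed(struct watchdog *wd)
{
    assert(wd->active_urbs > 0);
    wd->active_urbs--;
    if (wd->active_urbs == 0)
        wd->armed = false;
}

/* Called once per period (about a second in the suggestion).
 * Returns true if the timer should be re-armed for another period. */
static bool wd_tick(struct watchdog *wd)
{
    if (!wd->armed)
        return false;
    wd->scans++;            /* stand-in for walking the ED/TD structures */
    return wd->active_urbs > 0;
}
```

In a real driver the tick would presumably be a kernel timer callback running under the HCD lock; the model leaves out locking and the scan itself.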

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Mike Nuss
Alan Stern wrote: > On Fri, 6 Jul 2007, Mike Nuss wrote: > >> On the hardware side though, DI relates to how many frames the HC will >> wait until HccaDoneHead will be updated, and the WDH IRQ isn't sent >> until that happens. But since the TD never gets properly retired by the >> HC when this bug

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Fri, 6 Jul 2007, Mike Nuss wrote: > On the hardware side though, DI relates to how many frames the HC will > wait until HccaDoneHead will be updated, and the WDH IRQ isn't sent > until that happens. But since the TD never gets properly retired by the > HC when this bug occurs, we never get that

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Mike Nuss
Alan Stern wrote: > On Fri, 6 Jul 2007, Mike Nuss wrote: > >>> Here's what I meant. Normally SOF interrupts are disabled, but when >>> the driver gets a completion interrupt for an URB it could turn on the >>> SOF interrupt for the next frame only. Then when the next interrupt >>> occurs the

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Fri, 6 Jul 2007, Mike Nuss wrote: > > Here's what I meant. Normally SOF interrupts are disabled, but when > > the driver gets a completion interrupt for an URB it could turn on the > > SOF interrupt for the next frame only. Then when the next interrupt > > occurs the driver could compare t

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Mike Nuss
Alan Stern wrote: > On Thu, 5 Jul 2007, Mike Nuss wrote: > > There are three time periods in question. > > > > A = before there is any problem > > > > B = a read seems to have completed, the HC has advanced HeadP, but > > failed to put the completed TD on the donelist. At this point, > > something

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Fri, 6 Jul 2007, Mike Nuss wrote: > Alan Stern wrote: > > AFAIK there are only two lists (apart from the "done" list): the > > periodic list and the async list. The HC is allowed and expected to > > jump between them during the course of a single frame. > > This should be very easy to test, s

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Mike Nuss
Alan Stern wrote: > AFAIK there are only two lists (apart from the "done" list): the > periodic list and the async list. The HC is allowed and expected to > jump between them during the course of a single frame. This should be very easy to test, so I'll make it the next thing I check. > It could

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Fri, 6 Jul 2007, Mike Nuss wrote: > Alan Stern wrote: > > On Thu, 5 Jul 2007, Mike Nuss wrote: > > > >> There are three time periods in question. > >> > >> A = before there is any problem > >> > >> B = a read seems to have completed, the HC has advanced HeadP, but > >> failed to put the comple

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-06 Thread Alan Stern
On Thu, 5 Jul 2007, Mike Nuss wrote: > There are three time periods in question. > > A = before there is any problem > > B = a read seems to have completed, the HC has advanced HeadP, but > failed to put the completed TD on the donelist. At this point, something > is seriously wrong, but ohci-hc

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-05 Thread David Brownell
On Tuesday 03 July 2007, Mike Nuss wrote: > David Brownell wrote: > > Hmm. Here's a theory. The way that the current code unlinks > > an ED is to set the SKIP bit *AND* remove the ED from the relevant > > part of the schedule. > > > > Maybe ... the hardware gets confused when the ED doesn't seem

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-05 Thread Mike Nuss
Mike Nuss wrote: > David Brownell wrote: >> Hmm. Here's a theory. The way that the current code unlinks >> an ED is to set the SKIP bit *AND* remove the ED from the relevant >> part of the schedule. >> >> Maybe ... the hardware gets confused when the ED doesn't seem >> to be on the relevant list.

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-03 Thread Mike Nuss
David Brownell wrote: > Hmm. Here's a theory. The way that the current code unlinks > an ED is to set the SKIP bit *AND* remove the ED from the relevant > part of the schedule. > > Maybe ... the hardware gets confused when the ED doesn't seem > to be on the relevant list. Like maybe it expects

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-03 Thread David Brownell
On Tuesday 03 July 2007, Mike Nuss wrote: > David Brownell wrote: > > On Wednesday 02 May 2007, Mike Nuss wrote: > > > > It's possible that the SKIP bit isn't handled correctly. > > > > I've not looked at that code in some time, but I seem to > > recall thinking that setting SKIP was an action more

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-03 Thread Mike Nuss
David Brownell wrote: > On Wednesday 02 May 2007, Mike Nuss wrote: > > It's possible that the SKIP bit isn't handled correctly. > > I've not looked at that code in some time, but I seem to > recall thinking that setting SKIP was an action more in > the "defensive paranoia" category than the "essent

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-07-03 Thread David Brownell
On Wednesday 02 May 2007, Mike Nuss wrote: > Mike Nuss wrote: > > David Brownell wrote: Cleaning up some of my inbox here ... :) > >> So, trying for some (bad) ASCII art here > >> > >> TDs 1-4 submitted to ED, > >> HC completed a few (say, 1 & 2) > >> > >> ed.tail --

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-08 Thread David Brownell
On Tuesday 08 May 2007, Alan Stern wrote: > > It's not allowed to be active at that point. The flag being > > tested is set only by usb_clear_halt() > > More accurately, the flag being tested (i.e., the toggle value) is > _cleared_ only by usb_clear_halt() and usb_enable_endpoint(). ohci-hcd

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-08 Thread Alan Stern
On Tue, 8 May 2007, David Brownell wrote: > > > > It's probably nothing, but this line in ohci-q.c in td_submit_urb raised > > > an eyebrow: > > > > > > urb_priv->ed->hwHeadP &= ~cpu_to_hc32 (ohci, ED_C); > > > > > > There is a comment "resetting toggle is meaningless if the endpoint is > > > a

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-08 Thread David Brownell
> > It's probably nothing, but this line in ohci-q.c in td_submit_urb raised > > an eyebrow: > > > > urb_priv->ed->hwHeadP &= ~cpu_to_hc32 (ohci, ED_C); > > > > There is a comment "resetting toggle is meaningless if the endpoint is > > active" above this code. While that's true, I wouldn't expec
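
For readers unfamiliar with the bit being cleared in the quoted line: in the OHCI endpoint descriptor, the low bits of the HeadP word carry flags rather than address bits. The sketch below assumes the usual layout (bit 0 = Halted, bit 1 = toggleCarry, TD addresses 16-byte aligned). ED_C is the name quoted above; ED_H and the helper functions are illustrative, and the cpu_to_hc32() byte-order conversion is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits in the low end of HeadP (assumed layout, see lead-in). */
#define ED_H 0x00000001u    /* Halted */
#define ED_C 0x00000002u    /* toggleCarry */

/* Hypothetical helper: build a HeadP word from an aligned TD address. */
static uint32_t headp_pack(uint32_t td_addr, int carry, int halted)
{
    assert((td_addr & 0xFu) == 0);  /* TDs are 16-byte aligned */
    return td_addr | (carry ? ED_C : 0u) | (halted ? ED_H : 0u);
}

/* Clear toggleCarry, leaving the TD pointer and the halt bit intact.
 * This mirrors the effect of "hwHeadP &= ~ED_C" in host byte order. */
static uint32_t headp_clear_toggle(uint32_t headp)
{
    return headp & ~ED_C;
}

/* Strip the flag bits to recover the TD address. */
static uint32_t headp_td_address(uint32_t headp)
{
    return headp & ~0xFu;
}
```

The point of the mask is that only the toggle changes; the queue position held in the upper bits is untouched.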

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-08 Thread Alan Stern
On Mon, 7 May 2007, Mike Nuss wrote: > Alan Stern wrote: > > > > > > Could be the driver updated the pointers but did so at the wrong > time, > > > with the result that the controller overwrote them with older > values. > > > Or maybe just one of them was overwritten. Would that explain your > >

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-07 Thread Mike Nuss
Alan Stern wrote: > > > > Could be the driver updated the pointers but did so at the wrong time, > > with the result that the controller overwrote them with older values. > > Or maybe just one of them was overwritten. Would that explain your > > observations? It's probably nothing, but this line

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-07 Thread Mike Nuss
Alan Stern wrote: > > > > To enqueue a new TD, the HCD fills the old dummy TD and appends a new TD > > to the end of the list. It should then update TailP. It *looks* like the > > HCD appended a TD to the queue next but didn't update ed->dummy or > > TailP. > > Could be the driver updated the poin

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-05 Thread Alan Stern
On Fri, 4 May 2007, Mike Nuss wrote: > Further support (I dumped the 'dummy' td before freeing it). This is a > new trace so the addresses won't match the last trace. > > kernel: ohci_hcd :00:13.0: leak ed c3e8e680 (#82) state 0 (has tds) > kernel: ohci_hcd :00:13.0: ed->dummy td c3e8d8c0

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-05 Thread Alan Stern
On Fri, 4 May 2007, Mike Nuss wrote: > As I understand it, the HC will not do any queue processing while HeadP > and TailP are equal (which is the case here; they are both pointing to > dummy). This explains why nothing is happening - the HC will never do > anything more with this endpoint because

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-04 Thread Mike Nuss
Some further support below. > > > As I understand it, the HC will not do any queue processing while HeadP > and TailP are equal (which is the case here; they are both pointing to > dummy). This explains why nothing is happening - the HC will never do > anything more with this endpoint because it

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-04 Thread Mike Nuss
> To enqueue a new TD, the HCD fills the old dummy TD and appends a new TD > to the end of the list. It should then update TailP. It *looks* like the > HCD appended a TD to the queue next but didn't update ed->dummy or > TailP. (scratch the word "next" from the last sentence.) ---
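
The enqueue sequence described above, together with the rule mentioned elsewhere in the thread that the HC does no queue processing while HeadP and TailP are equal, can be sketched as a small user-space model. The struct and function names are hypothetical simplifications; real descriptors hold bus addresses and live in DMA memory, and this is not the ohci-hcd code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the OHCI TD and ED. */
struct td {
    int payload;        /* placeholder for buffer pointer, length, flags */
    struct td *next;    /* models the NextTD link */
};

struct ed {
    struct td *head;    /* models HeadP: next TD the controller will run */
    struct td *tail;    /* models TailP: always points at the dummy TD */
};

/* The controller treats the queue as empty whenever HeadP == TailP. */
static int ed_is_idle(const struct ed *ed)
{
    return ed->head == ed->tail;
}

/* Start with just the dummy, so head == tail and the ED is idle. */
static void ed_init(struct ed *ed, struct td *dummy)
{
    dummy->next = NULL;
    ed->head = ed->tail = dummy;
}

/* Enqueue: fill the old dummy, link a fresh dummy after it, and only
 * then advance the tail. If the last step is skipped, head == tail
 * still holds and the new work is never seen by the controller. */
static void ed_enqueue(struct ed *ed, struct td *new_dummy, int payload)
{
    struct td *old_dummy = ed->tail;

    assert(new_dummy != old_dummy);
    new_dummy->next = NULL;
    old_dummy->payload = payload;
    old_dummy->next = new_dummy;
    ed->tail = new_dummy;
}

/* Model of the controller retiring one TD: advance head past it. */
static struct td *ed_retire_one(struct ed *ed)
{
    struct td *done;

    if (ed_is_idle(ed))
        return NULL;
    done = ed->head;
    ed->head = done->next;
    return done;
}
```

With this model, an ED whose tail was never advanced after an append looks exactly like an idle one, which matches the symptom described: a TD hanging off the dummy that the controller never touches.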

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-04 Thread Mike Nuss
David Brownell wrote: > >> So, trying for some (bad) ASCII art here > >> > >> TDs 1-4 submitted to ED, > >> HC completed a few (say, 1 & 2) > >> > >> ed.tail ---+ > >> ed.head --+I > >> II

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-02 Thread Mike Nuss
Mike Nuss wrote: > David Brownell wrote: >> So, trying for some (bad) ASCII art here >> >> TDs 1-4 submitted to ED, >> HC completed a few (say, 1 & 2) >> >> ed.tail ---+ >> ed.head --+I >> I

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-05-02 Thread Mike Nuss
David Brownell wrote: > So, trying for some (bad) ASCII art here > > TDs 1-4 submitted to ED, > HC completed a few (say, 1 & 2) > > ed.tail ---+ > ed.head --+I > II >

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-30 Thread David Brownell
On Monday 30 April 2007, Mike Nuss wrote: > Alan Stern wrote: > > > I think it's pretty clear there's an underlying hardware issue, but > > > maybe there is a workaround. I added the additional debug code you > > > suggested to get a better idea of what finish_unlinks() is doing (or > not > > > do

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-30 Thread Mike Nuss
David Brownell wrote: > > Yeah. Well, at some level once you notice that INTR_SF lossage, > I think the only possible recovery is to reset the controller and > then restart everything from scratch (re-enumerate etc). Resetting > should be easy; so maybe all you'd need to do is call ohci_restart()

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-30 Thread Mike Nuss
Alan Stern wrote: > > I think it's pretty clear there's an underlying hardware issue, but > > maybe there is a workaround. I added the additional debug code you > > suggested to get a better idea of what finish_unlinks() is doing (or not > > doing). > > And the results are... ? finish_unlinks is

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-28 Thread David Brownell
On Saturday 28 April 2007, Alan Stern wrote: > On Fri, 27 Apr 2007, Mike Nuss wrote: > > > So for smaller values of HZ, it would retry for > > longer. Was this the intended behavior? > > You're right; for some reason I was thinking that the argument to > schedule_timeout_interruptib

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-28 Thread David Brownell
On Friday 27 April 2007, Mike Nuss wrote: > > In our environment I can reproduce it a couple of times a day by sending > a ton of data from our devices. A device will seem to stop sending data > (this is probably the start of the "bad condition" where INTR_SF never > happens), and if I then unplug

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-28 Thread Alan Stern
On Sat, 28 Apr 2007, Mike Nuss wrote: > > This has nothing to do with polling or bInterval. The host controller is > > supposed to generate an interrupt at the start of each frame (once per > > millisecond) provided the appropriate IRQ-enable flag is set. INTR_SF == > > INTeRrupt at Start of Fra

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-28 Thread Alan Stern
On Fri, 27 Apr 2007, Mike Nuss wrote: > On Friday 27 April 2007, David Brownell wrote: > > On Friday 27 April 2007, Mike Nuss wrote: > > > Sometimes upon removing one of our devices (for which we have a > custom > > USB driver), OHCI fails > > > to free all the associated resources with the devic

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-28 Thread Alan Stern
On Fri, 27 Apr 2007, Mike Nuss wrote: > > Look at ohci-hcd.c:ohci_endpoint_disable(). There's a > "switch(ed->state)" > > line; see what it does in the ED_UNLINK case. It retries for up to > 1000 > > ms, waiting for ed->state to change to ED_IDLE. That's where the "IRQ > > INTR_SF lossage" mess
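
The bounded wait described above can be illustrated with a user-space model. This is a sketch, not the ohci_endpoint_disable() code: the ed_model struct, the simulated tick, and the helper names are hypothetical. ED_UNLINK and ED_IDLE are the state names from the thread. The last helper assumes each retry sleeps for one scheduler tick (jiffy), in which case the wall-clock limit depends on HZ, as noted elsewhere in the thread.

```c
#include <assert.h>

enum ed_state { ED_IDLE, ED_UNLINK };

/* Hypothetical model of an ED that is waiting to be unlinked. */
struct ed_model {
    enum ed_state state;
    int ticks_until_idle;   /* 0 means the completion never arrives */
};

/* One simulated tick: the unlink finishes when the countdown hits 0. */
static void model_tick(struct ed_model *ed)
{
    if (ed->ticks_until_idle > 0 && --ed->ticks_until_idle == 0)
        ed->state = ED_IDLE;
}

/* Retry until the ED goes idle or the limit is exhausted.
 * Returns the number of retries used, or -1 on giving up (the point
 * at which the "IRQ INTR_SF lossage" message would be logged). */
static int wait_for_idle(struct ed_model *ed, int limit)
{
    int tries = 0;

    assert(limit >= 0);
    while (ed->state != ED_IDLE) {
        if (tries == limit)
            return -1;
        model_tick(ed);     /* stand-in for sleeping one tick */
        tries++;
    }
    return tries;
}

/* Worst-case wait in milliseconds if every retry sleeps one jiffy. */
static long worst_case_ms(int limit, int hz)
{
    return (long)limit * 1000L / hz;
}
```

Under that one-jiffy assumption, a limit of 1000 retries is 1000 ms only when HZ is 1000; at HZ=100 the same loop would wait about 10 seconds before giving up.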

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-27 Thread Mike Nuss
On Friday 27 April 2007, David Brownell wrote: > On Friday 27 April 2007, Mike Nuss wrote: > > Sometimes upon removing one of our devices (for which we have a custom > USB driver), OHCI fails > > to free all the associated resources with the device. The problem is > always associated > > with the

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-27 Thread David Brownell
On Friday 27 April 2007, Mike Nuss wrote: > Sometimes upon removing one of our devices (for which we have a custom USB > driver), OHCI fails > to free all the associated resources with the device. The problem is always > associated > with the "IRQ INTR_SF lossage" message, which I assume is proba

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-27 Thread Mike Nuss
> From: Alan Stern > > Please configure your email client to wrap lines after 75 columns or so. Sorry about that. > Look at ohci-hcd.c:ohci_endpoint_disable(). There's a "switch(ed->state)" > line; see what it does in the ED_UNLINK case. It retries for up to 1000 > ms

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-27 Thread Alan Stern
Please configure your email client to wrap lines after 75 columns or so. On Fri, 27 Apr 2007, Mike Nuss wrote: > Sometimes upon removing one of our devices (for which we have a custom > USB driver), OHCI fails to free all the associated resources with the > device. The problem is always associate

Re: [linux-usb-devel] OHCI hangs after failing to free resources

2007-04-27 Thread Mike Nuss
> Anyways, I realized that our disconnect() method was never getting called > in these cases, and it's because there is a lockup in > usb_hcd_endpoint_disable.  Our device has two endpoints (plus control) - > one for reads, and one for writes.  When we hit this condition, it always > hangs while di