Re: [linux-usb-devel] OHCI Power Management problems

Benjamin Herrenschmidt Fri, 22 Aug 2003 20:56:36 -0700

> This is with 2.6.0-test3-latest?  I don't think PM itself is
> currently expected to work ... much less any particular driver.


It is 2.6.0-test3-current-linus-bk ;) And yes, PM works for
PowerMacs in my tree and will work in Linus as soon as I have
pushed my pending driver updates. 

> There were a few small updates to that code last week.  But the
> last time I checked there were still PM bugs in the PCI code (Pavel
> had a not-ready-to-merge patch).  Plus, recent PM merges broke the
> APM support (it relies on the device_suspend routine, which no
> longer does anyting).  So every platform where I've been able
> to use power management is currently broken -- no fault of USB.

Patrick is currently merging a bunch of important PM updates that
should help for non-PPC platforms.
> 
> > The big problem with the current code is that it will still access
> > the chip after suspend. (Or during the suspend operation, it would
> > be better to properly stop all activities before writing to the
> > control register to suspend the chip).
> 
> But how does it access that?  There was a patch not long ago to
> prevent the root hub timers from doing anything during suspend.
> (I'm not sure that got merged.)  Other code _should_ be refusing
> to touch the hardware when it's suspended.

The root hub timer patch doesn't seem to have gotten in since I
saw calls to it (see my detailed comment below in my previous mail)
> 
> >  - The root hub timer. I hacked the get status function to return
> > 0 when ohci->sleeping is set, though that would need proper
> > synchronisation on SMP etc...That helped not locking up 100% of
> > the time during the suspend process, but I'm still having the
> > problem waking the machine up (memory corruption). I also
> > reject root hub control urbs
> 
> This sounds like what that root hub timer patch was doing.
> Actually the ohci->sleeping flag should go away; it's
> duplicating information in hcd->state.  (Same thing with
> the ohci->disabled flag.)

Ok. Looks like the root hub timer patch didn't get in then.

> >  - It seems that we don't accept to link urbs (ohci_urb_enqueue
> > returns with an error), however, ohci_urb_dequeue may do things
> > that end up touching the queue head registers. Those registers
> > should probably be shadowed some way.
> 
> I don't think shadowing will work very well, since when the
> controller comes out of suspend it will expect those registers
> to still be valid.  Better to prevent dequeue when suspended.

Well... what do we do if we get one ? We can't fail there,
can we ? We could point the chip to empty queues at suspend time,
that would allow us to shadow the real ones. Seems easier to me
than dealing with drivers or whatever other event causing a
dequeue... but that's maybe because I'm not too familiar with
the new driver.

> The last time this came up, Alan Stern suggested that USB
> drivers should unlink their URBs as part of suspend. 

I agree with him. But we can't expect all drivers to be fixed
and I prefer not crashing the host if a single driver is not
fixed...

> I'm not entirely sure I like that solution; we didn't need it in 2.4,
> so it "shouldn't" be needed now. 

2.4 worked sort of "magically" and wasn't very robust on PM if you
had devices plugged in.

> But there was also the
> comment that such changes should wait for some driver model
> changes to the suspend()/resume() calls.  (Which don't yet
> seem to have been merged.)

They have been now. Again, I beleive it's up to the device drivers
to stop sending URBs and reclaim pending ones during suspend (though
I don't think they need to dequeue endpoints). However, it's also up
to the OHCI driver to be robust enough to not let the chip die because
a driver didn't do that properly... I can help fixing at least some
individual driers in the future.

> >  - I haven't yet figured out other code path leading to touching
> > chip registers outside of the above, I'm not too sure about the
> > ED stuffs.
> 
> In normal operation, I'd only expect it from:  IRQs, the root hub
> timer, and urb enqueue/dequeue.  But things like checking the frame
> number (does any driver even use that call?) would too.

We can error it (or just return the last frame number) when suspending.
urb enqueue should fail imho (it could be implemented if we shadow
the queue registers but I prefer sending an error to a driver doing
the wrong thing), but doesn't it fail currently _after_ dealing
with the ED ? urb dequeue could be implemented with shadow registers
or be able to fail, but the later seem more complicated, don't you
think ?
> 
> > Any suggestion ? Right now, my fix to the root hub isn't enough
> > it seems but I haven't yet spotted what else is touching the chip.
> > 
> > The proper sequence of operations would probably be something like
> > (similar to what we have today) for S3 (and S2 if we ever use that)
> > 
> >  - Set ohci->sleeping
> 
> No, this duplicate state should vanish.

Ok.

> >  - Disable all IRQs but SF. From this point, no "link" operation is
> > accepted (URBs or EDs) and root hub stops polling. Unlinks are still
> > allowed
> >  - Wait a few frames so pending unlinks and done queue can be flushed
> 
> Implementing unlinks and processing completed TDs requires a few
> more IRQs enabled than that, though!  And all this shutdown should
> be managed through the driver model -- as much as necessary.
> 
> We could certainly arrange that urb submissions not be accepted
> for any device unless the driver model power_state allows.

Yup.
> 
> >  - Mark the chip as suspended. This could be setting ohci->sleeping
> > to 2 or using another flag. This one really means DO NOT TOUCH THE HW
> > again (we could have wrappers on read/writel catching bugs here
> 
> That's the intent of the current code, but it seems as if a root
> hub patch (usb/core/hcd.c) still needs merging.

Ok. Do you have this patch at hand so I can try it ?

> > eventually but that would be ugly). At this point, unlinks will be
> > done on shadow of the list head registers only. Make sure we sychronize
> > with a pending unlink on another CPU
> 
> We shouldn't need to modify the unlink logic.
> 
> >  - Really suspend the chip (write to control etc...)
> >  - Do the remaining bits ... (PCI PM mode, PowerMac specific shutdown
> > code etc....)
> > 
> > Do that seem correct to you ?
> 
> It doesn't seem like it uses the driver model suspend/resume logic,
> which is where a lot of this work should be done.  For example, all
> the children of the OHCI hardware (root hub, and devices connected
> to it) can do their cleanup in driver model suspend calls before
> the OHCI PM gets called ... and resumed after it comes back.

Sure, but the OHCI driver itself should be robust enough to deal
gracefully with incorrect drivers imho.

> > In the case of S4 (suspend-to-disk), I beleive we have no much choice
> > but doing a full unplug/replug of all devices though I'm not sure how
> > to trigger that.
> 
> That should be almost identical to D3cold resume.  Until the recent
> PM putbacks, that was working for me.  OHCI does have specific code
> already in place to notice that the controller comes back in reset,
> not suspend state -- and to cleanly disconnect all devices.  (Then,
> normal enumeration handles the rest.)

Ok.

Ben.


-------------------------------------------------------
This SF.net email is sponsored by: VM Ware
With VMware you can run multiple operating systems on a single machine.
WITHOUT REBOOTING! Mix Linux / Windows / Novell virtual machines
at the same time. Free trial click here:http://www.vmware.com/wl/offer/358/0
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Re: [linux-usb-devel] OHCI Power Management problems

Reply via email to