Re:[RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-28 Thread Dmitri Zolotov
~/linux-3.9.11$ grep -r TRSMRCY
drivers/usb/core/hub.c:/* TRSMRCY = 20 msec */
drivers/usb/core/hub.c:/* TRSMRCY = 10 msec */
drivers/usb/core/hcd.c: * unsuspended.  If they are then a TRSMRCY delay is needed
drivers/usb/core/hcd.c:usleep_range(1, 11000);/* TRSMRCY */
drivers/usb/host/isp1362-hcd.c:/* TRSMRCY */
drivers/usb/host/ohci-hub.c:/* TRSMRCY */
Do these files contain the same bug?

Re: Re:[RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-28 Thread Alan Stern
On Wed, 28 Aug 2013, Dmitri Zolotov wrote:

 ~/linux-3.9.11$ grep -r TRSMRCY
 drivers/usb/core/hub.c:/* TRSMRCY = 20 msec */

Where did that come from?  In the 3.9.11 kernel this says 10, not 20.

 drivers/usb/core/hub.c:/* TRSMRCY = 10 msec */
 drivers/usb/core/hcd.c: * unsuspended.  If they are then a TRSMRCY delay is needed
 drivers/usb/core/hcd.c:usleep_range(1, 11000);/* TRSMRCY */
 drivers/usb/host/isp1362-hcd.c:/* TRSMRCY */
 drivers/usb/host/ohci-hub.c:/* TRSMRCY */
 Do these files contain the same bug?

As far as I can see, none of them contain any bugs related to TRSMRCY.

Alan Stern

--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Sarah Sharp
On Thu, Aug 22, 2013 at 02:55:07PM -0700, Greg KH wrote:
 On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
  On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
   On Wed, 21 Aug 2013, Sarah Sharp wrote:
   
Possible fixes
--

The USB core obviously needs to be changed to check the port status
after the TRSMRCY timeout, and continue to wait if the port is still in
the resuming state.  I will have to study the EHCI port status diagrams
in detail to figure out how the USB core can do this.
   
   As far as EHCI is concerned, this is a non-problem.  The closest
   analogy to the RExit-U0 transition is in the description of the Force
   Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
   that the host controller must complete the transition to the high-speed
   idle state within 2 milliseconds of software setting the bit to a zero
   (which happens when the hub driver does its Get-Port-Status call).
   
   Thus, as soon as the TRSMRCY delay is finished, the device and the port
   are supposed to be ready.  In fact, the hardware doesn't provide any
   means of telling whether they are ready or not.
  
  Well, shoot, I thought I had solved world hunger, or at least USB power
  management issues. :)
  
  So basically it sounds like this is an xHCI specific issue, and probably
  not the root cause of the USB device disconnects we see under EHCI
  hosts.  I guess the xHCI hardware engineers just assumed software would
  always wait for the interrupt from the port status change event, rather
  than using a simple 10 ms timer.  I bet they didn't even realize that
  the transition took longer than 10ms, because Windows waited for
  the port status change event.

One last thought on this note:  We know Windows doesn't have high-res
timers, and Arjan says asking for a 10 ms delay will produce a delay
around 17 ms.  Since Linux is busy waiting, we may be communicating with
the device sooner than Windows does.

 Why can't Linux do the same thing, and not worry about any timeout at
 all?

We can for xHCI hosts, but not EHCI hosts.  EHCI hosts only send an
interrupt when the suspend change bit is set, which is when software
needs to start the 10ms timer.  It doesn't send an interrupt when that
10ms expires, unlike xHCI.

We could add a new xHCI driver call, to resume the port, which would
only return to the USB core when the port status change event occurs
after the 10 ms (or longer) delay has finished.

  Should I print a debugging message if the xHCI host exceeds 10ms?  It
  would be nice to let hardware engineers know they're out of spec.
 
 Sure, but the odds of any one of them enabling debugging, and then
 noticing this are slim to none.  But if it makes us feel better pointing
 out hardware bugs (I know it makes me feel good), please do so.

It will make it easier to convince xHCI host designers that they need to
improve their link training timing, so I'll leave it in.  Besides, it's
much more likely that people will have debugging enabled now that Xenia
has made the driver use dynamic debug rather than a config option. :)

Sarah Sharp


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Alan Stern
On Mon, 26 Aug 2013, Sarah Sharp wrote:

 One last thought on this note:  We know Windows doesn't have high-res
 timers, and Arjan says asking for a 10 ms delay will produce a delay
 around 17 ms.  Since Linux is busy waiting, we may be communicating with
 the device sooner than Windows does.
 
  Why can't Linux do the same thing, and not worry about any timeout at
  all?
 
 We can for xHCI hosts, but not EHCI hosts.  EHCI hosts only send an
 interrupt when the suspend change bit is set, which is when software
 needs to start the 10ms timer.  It doesn't send an interrupt when that
 10ms expires, unlike xHCI.
 
 We could add a new xHCI driver call, to resume the port, which would
 only return to the USB core when the port status change event occurs
 after the 10 ms (or longer) delay has finished.

This sounds a little confused.  The end of the TRSMRCY period isn't
signalled by an interrupt on any host controller, not even on xHCI.  
It's just a regular kernel timer.  We could increase it from 10 ms to
17 (or 20), but only at the cost of increasing resume latency for every
USB device.

Alan Stern



Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Sarah Sharp
On Fri, Aug 23, 2013 at 10:43:19AM -0400, Alan Stern wrote:
 On Wed, 21 Aug 2013, Sarah Sharp wrote:
 
  Background
  --
  
  The USB 2.0 specification, section 7.1.7.7, says that upon device remote
  wakeup signaling, the first active hub (which is often the roothub) must
  rebroadcast the resume signaling for at least 20 ms (TDRSMDN).  After
  that's done, the hub's suspend status change bit will be set, and system
  software must not access the device for at least 10 ms (TRSMRCY).
  
  It turns out that TRSMRCY is a *minimum*, not a *maximum*, according to
  Table 7-14.  That means the port can actually take longer than TRSMRCY
  to resume.  Any attempt to communicate with the device, or reset the
  device, will result in a USB device disconnect.
 
 By the way, I just noticed your Google+ posting about this.  I think 
 you (and perhaps the engineers you spoke with) may have misunderstood 
 what Table 7-14 means when it lists 10 ms as the _minimum_ value for 
 TRSMRCY.
 
 This delay value is a requirement on the OS.  The host system must not
 access the device until at least 10 ms after the resume is complete.  
 The system can wait longer than that if it wants -- that's why 10 ms is
 a minimum.  It just has to avoid accessing the device sooner.
 
 A _minimum_ value on the host side translates into a _maximum_ value on 
 the device side.  The device can safely assume that it can spend up to 
 10 ms getting back into shape after a resume, but no more.  After 10 
 ms, the host may try to communicate with it.

After re-reading the spec, I agree with your analysis.  However, the
fact that chipset designers misinterpreted the spec means there may be
hardware out there that needs a longer timeout.  The spec should have
been normative on both the software and the hardware, saying something
like:

The USB System Software must provide a 10 ms resume recovery time
(TRSMRCY) during which it will not attempt to access any device
connected to the affected (just-activated) bus segment.  The host
controller and device must be ready for communication after the resume
recovery time (TRSMRCY) expires.

I have heard reports of USB devices disconnecting from the bus and
reconnecting after remote wakeup.  I've personally experienced this with
one of my PL2303 USB serial adapters, although it has since died, so I
can't retest.

Another company (whose email I ironically lost due to a failed transfer
to a USB 3.0 backup drive) had bus traces showing the root cause of a
disconnect on resume from remote wakeup.  Occasionally, the host
controller was sending the SoFs too soon on resume, and the device would
interpret it as a low-speed chirp.  The device would disconnect, and
transform from a high speed device to a low speed device.  I don't
think increasing the 10 ms time out will help at all in this case, but
you did ask what USB device disconnect scenarios I've seen.

If users do see device disconnects on remote wakeup resume, we should
see if increasing the timeout helps.

  Then, when the USB core calls into get port status, it transitions the
  port from the Resume state to the RExit state by changing the port link
  state to U0.  The xHCI driver will get a port status change event when
  that transition is complete, but that port status change event is
  currently ignored.
 
 The excess delay you observe with xHCI is the time spent in the RExit
 substate?  That probably should not be counted as part of the TRSMRCY
 period.  It's hard to say for certain, because TRSMRCY is described
 only in the USB-2 spec and not in the xHCI spec, and vice versa for
 RExit.  Still, it's reasonable to assume that the TRSMRCY period should
 begin when the port changes back to U0, not when it leaves the RESUME
 state and enters RExit.
 
 So in the end this appears to be a simple bug in xhci-hcd.  The
 Get-Port-Status request that terminates the resume signalling should
 wait until the port goes back into U0 (which agrees with what you have
 already decided, of course).  ehci-hcd does something similar:
 
   /* stop resume signaling */
   temp &= ~(PORT_RWC_BITS | PORT_SUSPEND | PORT_RESUME);
   ehci_writel(ehci, temp, status_reg);
   clear_bit(wIndex, &ehci->resuming_ports);
   retval = ehci_handshake(ehci, status_reg,
   PORT_RESUME, 0, 2000 /* 2msec */);
 
 The ehci_handshake call busy-waits until the controller turns off the
 PORT_RESUME bit, which happens when the port has switched to a
 high-speed idle.  It's supposed to take no more than 2 ms but hopefully
 is a lot faster.  (Hmmm, maybe the private lock should be dropped
 during this handshake...)

Ah, so there is an analogous issue in EHCI.  Basically, the EHCI driver
waiting for the PORT_RESUME bit to be clear is equivalent to the xHCI
driver waiting for the port to enter U0.  I agree that this seems like
an xHCI driver issue, and I'll fix it in the driver.

Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Alan Stern
On Mon, 26 Aug 2013, Sarah Sharp wrote:

  By the way, I just noticed your Google+ posting about this.  I think 
  you (and perhaps the engineers you spoke with) may have misunderstood 
  what Table 7-14 means when it lists 10 ms as the _minimum_ value for 
  TRSMRCY.
  
  This delay value is a requirement on the OS.  The host system must not
  access the device until at least 10 ms after the resume is complete.  
  The system can wait longer than that if it wants -- that's why 10 ms is
  a minimum.  It just has to avoid accessing the device sooner.
  
  A _minimum_ value on the host side translates into a _maximum_ value on 
  the device side.  The device can safely assume that it can spend up to 
  10 ms getting back into shape after a resume, but no more.  After 10 
  ms, the host may try to communicate with it.
 
 After re-reading the spec, I agree with your analysis.  However, the
 fact that chipset designers misinterpreted the spec means there may be
 hardware out there that needs a longer timeout.  The spec should have
 been normative on both the software and the hardware, saying something
 like:
 
 The USB System Software must provide a 10 ms resume recovery time
 (TRSMRCY) during which it will not attempt to access any device
 connected to the affected (just-activated) bus segment.  The host
 controller and device must be ready for communication after the resume
 recovery time (TRSMRCY) expires.

See, that's not quite right.  The host controller has to be ready for
communication long before TRSMRCY expires.  In fact, the host
controller has to start sending SOF packets within 3 ms of the end of
the resume signalling (to prevent the device from going back into
suspend).

It _would_ be correct to add that the device must be ready for
communication after TRSMRCY expires, but this should be understood
implicitly.  I think the spec is sufficiently clear in this regard.  
And it's not obvious (to me at least) that the chipset designers
misinterpreted it -- but then, I haven't talked to them directly.

 I have heard reports of USB devices disconnecting from the bus and
 reconnecting after remote wakeup.  I've personally experienced this with
 one of my PL2303 USB serial adapters, although it has since died, so I
 can't retest.

Such behavior wouldn't be surprising in the least.  It's the sort of 
thing that caused us to add the RESET_RESUME quirk.

 Another company (whose email I ironically lost due to a failed transfer
 to a USB 3.0 backup drive) had bus traces showing the root cause of a
 disconnect on resume from remote wakeup.  Occasionally, the host
 controller was sending the SoFs too soon on resume, and the device would
 interpret it as a low-speed chirp.

?  There is no such thing as a low-speed chirp.  Did you mean a 
high-speed chirp?

  The device would disconnect, and
 transform from a high speed device to a low speed device.  I don't
 think increasing the 10 ms time out will help at all in this case, but
 you did ask what USB device disconnect scenarios I've seen.

Yeah, that's a bad one.  There's not much we can do about the HC 
sending SOF packets too soon.

 If users do see device disconnects on remote wakeup resume, we should
 see if increasing the timeout helps.

Certainly.

 Ah, so there is an analogous issue in EHCI.  Basically, the EHCI driver
 waiting for the PORT_RESUME bit to be clear is equivalent to the xHCI
 driver waiting for the port to enter U0.  I agree that this seems like
 an xHCI driver issue, and I'll fix it in the driver.

Okay.  It may turn out that some devices still require TRSMRCY to be
longer than 10 ms, even after your fix.  Such behavior wouldn't be
specific to xHCI, though; it would apply equally well to EHCI
connections.

Alan Stern



Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Greg KH
On Mon, Aug 26, 2013 at 10:45:30AM -0700, Sarah Sharp wrote:
 On Thu, Aug 22, 2013 at 02:55:07PM -0700, Greg KH wrote:
  On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
   On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
On Wed, 21 Aug 2013, Sarah Sharp wrote:

 Possible fixes
 --
 
 The USB core obviously needs to be changed to check the port status
 after the TRSMRCY timeout, and continue to wait if the port is still in
 the resuming state.  I will have to study the EHCI port status diagrams
 in detail to figure out how the USB core can do this.

As far as EHCI is concerned, this is a non-problem.  The closest
analogy to the RExit-U0 transition is in the description of the Force
Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
that the host controller must complete the transition to the high-speed
idle state within 2 milliseconds of software setting the bit to a zero
(which happens when the hub driver does its Get-Port-Status call).

Thus, as soon as the TRSMRCY delay is finished, the device and the port
are supposed to be ready.  In fact, the hardware doesn't provide any
means of telling whether they are ready or not.
   
   Well, shoot, I thought I had solved world hunger, or at least USB power
   management issues. :)
   
   So basically it sounds like this is an xHCI specific issue, and probably
   not the root cause of the USB device disconnects we see under EHCI
   hosts.  I guess the xHCI hardware engineers just assumed software would
   always wait for the interrupt from the port status change event, rather
   than using a simple 10 ms timer.  I bet they didn't even realize that
   the transition took longer than 10ms, because Windows waited for
   the port status change event.
 
 One last thought on this note:  We know Windows doesn't have high-res
 timers, and Arjan says asking for a 10 ms delay will produce a delay
 around 17 ms.  Since Linux is busy waiting, we may be communicating with
 the device sooner than Windows does.

What does Windows do for this type of thing?  How long do they wait
before reporting an error?  We probably need to do the same as they do,
no matter what the spec says.

thanks,

greg k-h


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Sarah Sharp
On Mon, Aug 26, 2013 at 03:07:25PM -0700, Greg KH wrote:
 On Mon, Aug 26, 2013 at 10:45:30AM -0700, Sarah Sharp wrote:
  
  One last thought on this note:  We know Windows doesn't have high-res
  timers, and Arjan says asking for a 10 ms delay will produce a delay
  around 17 ms.  Since Linux is busy waiting, we may be communicating with
  the device sooner than Windows does.
 
 What does Windows do for this type of thing?  How long do they wait
 before reporting an error?  We probably need to do the same as they do,
 no matter what the spec says.

I double checked with the Intel Windows devs, and they say that Arjan is
mistaken, and they'll get around 11 ms, just like we do.

They say:

The EHCI driver stack applies the 10 ms starting at the point that it
sees C_PORT_SUSPEND asserted for the port.

The inaccuracy in Windows 8 is generally no more than 1 system clock
period late, but it can be delayed by interrupt processing for other
devices. I would expect the additional average delay to be no more than
1 ms.

They didn't mention waiting for the PORT_RESUME bit to be cleared, like
the Linux EHCI driver does.  Alan, would you like me to ask them about
that?

Sarah Sharp


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-26 Thread Alan Stern
On Mon, 26 Aug 2013, Sarah Sharp wrote:

 I double checked with the Intel Windows devs, and they say that Arjan is
 mistaken, and they'll get around 11 ms, just like we do.
 
 They say:
 
 The EHCI driver stack applies the 10 ms starting at the point that it
 sees C_PORT_SUSPEND asserted for the port.
 
 The inaccuracy in Windows 8 is generally no more than 1 system clock
 period late, but it can be delayed by interrupt processing for other
 devices. I would expect the additional average delay to be no more than
 1 ms.
 
 They didn't mention waiting for the PORT_RESUME bit to be cleared, like
 the Linux EHCI driver does.  Alan, would you like me to ask them about
 that?

Yes, please.  You might also ask them what they meant by "it sees
C_PORT_SUSPEND asserted for the port", since there is no C_PORT_SUSPEND
bit in the EHCI Port Status and Control registers.  Were they talking 
about external hubs instead of the root hub?

Alan Stern



Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-23 Thread Alan Stern
On Wed, 21 Aug 2013, Sarah Sharp wrote:

 Background
 --
 
 The USB 2.0 specification, section 7.1.7.7, says that upon device remote
 wakeup signaling, the first active hub (which is often the roothub) must
 rebroadcast the resume signaling for at least 20 ms (TDRSMDN).  After
 that's done, the hub's suspend status change bit will be set, and system
 software must not access the device for at least 10 ms (TRSMRCY).
 
 It turns out that TRSMRCY is a *minimum*, not a *maximum*, according to
 Table 7-14.  That means the port can actually take longer than TRSMRCY
 to resume.  Any attempt to communicate with the device, or reset the
 device, will result in a USB device disconnect.

By the way, I just noticed your Google+ posting about this.  I think 
you (and perhaps the engineers you spoke with) may have misunderstood 
what Table 7-14 means when it lists 10 ms as the _minimum_ value for 
TRSMRCY.

This delay value is a requirement on the OS.  The host system must not
access the device until at least 10 ms after the resume is complete.  
The system can wait longer than that if it wants -- that's why 10 ms is
a minimum.  It just has to avoid accessing the device sooner.

A _minimum_ value on the host side translates into a _maximum_ value on 
the device side.  The device can safely assume that it can spend up to 
10 ms getting back into shape after a resume, but no more.  After 10 
ms, the host may try to communicate with it.
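
To make the minimum/maximum point concrete, here is a minimal sketch in plain user-space C.  It is not kernel code; the function names and the TRSMRCY_MIN_US constant are made up for illustration.  It only encodes the rule stated above: 10 ms is a lower bound on how long the host waits and, as a consequence, an upper bound on how long the device may take to recover.

```c
#include <assert.h>
#include <stdbool.h>

/* TRSMRCY from USB 2.0 Table 7-14: a 10 ms minimum, here in microseconds. */
#define TRSMRCY_MIN_US 10000u

/* Host side: access is allowed only once at least TRSMRCY has elapsed
 * since the end of resume signaling.  Waiting longer is always permitted. */
bool host_may_access(unsigned int elapsed_us)
{
	return elapsed_us >= TRSMRCY_MIN_US;
}

/* Device side: the same number acts as an upper bound on recovery time,
 * because the host is free to start talking as soon as TRSMRCY is over. */
bool device_recovery_in_spec(unsigned int recovery_us)
{
	return recovery_us <= TRSMRCY_MIN_US;
}

/* Hypothetical helper: a resume goes well when the host does not touch
 * the device before the device has finished recovering. */
bool resume_is_safe(unsigned int host_wait_us, unsigned int recovery_us)
{
	assert(host_may_access(host_wait_us)); /* caller must honor the minimum */
	return host_wait_us >= recovery_us;
}
```

With these definitions, a host that waits 11 ms is safe with any in-spec device, while a device needing 12 ms can fail against a host that waits exactly 10 ms, even though that host is following the spec.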

 Then, when the USB core calls into get port status, it transitions the
 port from the Resume state to the RExit state by changing the port link
 state to U0.  The xHCI driver will get a port status change event when
 that transition is complete, but that port status change event is
 currently ignored.

The excess delay you observe with xHCI is the time spent in the RExit
substate?  That probably should not be counted as part of the TRSMRCY
period.  It's hard to say for certain, because TRSMRCY is described
only in the USB-2 spec and not in the xHCI spec, and vice versa for
RExit.  Still, it's reasonable to assume that the TRSMRCY period should
begin when the port changes back to U0, not when it leaves the RESUME
state and enters RExit.

So in the end this appears to be a simple bug in xhci-hcd.  The
Get-Port-Status request that terminates the resume signalling should
wait until the port goes back into U0 (which agrees with what you have
already decided, of course).  ehci-hcd does something similar:

	/* stop resume signaling */
	temp &= ~(PORT_RWC_BITS | PORT_SUSPEND | PORT_RESUME);
	ehci_writel(ehci, temp, status_reg);
	clear_bit(wIndex, &ehci->resuming_ports);
	retval = ehci_handshake(ehci, status_reg,
			PORT_RESUME, 0, 2000 /* 2msec */);

The ehci_handshake call busy-waits until the controller turns off the
PORT_RESUME bit, which happens when the port has switched to a
high-speed idle.  It's supposed to take no more than 2 ms but hopefully
is a lot faster.  (Hmmm, maybe the private lock should be dropped
during this handshake...)
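
For readers who have not looked at the handshake helper, the polling pattern described above can be sketched in user-space C roughly as follows.  This is an illustration under stated assumptions, not the ehci-hcd implementation: struct sim_port, sim_read() and sim_handshake() are invented names, the register is simulated in memory, and each poll is simply counted as one microsecond instead of calling udelay().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 6 of the EHCI PORTSC register is Force Port Resume. */
#define PORT_RESUME (1u << 6)

/* Hypothetical stand-in for a memory-mapped port register.  Software has
 * already written PORT_RESUME to zero; the simulated controller reports
 * the bit as clear after polls_until_clear further reads. */
struct sim_port {
	uint32_t portsc;
	unsigned int polls_until_clear;
};

uint32_t sim_read(struct sim_port *p)
{
	if (p->polls_until_clear == 0)
		p->portsc &= ~PORT_RESUME;
	else
		p->polls_until_clear--;
	return p->portsc;
}

/* Poll until (reg & mask) == done or the time budget runs out.
 * Each failed poll costs 1 usec of the budget.
 * Returns 0 on success and -1 on timeout. */
int sim_handshake(struct sim_port *p, uint32_t mask, uint32_t done,
		  unsigned int timeout_us)
{
	assert(p != NULL);
	for (;;) {
		if ((sim_read(p) & mask) == done)
			return 0;
		if (timeout_us == 0)
			return -1;
		timeout_us--;
	}
}
```

Calling sim_handshake(&port, PORT_RESUME, 0, 2000) mirrors the arguments in the snippet above: wait for the PORT_RESUME bit to read back as zero, and give up after a 2 ms budget.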

Alan Stern



Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-22 Thread Alan Stern
On Wed, 21 Aug 2013, Sarah Sharp wrote:

 Possible fixes
 --
 
 The USB core obviously needs to be changed to check the port status
 after the TRSMRCY timeout, and continue to wait if the port is still in
 the resuming state.  I will have to study the EHCI port status diagrams
 in detail to figure out how the USB core can do this.

As far as EHCI is concerned, this is a non-problem.  The closest
analogy to the RExit-U0 transition is in the description of the Force
Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
that the host controller must complete the transition to the high-speed
idle state within 2 milliseconds of software setting the bit to a zero
(which happens when the hub driver does its Get-Port-Status call).

Thus, as soon as the TRSMRCY delay is finished, the device and the port
are supposed to be ready.  In fact, the hardware doesn't provide any
means of telling whether they are ready or not.

  I can easily do
 this without the USB core being involved, by changing the xHCI driver to
 either:
 
 1. Busy wait with xhci_handshake() in the xHCI get port status until
the port is in U0.
 
 2. Add a completion per xHCI port.  In xHCI get port status, initiate
U0 entry, and wait on the port's completion for up to 20 ms.  In the
port status change event code, complete that port's completion when
the port is in U0 and the bus_state->resuming_ports bit is set.

I would expect either of those to be adequate.
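
The bookkeeping behind option 2 can be modelled in a few lines of user-space C.  This is only a single-threaded sketch under assumptions: struct sim_xhci_port, begin_resume() and port_status_change() are hypothetical names, and a plain flag stands in for a kernel completion, so nothing here actually sleeps or times out.  The Port Link State encoding (PORTSC bits 8:5, U0 = 0, Resume = 15) is taken from the xHCI register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Port Link State field of the xHCI PORTSC register (bits 8:5). */
#define PLS_MASK   (0xfu << 5)
#define PLS_U0     (0x0u << 5)
#define PLS_RESUME (0xfu << 5)

/* Hypothetical per-port bookkeeping. */
struct sim_xhci_port {
	bool resuming;   /* models this port's bit in resuming_ports */
	bool rexit_done; /* models the per-port completion */
};

/* Called when a resume starts on the port: mark it as resuming and
 * re-arm the completion flag. */
void begin_resume(struct sim_xhci_port *port)
{
	assert(port != NULL);
	port->resuming = true;
	port->rexit_done = false;
}

/* Port status change event path: signal completion only when the port
 * has reached U0 and a resume is actually in progress on it. */
void port_status_change(struct sim_xhci_port *port, uint32_t portsc)
{
	assert(port != NULL);
	if (port->resuming && (portsc & PLS_MASK) == PLS_U0)
		port->rexit_done = true;
}
```

In a real driver the get-port-status path would block on the completion with a timeout (20 ms in the proposal above) rather than reading a flag.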

Alan Stern



Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-22 Thread Sarah Sharp
On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
 On Wed, 21 Aug 2013, Sarah Sharp wrote:
 
  Possible fixes
  --
  
  The USB core obviously needs to be changed to check the port status
  after the TRSMRCY timeout, and continue to wait if the port is still in
  the resuming state.  I will have to study the EHCI port status diagrams
  in detail to figure out how the USB core can do this.
 
 As far as EHCI is concerned, this is a non-problem.  The closest
 analogy to the RExit-U0 transition is in the description of the Force
 Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
 that the host controller must complete the transition to the high-speed
 idle state within 2 milliseconds of software setting the bit to a zero
 (which happens when the hub driver does its Get-Port-Status call).
 
 Thus, as soon as the TRSMRCY delay is finished, the device and the port
 are supposed to be ready.  In fact, the hardware doesn't provide any
 means of telling whether they are ready or not.

Well, shoot, I thought I had solved world hunger, or at least USB power
management issues. :)

So basically it sounds like this is an xHCI specific issue, and probably
not the root cause of the USB device disconnects we see under EHCI
hosts.  I guess the xHCI hardware engineers just assumed software would
always wait for the interrupt from the port status change event, rather
than using a simple 10 ms timer.  I bet they didn't even realize that
the transition took longer than 10ms, because Windows waited for
the port status change event.

   I can easily do
  this without the USB core being involved, by changing the xHCI driver to
  either:
  
  1. Busy wait with xhci_handshake() in the xHCI get port status until
 the port is in U0.
  
  2. Add a completion per xHCI port.  In xHCI get port status, initiate
 U0 entry, and wait on the port's completion for up to 20 ms.  In the
 port status change event code, complete that port's completion when
 the port is in U0 and the bus_state->resuming_ports bit is set.
 
 I would expect either of those to be adequate.

All right, I think I'll do the busy wait, since 71% of the time it should
return immediately.  Completions are overkill here.

Should I print a debugging message if the xHCI host exceeds 10ms?  It
would be nice to let hardware engineers know they're out of spec.

Sarah Sharp


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-22 Thread Greg KH
On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
 On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
  On Wed, 21 Aug 2013, Sarah Sharp wrote:
  
   Possible fixes
   --
   
   The USB core obviously needs to be changed to check the port status
   after the TRSMRCY timeout, and continue to wait if the port is still in
   the resuming state.  I will have to study the EHCI port status diagrams
   in detail to figure out how the USB core can do this.
  
  As far as EHCI is concerned, this is a non-problem.  The closest
  analogy to the RExit-U0 transition is in the description of the Force
  Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
  that the host controller must complete the transition to the high-speed
  idle state within 2 milliseconds of software setting the bit to a zero
  (which happens when the hub driver does its Get-Port-Status call).
  
  Thus, as soon as the TRSMRCY delay is finished, the device and the port
  are supposed to be ready.  In fact, the hardware doesn't provide any
  means of telling whether they are ready or not.
 
 Well, shoot, I thought I had solved world hunger, or at least USB power
 management issues. :)
 
 So basically it sounds like this is an xHCI specific issue, and probably
 not the root cause of the USB device disconnects we see under EHCI
 hosts.  I guess the xHCI hardware engineers just assumed software would
 always wait for the interrupt from the port status change event, rather
 than using a simple 10 ms timer.  I bet they didn't even realize that
 the transition took longer than 10ms, because Windows waited for
 the port status change event.

Why can't Linux do the same thing, and not worry about any timeout at
all?

 Should I print a debugging message if the xHCI host exceeds 10ms?  It
 would be nice to let hardware engineers know they're out of spec.

Sure, but the odds of any one of them enabling debugging, and then
noticing this are slim to none.  But if it makes us feel better pointing
out hardware bugs (I know it makes me feel good), please do so.

thanks,

greg k-h


Re: [RFT RFC] USB: Fix USB device disconnects on resume.

2013-08-22 Thread Alan Stern
On Thu, 22 Aug 2013, Sarah Sharp wrote:

  As far as EHCI is concerned, this is a non-problem.  The closest
  analogy to the RExit-U0 transition is in the description of the Force
  Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says
  that the host controller must complete the transition to the high-speed
  idle state within 2 milliseconds of software setting the bit to a zero
  (which happens when the hub driver does its Get-Port-Status call).
  
  Thus, as soon as the TRSMRCY delay is finished, the device and the port
  are supposed to be ready.  In fact, the hardware doesn't provide any
  means of telling whether they are ready or not.
 
 Well, shoot, I thought I had solved world hunger, or at least USB power
 management issues. :)
 
 So basically it sounds like this is an xHCI specific issue, and probably
 not the root cause of the USB device disconnects we see under EHCI
 hosts.

Probably not.  After all, USB-2 does not have the elaborate link
training and other link management features that USB-3 does.

What device disconnects do you see under EHCI?  I don't recall hearing 
about them.

  I guess the xHCI hardware engineers just assumed software would
 always wait for the interrupt from the port status change event, rather
 than using a simple 10 ms timer.  I bet they didn't even realize that
 the transition took longer than 10ms, because Windows waited for
 the port status change event.

What do you mean?  EHCI doesn't have any port-status change event at
the end of the 10-ms TRSMRCY delay.  The only port-status change event
in the hardware is right at the beginning, when the resume is
initiated.  The same is true of UHCI, but OHCI is different.

I can easily do
   this without the USB core being involved, by changing the xHCI driver to
   either:
   
   1. Busy wait with xhci_handshake() in the xHCI get port status until
  the port is in U0.
   
   2. Add a completion per xHCI port.  In xHCI get port status, initiate
  U0 entry, and wait on the port's completion for up to 20 ms.  In the
  port status change event code, complete that port's completion when
  the port is in U0 and the bus_state->resuming_ports bit is set.
  
  I would expect either of those to be adequate.
 
 All right, I think I'll do the busy wait, since 71% of the time it should
 return immediately.  Completions are overkill here.

Although that will be adequate, it may not be ideal.  You'll still 
get a 10-ms TRSMRCY delay following the busy wait.  Unless you change 
the hub driver to eliminate that delay for xHCI host controllers.
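
The arithmetic behind that remark, as a small user-space C sketch.  The function names are hypothetical, and the second function describes a possible hub-driver change, not existing behavior.

```c
#include <assert.h>

#define TRSMRCY_US 10000u /* 10 ms recovery delay applied by the hub driver */

/* Time from the end of resume signaling to the first device access when
 * the hub driver still applies TRSMRCY after the busy wait for U0. */
unsigned int latency_with_delay_us(unsigned int rexit_us)
{
	return rexit_us + TRSMRCY_US;
}

/* Hypothetical alternative: the hub driver skips TRSMRCY for xHCI ports,
 * so only the time spent waiting for U0 remains. */
unsigned int latency_without_delay_us(unsigned int rexit_us)
{
	assert(rexit_us <= 20000u); /* the proposed wait is capped at 20 ms */
	return rexit_us;
}
```

For example, a port that spends 12 ms getting back to U0 would see 22 ms before the first access with the delay in place, and 12 ms without it.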

 Should I print a debugging message if the xHCI host exceeds 10ms?  It
 would be nice to let hardware engineers know they're out of spec.

It can't hurt.

Alan Stern
