Re:[RFT RFC] USB: Fix USB device disconnects on resume.
~/linux-3.9.11$ grep -r TRSMRCY
drivers/usb/core/hub.c:/* TRSMRCY = 20 msec */
drivers/usb/core/hub.c:/* TRSMRCY = 10 msec */
drivers/usb/core/hcd.c: * unsuspended. If they are then a TRSMRCY delay is needed
drivers/usb/core/hcd.c:usleep_range(1, 11000);/* TRSMRCY */
drivers/usb/host/isp1362-hcd.c:/* TRSMRCY */
drivers/usb/host/ohci-hub.c:/* TRSMRCY */

Do these files contain the same bug?
Re: Re:[RFT RFC] USB: Fix USB device disconnects on resume.
On Wed, 28 Aug 2013, Dmitri Zolotov wrote:

> ~/linux-3.9.11$ grep -r TRSMRCY
> drivers/usb/core/hub.c:/* TRSMRCY = 20 msec */

Where did that come from? In the 3.9.11 kernel this says 10, not 20.

> drivers/usb/core/hub.c:/* TRSMRCY = 10 msec */
> drivers/usb/core/hcd.c: * unsuspended. If they are then a TRSMRCY delay is needed
> drivers/usb/core/hcd.c:usleep_range(1, 11000);/* TRSMRCY */
> drivers/usb/host/isp1362-hcd.c:/* TRSMRCY */
> drivers/usb/host/ohci-hub.c:/* TRSMRCY */
>
> Do these files contain the same bug?

As far as I can see, none of them contain any bugs related to TRSMRCY.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Thu, Aug 22, 2013 at 02:55:07PM -0700, Greg KH wrote:
> On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
> > On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
> > > On Wed, 21 Aug 2013, Sarah Sharp wrote:
> > >
> > > > Possible fixes
> > > > --------------
> > > >
> > > > The USB core obviously needs to be changed to check the port status after the TRSMRCY timeout, and continue to wait if the port is still in the resuming state. I will have to study the EHCI port status diagrams in detail to figure out how the USB core can do this.
> > >
> > > As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).
> > >
> > > Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.
> >
> > Well, shoot, I thought I had solved world hunger, or at least USB power management issues. :) So basically it sounds like this is an xHCI specific issue, and probably not the root cause of the USB device disconnects we see under EHCI hosts.
> >
> > I guess the xHCI hardware engineers just assumed software would always wait for the interrupt from the port status change event, rather than using a simple 10 ms timer. I bet they didn't even realize that the transition took longer than 10ms, because Windows waited for the port status change event.

One last thought on this note: We know Windows doesn't have high-res timers, and Arjan says asking for a 10 ms delay will produce a delay around 17 ms. Since Linux is busy waiting, we may be communicating with the device sooner than Windows does.

> Why can't Linux do the same thing, and not worry about any timeout at all?

We can for xHCI hosts, but not EHCI hosts. EHCI hosts only send an interrupt when the suspend change bit is set, which is when software needs to start the 10ms timer. It doesn't send an interrupt when that 10ms expires, unlike xHCI.

We could add a new xHCI driver call, to resume the port, which would only return to the USB core when the port status change event occurs after the 10 ms (or longer) delay has finished.

> > Should I print a debugging message if the xHCI host exceeds 10ms? It would be nice to let hardware engineers know they're out of spec.
>
> Sure, but the odds of any one of them enabling debugging, and then noticing this are slim to none. But if it makes us feel better pointing out hardware bugs (I know it makes me feel good), please do so.

It will make it easier to convince xHCI host designers that they need to improve their link training timing, so I'll leave it in. Besides, it's much more likely that people will have debugging enabled now that Xenia has made the driver use dynamic debug rather than a config option. :)

Sarah Sharp
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Mon, 26 Aug 2013, Sarah Sharp wrote:

> One last thought on this note: We know Windows doesn't have high-res timers, and Arjan says asking for a 10 ms delay will produce a delay around 17 ms. Since Linux is busy waiting, we may be communicating with the device sooner than Windows does.
>
> > Why can't Linux do the same thing, and not worry about any timeout at all?
>
> We can for xHCI hosts, but not EHCI hosts. EHCI hosts only send an interrupt when the suspend change bit is set, which is when software needs to start the 10ms timer. It doesn't send an interrupt when that 10ms expires, unlike xHCI.
>
> We could add a new xHCI driver call, to resume the port, which would only return to the USB core when the port status change event occurs after the 10 ms (or longer) delay has finished.

This sounds a little confused. The end of the TRSMRCY period isn't signalled by an interrupt on any host controller, not even on xHCI. It's just a regular kernel timer. We could increase it from 10 ms to 17 (or 20), but only at the cost of increasing resume latency for every USB device.

Alan Stern
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Fri, Aug 23, 2013 at 10:43:19AM -0400, Alan Stern wrote:
> On Wed, 21 Aug 2013, Sarah Sharp wrote:
>
> > Background
> > ----------
> >
> > The USB 2.0 specification, section 7.1.7.7, says that upon device remote wakeup signaling, the first active hub (which is often the roothub) must rebroadcast the resume signaling for at least 20 ms (TDRSMDN). After that's done, the hub's suspend status change bit will be set, and system software must not access the device for at least 10 ms (TRSMRCY).
> >
> > It turns out that TRSMRCY is a *minimum*, not a *maximum*, according to Table 7-14. That means the port can actually take longer than TRSMRCY to resume. Any attempt to communicate with the device, or reset the device, will result in a USB device disconnect.
>
> By the way, I just noticed your Google+ posting about this. I think you (and perhaps the engineers you spoke with) may have misunderstood what Table 7-14 means when it lists 10 ms as the _minimum_ value for TRSMRCY.
>
> This delay value is a requirement on the OS. The host system must not access the device until at least 10 ms after the resume is complete. The system can wait longer than that if it wants -- that's why 10 ms is a minimum. It just has to avoid accessing the device sooner.
>
> A _minimum_ value on the host side translates into a _maximum_ value on the device side. The device can safely assume that it can spend up to 10 ms getting back into shape after a resume, but no more. After 10 ms, the host may try to communicate with it.

After re-reading the spec, I agree with your analysis. However, the fact that chipset designers misinterpreted the spec means there may be hardware out there that needs a longer timeout. The spec should have been normative on both the software and the hardware, saying something like:

    The USB System Software must provide a 10 ms resume recovery time (TRSMRCY) during which it will not attempt to access any device connected to the affected (just-activated) bus segment. The host controller and device must be ready for communication after the resume recovery time (TRSMRCY) expires.

I have heard reports of USB devices disconnecting from the bus and reconnecting after remote wakeup. I've personally experienced this with one of my PL2303 USB serial adapters, although it has since died, so I can't retest.

Another company (whose email I ironically lost due to a failed transfer to a USB 3.0 backup drive) had bus traces showing the root cause of a disconnect on resume from remote wakeup. Occasionally, the host controller was sending the SoFs too soon on resume, and the device would interpret it as a low-speed chirp. The device would disconnect, and transform from a high speed device to a low speed device. I don't think increasing the 10 ms time out will help at all in this case, but you did ask what USB device disconnect scenarios I've seen.

If users do see device disconnects on remote wakeup resume, we should see if increasing the timeout helps.

> > Then, when the USB core calls into get port status, it transitions the port from the Resume state to the RExit state by changing the port link state to U0. The xHCI driver will get a port status change event when that transition is complete, but that port status change event is currently ignored.
>
> The excess delay you observe with xHCI is the time spent in the RExit substate? That probably should not be counted as part of the TRSMRCY period. It's hard to say for certain, because TRSMRCY is described only in the USB-2 spec and not in the xHCI spec, and vice versa for RExit. Still, it's reasonable to assume that the TRSMRCY period should begin when the port changes back to U0, not when it leaves the RESUME state and enters RExit.
>
> So in the end this appears to be a simple bug in xhci-hcd. The Get-Port-Status request that terminates the resume signalling should wait until the port goes back into U0 (which agrees with what you have already decided, of course).
>
> ehci-hcd does something similar:
>
> 	/* stop resume signaling */
> 	temp &= ~(PORT_RWC_BITS | PORT_SUSPEND | PORT_RESUME);
> 	ehci_writel(ehci, temp, status_reg);
> 	clear_bit(wIndex, &ehci->resuming_ports);
> 	retval = ehci_handshake(ehci, status_reg,
> 			PORT_RESUME, 0, 2000 /* 2msec */);
>
> The ehci_handshake call busy-waits until the controller turns off the PORT_RESUME bit, which happens when the port has switched to a high-speed idle. It's supposed to take no more than 2 ms but hopefully is a lot faster. (Hmmm, maybe the private lock should be dropped during this handshake...)

Ah, so there is an analogous issue in EHCI. Basically, the EHCI driver waiting for the PORT_RESUME bit to be clear is equivalent to the xHCI driver waiting for the port to enter U0. I agree that this seems like an xHCI driver issue, and I'll fix it in the driver.
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Mon, 26 Aug 2013, Sarah Sharp wrote:

> > By the way, I just noticed your Google+ posting about this. I think you (and perhaps the engineers you spoke with) may have misunderstood what Table 7-14 means when it lists 10 ms as the _minimum_ value for TRSMRCY.
> >
> > This delay value is a requirement on the OS. The host system must not access the device until at least 10 ms after the resume is complete. The system can wait longer than that if it wants -- that's why 10 ms is a minimum. It just has to avoid accessing the device sooner.
> >
> > A _minimum_ value on the host side translates into a _maximum_ value on the device side. The device can safely assume that it can spend up to 10 ms getting back into shape after a resume, but no more. After 10 ms, the host may try to communicate with it.
>
> After re-reading the spec, I agree with your analysis. However, the fact that chipset designers misinterpreted the spec means there may be hardware out there that needs a longer timeout. The spec should have been normative on both the software and the hardware, saying something like:
>
>     The USB System Software must provide a 10 ms resume recovery time (TRSMRCY) during which it will not attempt to access any device connected to the affected (just-activated) bus segment. The host controller and device must be ready for communication after the resume recovery time (TRSMRCY) expires.

See, that's not quite right. The host controller has to be ready for communication long before TRSMRCY expires. In fact, the host controller has to start sending SOF packets within 3 ms of the end of the resume signalling (to prevent the device from going back into suspend).

It _would_ be correct to add that the device must be ready for communication after TRSMRCY expires, but this should be understood implicitly. I think the spec is sufficiently clear in this regard. And it's not obvious (to me at least) that the chipset designers misinterpreted it -- but then, I haven't talked to them directly.

> I have heard reports of USB devices disconnecting from the bus and reconnecting after remote wakeup. I've personally experienced this with one of my PL2303 USB serial adapters, although it has since died, so I can't retest.

Such behavior wouldn't be surprising in the least. It's the sort of thing that caused us to add the RESET_RESUME quirk.

> Another company (whose email I ironically lost due to a failed transfer to a USB 3.0 backup drive) had bus traces showing the root cause of a disconnect on resume from remote wakeup. Occasionally, the host controller was sending the SoFs too soon on resume, and the device would interpret it as a low-speed chirp.

? There is no such thing as a low-speed chirp. Did you mean a high-speed chirp?

> The device would disconnect, and transform from a high speed device to a low speed device. I don't think increasing the 10 ms time out will help at all in this case, but you did ask what USB device disconnect scenarios I've seen.

Yeah, that's a bad one. There's not much we can do about the HC sending SOF packets too soon.

> If users do see device disconnects on remote wakeup resume, we should see if increasing the timeout helps.

Certainly.

> Ah, so there is an analogous issue in EHCI. Basically, the EHCI driver waiting for the PORT_RESUME bit to be clear is equivalent to the xHCI driver waiting for the port to enter U0. I agree that this seems like an xHCI driver issue, and I'll fix it in the driver.

Okay. It may turn out that some devices still require TRSMRCY to be longer than 10 ms, even after your fix. Such behavior wouldn't be specific to xHCI, though; it would apply equally well to EHCI connections.

Alan Stern
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Mon, Aug 26, 2013 at 10:45:30AM -0700, Sarah Sharp wrote:
> On Thu, Aug 22, 2013 at 02:55:07PM -0700, Greg KH wrote:
> > On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
> > > On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
> > > > On Wed, 21 Aug 2013, Sarah Sharp wrote:
> > > >
> > > > > Possible fixes
> > > > > --------------
> > > > >
> > > > > The USB core obviously needs to be changed to check the port status after the TRSMRCY timeout, and continue to wait if the port is still in the resuming state. I will have to study the EHCI port status diagrams in detail to figure out how the USB core can do this.
> > > >
> > > > As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).
> > > >
> > > > Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.
> > >
> > > Well, shoot, I thought I had solved world hunger, or at least USB power management issues. :) So basically it sounds like this is an xHCI specific issue, and probably not the root cause of the USB device disconnects we see under EHCI hosts.
> > >
> > > I guess the xHCI hardware engineers just assumed software would always wait for the interrupt from the port status change event, rather than using a simple 10 ms timer. I bet they didn't even realize that the transition took longer than 10ms, because Windows waited for the port status change event.
>
> One last thought on this note: We know Windows doesn't have high-res timers, and Arjan says asking for a 10 ms delay will produce a delay around 17 ms. Since Linux is busy waiting, we may be communicating with the device sooner than Windows does.

What does Windows do for this type of thing? How long do they wait before reporting an error? We probably need to do the same as they do, no matter what the spec says.

thanks,

greg k-h
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Mon, Aug 26, 2013 at 03:07:25PM -0700, Greg KH wrote:
> On Mon, Aug 26, 2013 at 10:45:30AM -0700, Sarah Sharp wrote:
> > One last thought on this note: We know Windows doesn't have high-res timers, and Arjan says asking for a 10 ms delay will produce a delay around 17 ms. Since Linux is busy waiting, we may be communicating with the device sooner than Windows does.
>
> What does Windows do for this type of thing? How long do they wait before reporting an error? We probably need to do the same as they do, no matter what the spec says.

I double checked with the Intel Windows devs, and they say that Arjan is mistaken, and they'll get around 11 ms, just like we do. They say:

    The EHCI driver stack applies the 10 ms starting at the point that it sees C_PORT_SUSPEND asserted for the port. The inaccuracy in Windows 8 is generally no more than 1 system clock period late, but it can be delayed by interrupt processing for other devices. I would expect the additional average delay to be no more than 1 ms.

They didn't mention waiting for the PORT_RESUME bit to be cleared, like the Linux EHCI driver does. Alan, would you like me to ask them about that?

Sarah Sharp
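The "no more than 1 system clock period late" remark can be illustrated with a small calculation. The sketch below is only a model, not a description of how Windows or Linux timers are implemented: it assumes a timer can fire only on a tick boundary, and the function name `tick_quantized_delay_us()` and the 1 ms tick used in the usage note are hypothetical choices (the thread does not state the actual clock period).

```c
#include <assert.h>

/*
 * Hypothetical model of a tick-driven timer.
 *
 * The timer is armed phase_us microseconds after the most recent tick
 * boundary and can only fire on a boundary, so it fires on the first
 * boundary at or after the requested deadline.
 *
 * Returns the delay actually observed, in microseconds.
 */
static unsigned long tick_quantized_delay_us(unsigned long requested_us,
                                             unsigned long tick_us,
                                             unsigned long phase_us)
{
    assert(tick_us > 0 && phase_us < tick_us);

    /* Deadline measured from the most recent tick boundary. */
    unsigned long deadline_us = phase_us + requested_us;

    /* Round the deadline up to a whole number of ticks. */
    unsigned long ticks = (deadline_us + tick_us - 1) / tick_us;

    return ticks * tick_us - phase_us;
}
```

Under this model, a 10 ms request with an assumed 1 ms tick is never early and is always less than one tick late, which is consistent with the "around 11 ms" figure quoted above.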
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Mon, 26 Aug 2013, Sarah Sharp wrote:

> I double checked with the Intel Windows devs, and they say that Arjan is mistaken, and they'll get around 11 ms, just like we do. They say:
>
>     The EHCI driver stack applies the 10 ms starting at the point that it sees C_PORT_SUSPEND asserted for the port. The inaccuracy in Windows 8 is generally no more than 1 system clock period late, but it can be delayed by interrupt processing for other devices. I would expect the additional average delay to be no more than 1 ms.
>
> They didn't mention waiting for the PORT_RESUME bit to be cleared, like the Linux EHCI driver does. Alan, would you like me to ask them about that?

Yes, please. You might also ask them what they meant by "it sees C_PORT_SUSPEND asserted for the port", since there is no C_PORT_SUSPEND bit in the EHCI Port Status and Control registers. Were they talking about external hubs instead of the root hub?

Alan Stern
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Wed, 21 Aug 2013, Sarah Sharp wrote:

> Background
> ----------
>
> The USB 2.0 specification, section 7.1.7.7, says that upon device remote wakeup signaling, the first active hub (which is often the roothub) must rebroadcast the resume signaling for at least 20 ms (TDRSMDN). After that's done, the hub's suspend status change bit will be set, and system software must not access the device for at least 10 ms (TRSMRCY).
>
> It turns out that TRSMRCY is a *minimum*, not a *maximum*, according to Table 7-14. That means the port can actually take longer than TRSMRCY to resume. Any attempt to communicate with the device, or reset the device, will result in a USB device disconnect.

By the way, I just noticed your Google+ posting about this. I think you (and perhaps the engineers you spoke with) may have misunderstood what Table 7-14 means when it lists 10 ms as the _minimum_ value for TRSMRCY.

This delay value is a requirement on the OS. The host system must not access the device until at least 10 ms after the resume is complete. The system can wait longer than that if it wants -- that's why 10 ms is a minimum. It just has to avoid accessing the device sooner.

A _minimum_ value on the host side translates into a _maximum_ value on the device side. The device can safely assume that it can spend up to 10 ms getting back into shape after a resume, but no more. After 10 ms, the host may try to communicate with it.

> Then, when the USB core calls into get port status, it transitions the port from the Resume state to the RExit state by changing the port link state to U0. The xHCI driver will get a port status change event when that transition is complete, but that port status change event is currently ignored.

The excess delay you observe with xHCI is the time spent in the RExit substate? That probably should not be counted as part of the TRSMRCY period. It's hard to say for certain, because TRSMRCY is described only in the USB-2 spec and not in the xHCI spec, and vice versa for RExit. Still, it's reasonable to assume that the TRSMRCY period should begin when the port changes back to U0, not when it leaves the RESUME state and enters RExit.

So in the end this appears to be a simple bug in xhci-hcd. The Get-Port-Status request that terminates the resume signalling should wait until the port goes back into U0 (which agrees with what you have already decided, of course).

ehci-hcd does something similar:

	/* stop resume signaling */
	temp &= ~(PORT_RWC_BITS | PORT_SUSPEND | PORT_RESUME);
	ehci_writel(ehci, temp, status_reg);
	clear_bit(wIndex, &ehci->resuming_ports);
	retval = ehci_handshake(ehci, status_reg,
			PORT_RESUME, 0, 2000 /* 2msec */);

The ehci_handshake call busy-waits until the controller turns off the PORT_RESUME bit, which happens when the port has switched to a high-speed idle. It's supposed to take no more than 2 ms but hopefully is a lot faster. (Hmmm, maybe the private lock should be dropped during this handshake...)

Alan Stern
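The timing argument in this message (a host-side minimum is a device-side maximum, and the TRSMRCY window should be counted from the return to U0) comes down to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the RExit durations passed in are made-up inputs rather than measurements from the thread.

```c
#include <assert.h>

#define TRSMRCY_MS 10   /* minimum host-side wait, USB 2.0 Table 7-14 */

/*
 * Earliest time (in ms) at which the host may touch the device, if the
 * TRSMRCY window is counted from the moment the port is back in U0.
 */
static unsigned earliest_access_ms(unsigned port_in_u0_ms)
{
    return port_in_u0_ms + TRSMRCY_MS;
}

/*
 * Recovery time (in ms) the device really gets if the host instead starts
 * its 10 ms timer when resume signalling ends, while the port needs a
 * further rexit_ms to reach U0. A negative result means the host would
 * access the device before the port is even in U0.
 */
static int recovery_seen_by_device_ms(unsigned rexit_ms)
{
    return TRSMRCY_MS - (int)rexit_ms;
}

/* Worked examples; the RExit times here are purely hypothetical. */
static void recovery_window_examples(void)
{
    assert(earliest_access_ms(25) == 35);
    assert(recovery_seen_by_device_ms(0) == 10);  /* instant RExit: full window */
    assert(recovery_seen_by_device_ms(4) == 6);   /* window shortened by RExit */
    assert(recovery_seen_by_device_ms(12) == -2); /* accessed before U0 */
}
```

Counting the window from U0, as `earliest_access_ms()` does, keeps the device's recovery time at the full 10 ms regardless of how long RExit takes.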
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Wed, 21 Aug 2013, Sarah Sharp wrote:

> Possible fixes
> --------------
>
> The USB core obviously needs to be changed to check the port status after the TRSMRCY timeout, and continue to wait if the port is still in the resuming state. I will have to study the EHCI port status diagrams in detail to figure out how the USB core can do this.

As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).

Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.

> I can easily do this without the USB core being involved, by changing the xHCI driver to either:
>
> 1. Busy wait with xhci_handshake() in the xHCI get port status until the port is in U0.
>
> 2. Add a completion per xHCI port. In xHCI get port status, initiate U0 entry, and wait on the port's completion for up to 20 ms. In the port status change event code, complete that port's completion when the port is in U0 and the bus_state->resuming_ports bit is set.

I would expect either of those to be adequate.

Alan Stern
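The first option quoted above (busy-waiting until the port reports U0) can be sketched in plain user-space C. This is a minimal illustration, not the xhci-hcd code: `struct fake_port`, `fake_read()`, `poll_until()` and the `FAKE_PLS_*` constants are hypothetical names, the register is simulated, and time is only counted rather than actually delayed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative link-state field; values are assumptions for this sketch. */
#define FAKE_PLS_MASK   0x1e0u
#define FAKE_PLS_U0     0x000u
#define FAKE_PLS_RESUME 0x1e0u

/* Hypothetical stand-in for a memory-mapped port status register. */
struct fake_port {
    uint32_t status;          /* current register value */
    unsigned polls_until_u0;  /* reads remaining before "hardware" enters U0 */
};

/* Simulated register read: the port drops into U0 after a set number of reads. */
static uint32_t fake_read(struct fake_port *p)
{
    if (p->polls_until_u0 == 0)
        p->status &= ~FAKE_PLS_MASK;   /* link state field becomes U0 (0) */
    else
        p->polls_until_u0--;
    return p->status;
}

/*
 * Poll until (register & mask) == done, or until timeout_us of simulated
 * time has passed. Returns 0 on success or -ETIMEDOUT, and reports the
 * simulated time spent waiting through *elapsed_us.
 */
static int poll_until(struct fake_port *p, uint32_t mask, uint32_t done,
                      unsigned timeout_us, unsigned step_us,
                      unsigned *elapsed_us)
{
    unsigned waited = 0;

    assert(step_us > 0);
    for (;;) {
        if ((fake_read(p) & mask) == done) {
            *elapsed_us = waited;
            return 0;
        }
        if (waited >= timeout_us) {
            *elapsed_us = waited;
            return -ETIMEDOUT;
        }
        waited += step_us;   /* a real driver would delay for step_us here */
    }
}
```

With a simulated port that reaches U0 on the fourth read and a 1000 us step, `poll_until()` reports 3000 us of waiting; a port that never leaves Resume makes it return `-ETIMEDOUT` once the 20 ms budget is used up. Reporting the elapsed time also gives a natural place to hang the "host exceeded 10 ms" debugging message discussed elsewhere in the thread.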
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
> On Wed, 21 Aug 2013, Sarah Sharp wrote:
>
> > Possible fixes
> > --------------
> >
> > The USB core obviously needs to be changed to check the port status after the TRSMRCY timeout, and continue to wait if the port is still in the resuming state. I will have to study the EHCI port status diagrams in detail to figure out how the USB core can do this.
>
> As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).
>
> Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.

Well, shoot, I thought I had solved world hunger, or at least USB power management issues. :) So basically it sounds like this is an xHCI specific issue, and probably not the root cause of the USB device disconnects we see under EHCI hosts.

I guess the xHCI hardware engineers just assumed software would always wait for the interrupt from the port status change event, rather than using a simple 10 ms timer. I bet they didn't even realize that the transition took longer than 10ms, because Windows waited for the port status change event.

> > I can easily do this without the USB core being involved, by changing the xHCI driver to either:
> >
> > 1. Busy wait with xhci_handshake() in the xHCI get port status until the port is in U0.
> >
> > 2. Add a completion per xHCI port. In xHCI get port status, initiate U0 entry, and wait on the port's completion for up to 20 ms. In the port status change event code, complete that port's completion when the port is in U0 and the bus_state->resuming_ports bit is set.
>
> I would expect either of those to be adequate.

All right, I think I'll do the busy wait, since 71% of the time it should return immediately. Completions are overkill here.

Should I print a debugging message if the xHCI host exceeds 10ms? It would be nice to let hardware engineers know they're out of spec.

Sarah Sharp
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Thu, Aug 22, 2013 at 02:49:07PM -0700, Sarah Sharp wrote:
> On Thu, Aug 22, 2013 at 10:42:49AM -0400, Alan Stern wrote:
> > On Wed, 21 Aug 2013, Sarah Sharp wrote:
> >
> > > Possible fixes
> > > --------------
> > >
> > > The USB core obviously needs to be changed to check the port status after the TRSMRCY timeout, and continue to wait if the port is still in the resuming state. I will have to study the EHCI port status diagrams in detail to figure out how the USB core can do this.
> >
> > As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).
> >
> > Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.
>
> Well, shoot, I thought I had solved world hunger, or at least USB power management issues. :) So basically it sounds like this is an xHCI specific issue, and probably not the root cause of the USB device disconnects we see under EHCI hosts.
>
> I guess the xHCI hardware engineers just assumed software would always wait for the interrupt from the port status change event, rather than using a simple 10 ms timer. I bet they didn't even realize that the transition took longer than 10ms, because Windows waited for the port status change event.

Why can't Linux do the same thing, and not worry about any timeout at all?

> Should I print a debugging message if the xHCI host exceeds 10ms? It would be nice to let hardware engineers know they're out of spec.

Sure, but the odds of any one of them enabling debugging, and then noticing this are slim to none. But if it makes us feel better pointing out hardware bugs (I know it makes me feel good), please do so.

thanks,

greg k-h
Re: [RFT RFC] USB: Fix USB device disconnects on resume.
On Thu, 22 Aug 2013, Sarah Sharp wrote:

> > As far as EHCI is concerned, this is a non-problem. The closest analogy to the RExit-U0 transition is in the description of the Force Port Resume bit (bit 6) in Table 2-16 of the EHCI spec, where it says that the host controller must complete the transition to the high-speed idle state within 2 milliseconds of software setting the bit to a zero (which happens when the hub driver does its Get-Port-Status call).
> >
> > Thus, as soon as the TRSMRCY delay is finished, the device and the port are supposed to be ready. In fact, the hardware doesn't provide any means of telling whether they are ready or not.
>
> Well, shoot, I thought I had solved world hunger, or at least USB power management issues. :) So basically it sounds like this is an xHCI specific issue, and probably not the root cause of the USB device disconnects we see under EHCI hosts.

Probably not. After all, USB-2 does not have the elaborate link training and other link management features that USB-3 does.

What device disconnects do you see under EHCI? I don't recall hearing about them.

> I guess the xHCI hardware engineers just assumed software would always wait for the interrupt from the port status change event, rather than using a simple 10 ms timer. I bet they didn't even realize that the transition took longer than 10ms, because Windows waited for the port status change event.

What do you mean? EHCI doesn't have any port-status change event at the end of the 10-ms TRSMRCY delay. The only port-status change event in the hardware is right at the beginning, when the resume is initiated. The same is true of UHCI, but OHCI is different.

> > > I can easily do this without the USB core being involved, by changing the xHCI driver to either:
> > >
> > > 1. Busy wait with xhci_handshake() in the xHCI get port status until the port is in U0.
> > >
> > > 2. Add a completion per xHCI port. In xHCI get port status, initiate U0 entry, and wait on the port's completion for up to 20 ms. In the port status change event code, complete that port's completion when the port is in U0 and the bus_state->resuming_ports bit is set.
> >
> > I would expect either of those to be adequate.
>
> All right, I think I'll do the busy wait, since 71% of the time it should return immediately. Completions are overkill here.

Although that will be adequate, it may not be ideal. You'll still get a 10-ms TRSMRCY delay following the busy wait. Unless you change the hub driver to eliminate that delay for xHCI host controllers.

> Should I print a debugging message if the xHCI host exceeds 10ms? It would be nice to let hardware engineers know they're out of spec.

It can't hurt.

Alan Stern