On 10.02.2018 02:01, Samuel Sadok wrote:
Thanks Mathias for looking into this.

2018-02-06 18:32 GMT+01:00 Mathias Nyman <mathias.ny...@linux.intel.com>:

Does reverting
37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices
help?

Unfortunately not, the card reader is still missing after resume.

Here's the dmesg and usbmon (as previously, doing one suspend-resume cycle):
https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd

For this log I also increased the usbmon buffer size as suggested by
Alan (to about 2MB). This (seemingly) resolved the issue with the log
gaps.

Ok, one reason reverting didn't help is that we still don't really
disable and re-enable the port:

[  100.771564] usb usb2-port4: logical disconnect
...
[ 100.771586] usb usb2-port4: Not disabling port; link state is RxDetect

The reset resume of device "usb 2-4" (the device in question) happens
around [100.77].
In the usbmon log there is no activity at that exact time, only ~50ms
before and after. Can we infer from this that the issue is independent
from the actual device and must stem from some faulty state in the
kernel or USB controller?

Btw I also added/modified some debug lines for my own understanding,
those are tagged with "[CUSTOM LOG]".

* check that the root cause for failing USB3 device reset after resume is
   not that several xhci slots are in Default state at the same time.
   xHC can't handle this.
   In normal device enumeration usb core has a mutex protecting it, not
   sure it works here, maybe usb core and xhci are out of sync after
   xHC reset?
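
The condition described above could be checked with something like the following. This is only a standalone sketch, not kernel code: the enum, the array of states and both function names are made up for illustration, and real code would have to read the slot state from each slot context instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for the slot states the xHC tracks per device. */
enum chk_slot_state {
    CHK_DISABLED,
    CHK_ENABLED,
    CHK_DEFAULT,
    CHK_ADDRESSED,
    CHK_CONFIGURED
};

/* Count how many slots are in the Default state right now. */
static size_t chk_count_default(const enum chk_slot_state *states, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (states[i] == CHK_DEFAULT)
            count++;
    }
    return count;
}

/* Returns 1 if at most one slot is in Default state, 0 otherwise. */
static int chk_default_state_ok(const enum chk_slot_state *states, size_t n)
{
    return chk_count_default(states, n) <= 1;
}
```

Calling chk_default_state_ok() around each reset-resume would show whether two devices ever sit in Default state at the same time.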

I find this line in particular interesting:
[ 100.771536] xhci_hcd 0000:00:14.0: [CUSTOM LOG] xHCI xhci_urb_enqueue called with unaddressed device, slot_id = 1
This comes from xhci_check_args(). Since udev->slot_id == 1 is
non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
this particular device (usb 2-4), which I guess is not good. So to me
this does indeed look like the usb core and xhci are out of sync.
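
To make the reasoning above concrete, here is a small standalone model of the check. It is a sketch under assumptions, not the actual xhci_check_args() code: all struct and function names are invented, and the real function may structure the test differently (for instance by reporting both cases with the same message).

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SLOTS 8   /* hypothetical table size for this sketch */

struct model_virt_dev { int in_use; };        /* stand-in for the driver's per-slot state */
struct model_udev { unsigned int slot_id; };  /* stand-in for the core's view of a device */
struct model_xhci { struct model_virt_dev *devs[MODEL_MAX_SLOTS]; };

enum model_check { MODEL_OK, MODEL_NO_SLOT, MODEL_UNADDRESSED };

/*
 * A device with a non-zero slot_id but no entry in devs[] is the
 * out-of-sync case: the core believes a slot exists, the driver does not.
 */
static enum model_check model_check_args(const struct model_xhci *xhci,
                                         const struct model_udev *udev)
{
    assert(udev->slot_id < MODEL_MAX_SLOTS);

    if (udev->slot_id == 0)
        return MODEL_NO_SLOT;
    if (xhci->devs[udev->slot_id] == NULL)
        return MODEL_UNADDRESSED;
    return MODEL_OK;
}
```

With slot_id = 1 and devs[1] == NULL, this model returns MODEL_UNADDRESSED, which matches the situation the log line suggests for usb 2-4.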

However I'm not familiar with the code (for instance what slot_id is
for, who uses it, should it always be 0 on resume?) but based on
the log and what you wrote I guess this sounds like a good point:

xhci uses slot_id to identify different usb devices connected to it.
xHC hw gives each enabled attached usb device a slot_id.
When usb core asks xhci host to do something to a device, xhci driver knows
which device based on udev->slot_id.

When xhci controller is reset, all xhci slots are disabled and freed, but
usb core still has the old udev->slot_id values set.

In normal resume case the xHC controller is not reset, but if something goes
wrong, or power is cut from xHC during suspend then we recover by resetting
xHC at resume.

I'll try to write some quick test patches that:
- remove LPM and LTM disabling from usb_reset_and_verify_device
- zero udev->slot_id when slot is disabled and freed in xhci
- force a disable/enable port after port reset failed a few times.
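
As a rough illustration of the second item, the idea could look like the standalone model below. This is not the test patch itself and not kernel code: the sketch_* names, the fixed-size slot table and the malloc-based allocation are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_SLOTS 8   /* hypothetical table size; slot 0 is never used */

struct sketch_udev { unsigned int slot_id; };    /* core side */
struct sketch_slot { struct sketch_udev *udev; };/* driver side */
struct sketch_hc { struct sketch_slot *slots[SKETCH_MAX_SLOTS]; };

/* Give udev the lowest free slot; returns the slot id, or 0 on failure. */
static unsigned int sketch_enable_slot(struct sketch_hc *hc,
                                       struct sketch_udev *udev)
{
    for (unsigned int id = 1; id < SKETCH_MAX_SLOTS; id++) {
        if (hc->slots[id] == NULL) {
            struct sketch_slot *s = malloc(sizeof(*s));

            if (s == NULL)
                return 0;
            s->udev = udev;
            hc->slots[id] = s;
            udev->slot_id = id;
            return id;
        }
    }
    return 0;
}

/* Free one slot and, as proposed, zero the owner's slot_id as well. */
static void sketch_free_slot(struct sketch_hc *hc, unsigned int id)
{
    assert(id < SKETCH_MAX_SLOTS);

    struct sketch_slot *s = hc->slots[id];

    if (s == NULL)
        return;
    if (s->udev != NULL)
        s->udev->slot_id = 0;   /* keeps core and driver in sync */
    free(s);
    hc->slots[id] = NULL;
}

/* Model of a controller reset: every slot is disabled and freed. */
static void sketch_reset_controller(struct sketch_hc *hc)
{
    for (unsigned int id = 1; id < SKETCH_MAX_SLOTS; id++)
        sketch_free_slot(hc, id);
}
```

Without the line that zeroes slot_id, a device would keep its old id after sketch_reset_controller() even though no slot backs it any more, which is the mismatch suspected in this thread.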

Thanks
Mathias