Re: [BUG] SD card reader disappears after suspend

2018-02-15 Thread Mathias Nyman

On 14.02.2018 15:06, Samuel Sadok wrote:

Thanks for the patches.

However no luck yet, the usb device still doesn't reinitialize.

Logs:
https://gist.github.com/90ccb390c5777b9e2d5dc595f44d50da

Can I somehow check if it is at all possible to bring back this
particular device after hub power loss?
For instance would you expect reloading all USB modules to bring back
the device? I tried that and it didn't help:


I'm starting to run out if ideas, maybe powering port off and on again
in the logical disconnect could be worth a shot, or maybe trying
to disable->re-enable port a bit earlier, before the disconnect

-Mathias



#!/bin/bash
echo "Unload modules..."
echo 1 > /sys/bus/pci/devices/\:00\:14.0/remove
rmmod xhci_pci
rmmod xhci_hcd
rmmod usbhid
rmmod bcm5974
rmmod btusb
rmmod uas
rmmod usb_storage
rmmod usbcore
rmmod usb_common
echo "wait"
sleep 2
echo "Reload modules..."
modprobe usb_common
modprobe usbcore
modprobe usb_storage
modprobe uas
modprobe btusb
modprobe bcm5974
modprobe usbhid
modprobe xhci_hcd
modprobe xhci_pci
echo 1 > /sys/bus/pci/rescan



2018-02-13 17:07 GMT+01:00 Mathias Nyman :

On 12.02.2018 16:03, Mathias Nyman wrote:


On 10.02.2018 02:01, Samuel Sadok wrote:


Thanks Mathias for looking into this.

2018-02-06 18:32 GMT+01:00 Mathias Nyman :



Does reverting
37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3
devices
help?



Unfortunately not, the card reader is still missing after resume.

Here's the dmesg and usbmon (as previously, doing one suspend-resume
cycle):
https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd

For this log I also increased the usbmon buffer size as suggested by
Alan (to about 2MB). This (seemingly) resolved the issue with the log
gaps.


Ok, one reason reverting didn't help is that it we still don't really
disable and re-enable:

[  100.771564] usb usb2-port4: logical disconnect
...
[ 100.771586] usb usb2-port4: Not disabling port; link state is RxDetect


The reset resume of device "usb 2-4" (the device in question) happens
around [100.77].
In the usbmon log there is no activity at that exact time, only ~50ms
before and after. Can we infer from this that the issue is independent
from the actual device and must stem from some faulty state in the
kernel or USB controller?

Btw I also added/modified some debug lines for my own understanding,
those are tagged with "[CUSTOM LOG]".


* check that root cause for failing USB3 device reset after resume is
not
that several
xhci slots are in Default state at the same time. xHC can't handle
this.
In normal device enumeration usb core has a mutex protecting it, not
sure
it works here,
maybe usb core and xhci are out of sync after xHC reset?



I find this line in particular interesting:
[ 100.771536] xhci_hcd :00:14.0: [CUSTOM LOG] xHCI
xhci_urb_enqueue called with unaddressed device, slot_id = 1
This comes from xhci_check_args(). Since udev->slot_id == 1 is
non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
this particular device (usb 2-4), which I guess is not good. So to me
this does indeed look like the usb core and xhci are out of sync.

However I'm not familiar with the code (for instance what is slot_id
is for, who uses it, should it always be 0 on resume?) but based on
the log and what you wrote I guess this sounds like a good point:



xhci uses slot_id to identify different usb devices connected to it.
xHC hw gives each enabled attached usb device a slot_id.
When usb core asks xhci host to do something to a device xhci driver knows
which device based on udev->slot_id

when xhci controller is reset, all xhci slots are disabled and freed, but
usb core still has udev->slot_id pointers set.

In normal resume case the xHC controller is not reset, but if something
goes wrong,
or power is cut from xHC during suspend then we recover by resetting xHC
at resume.

I'll try to write some quick testpatches that:
- removes LPM and LTM disabling from usb_reset_and_verify_device
- zeroes udev->slot_id when slot is disabled and freed in xhci
- forces a disable/enable port after port reset failed a few times.



A series for testing can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
port-disable-test

https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=port-disable-test

Based on 4.15
Can you try it out?


-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info 

Re: [BUG] SD card reader disappears after suspend

2018-02-14 Thread Samuel Sadok
Thanks for the patches.

However no luck yet, the usb device still doesn't reinitialize.

Logs:
https://gist.github.com/90ccb390c5777b9e2d5dc595f44d50da

Can I somehow check if it is at all possible to bring back this
particular device after hub power loss?
For instance would you expect reloading all USB modules to bring back
the device? I tried that and it didn't help:

#!/bin/bash
echo "Unload modules..."
echo 1 > /sys/bus/pci/devices/\:00\:14.0/remove
rmmod xhci_pci
rmmod xhci_hcd
rmmod usbhid
rmmod bcm5974
rmmod btusb
rmmod uas
rmmod usb_storage
rmmod usbcore
rmmod usb_common
echo "wait"
sleep 2
echo "Reload modules..."
modprobe usb_common
modprobe usbcore
modprobe usb_storage
modprobe uas
modprobe btusb
modprobe bcm5974
modprobe usbhid
modprobe xhci_hcd
modprobe xhci_pci
echo 1 > /sys/bus/pci/rescan



2018-02-13 17:07 GMT+01:00 Mathias Nyman :
> On 12.02.2018 16:03, Mathias Nyman wrote:
>>
>> On 10.02.2018 02:01, Samuel Sadok wrote:
>>>
>>> Thanks Mathias for looking into this.
>>>
>>> 2018-02-06 18:32 GMT+01:00 Mathias Nyman :


 Does reverting
 37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3
 devices
 help?
>>>
>>>
>>> Unfortunately not, the card reader is still missing after resume.
>>>
>>> Here's the dmesg and usbmon (as previously, doing one suspend-resume
>>> cycle):
>>> https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd
>>>
>>> For this log I also increased the usbmon buffer size as suggested by
>>> Alan (to about 2MB). This (seemingly) resolved the issue with the log
>>> gaps.
>>>
>> Ok, one reason reverting didn't help is that it we still don't really
>> disable and re-enable:
>>
>> [  100.771564] usb usb2-port4: logical disconnect
>> ...
>> [ 100.771586] usb usb2-port4: Not disabling port; link state is RxDetect
>>
>>> The reset resume of device "usb 2-4" (the device in question) happens
>>> around [100.77].
>>> In the usbmon log there is no activity at that exact time, only ~50ms
>>> before and after. Can we infer from this that the issue is independent
>>> from the actual device and must stem from some faulty state in the
>>> kernel or USB controller?
>>>
>>> Btw I also added/modified some debug lines for my own understanding,
>>> those are tagged with "[CUSTOM LOG]".
>>>
 * check that root cause for failing USB3 device reset after resume is
 not
 that several
xhci slots are in Default state at the same time. xHC can't handle
 this.
In normal device enumeration usb core has a mutex protecting it, not
 sure
 it works here,
maybe usb core and xhci are out of sync after xHC reset?
>>>
>>>
>>> I find this line in particular interesting:
>>> [ 100.771536] xhci_hcd :00:14.0: [CUSTOM LOG] xHCI
>>> xhci_urb_enqueue called with unaddressed device, slot_id = 1
>>> This comes from xhci_check_args(). Since udev->slot_id == 1 is
>>> non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
>>> this particular device (usb 2-4), which I guess is not good. So to me
>>> this does indeed look like the usb core and xhci are out of sync.
>>>
>>> However I'm not familiar with the code (for instance what is slot_id
>>> is for, who uses it, should it always be 0 on resume?) but based on
>>> the log and what you wrote I guess this sounds like a good point:
>>
>>
>> xhci uses slot_id to identify different usb devices connected to it.
>> xHC hw gives each enabled attached usb device a slot_id.
>> When usb core asks xhci host to do something to a device xhci driver knows
>> which device based on udev->slot_id
>>
>> when xhci controller is reset, all xhci slots are disabled and freed, but
>> usb core still has udev->slot_id pointers set.
>>
>> In normal resume case the xHC controller is not reset, but if something
>> goes wrong,
>> or power is cut from xHC during suspend then we recover by resetting xHC
>> at resume.
>>
>> I'll try to write some quick testpatches that:
>> - removes LPM and LTM disabling from usb_reset_and_verify_device
>> - zeroes udev->slot_id when slot is disabled and freed in xhci
>> - forces a disable/enable port after port reset failed a few times.
>>
>
> A series for testing can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> port-disable-test
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=port-disable-test
>
> Based on 4.15
> Can you try it out?
>
>
> -Mathias
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-02-13 Thread Mathias Nyman

On 12.02.2018 16:03, Mathias Nyman wrote:

On 10.02.2018 02:01, Samuel Sadok wrote:

Thanks Mathias for looking into this.

2018-02-06 18:32 GMT+01:00 Mathias Nyman :


Does reverting
37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3
devices
help?


Unfortunately not, the card reader is still missing after resume.

Here's the dmesg and usbmon (as previously, doing one suspend-resume cycle):
https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd

For this log I also increased the usbmon buffer size as suggested by
Alan (to about 2MB). This (seemingly) resolved the issue with the log
gaps.


Ok, one reason reverting didn't help is that it we still don't really
disable and re-enable:

[  100.771564] usb usb2-port4: logical disconnect
...
[ 100.771586] usb usb2-port4: Not disabling port; link state is RxDetect


The reset resume of device "usb 2-4" (the device in question) happens
around [100.77].
In the usbmon log there is no activity at that exact time, only ~50ms
before and after. Can we infer from this that the issue is independent
from the actual device and must stem from some faulty state in the
kernel or USB controller?

Btw I also added/modified some debug lines for my own understanding,
those are tagged with "[CUSTOM LOG]".


* check that root cause for failing USB3 device reset after resume is not
that several
   xhci slots are in Default state at the same time. xHC can't handle this.
   In normal device enumeration usb core has a mutex protecting it, not sure
it works here,
   maybe usb core and xhci are out of sync after xHC reset?


I find this line in particular interesting:
[ 100.771536] xhci_hcd :00:14.0: [CUSTOM LOG] xHCI
xhci_urb_enqueue called with unaddressed device, slot_id = 1
This comes from xhci_check_args(). Since udev->slot_id == 1 is
non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
this particular device (usb 2-4), which I guess is not good. So to me
this does indeed look like the usb core and xhci are out of sync.

However I'm not familiar with the code (for instance what is slot_id
is for, who uses it, should it always be 0 on resume?) but based on
the log and what you wrote I guess this sounds like a good point:


xhci uses slot_id to identify different usb devices connected to it.
xHC hw gives each enabled attached usb device a slot_id.
When usb core asks xhci host to do something to a device xhci driver knows
which device based on udev->slot_id

when xhci controller is reset, all xhci slots are disabled and freed, but
usb core still has udev->slot_id pointers set.

In normal resume case the xHC controller is not reset, but if something goes 
wrong,
or power is cut from xHC during suspend then we recover by resetting xHC at 
resume.

I'll try to write some quick testpatches that:
- removes LPM and LTM disabling from usb_reset_and_verify_device
- zeroes udev->slot_id when slot is disabled and freed in xhci
- forces a disable/enable port after port reset failed a few times.



A series for testing can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git port-disable-test

https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=port-disable-test

Based on 4.15
Can you try it out?

-Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-02-12 Thread Mathias Nyman

On 10.02.2018 02:01, Samuel Sadok wrote:

Thanks Mathias for looking into this.

2018-02-06 18:32 GMT+01:00 Mathias Nyman :


Does reverting
37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3
devices
help?


Unfortunately not, the card reader is still missing after resume.

Here's the dmesg and usbmon (as previously, doing one suspend-resume cycle):
https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd

For this log I also increased the usbmon buffer size as suggested by
Alan (to about 2MB). This (seemingly) resolved the issue with the log
gaps.


Ok, one reason reverting didn't help is that it we still don't really
disable and re-enable:

[  100.771564] usb usb2-port4: logical disconnect
...
[ 100.771586] usb usb2-port4: Not disabling port; link state is RxDetect


The reset resume of device "usb 2-4" (the device in question) happens
around [100.77].
In the usbmon log there is no activity at that exact time, only ~50ms
before and after. Can we infer from this that the issue is independent
from the actual device and must stem from some faulty state in the
kernel or USB controller?

Btw I also added/modified some debug lines for my own understanding,
those are tagged with "[CUSTOM LOG]".


* check that root cause for failing USB3 device reset after resume is not
that several
   xhci slots are in Default state at the same time. xHC can't handle this.
   In normal device enumeration usb core has a mutex protecting it, not sure
it works here,
   maybe usb core and xhci are out of sync after xHC reset?


I find this line in particular interesting:
[ 100.771536] xhci_hcd :00:14.0: [CUSTOM LOG] xHCI
xhci_urb_enqueue called with unaddressed device, slot_id = 1
This comes from xhci_check_args(). Since udev->slot_id == 1 is
non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
this particular device (usb 2-4), which I guess is not good. So to me
this does indeed look like the usb core and xhci are out of sync.

However I'm not familiar with the code (for instance what is slot_id
is for, who uses it, should it always be 0 on resume?) but based on
the log and what you wrote I guess this sounds like a good point:


xhci uses slot_id to identify different usb devices connected to it.
xHC hw gives each enabled attached usb device a slot_id.
When usb core asks xhci host to do something to a device xhci driver knows
which device based on udev->slot_id

when xhci controller is reset, all xhci slots are disabled and freed, but
usb core still has udev->slot_id pointers set.

In normal resume case the xHC controller is not reset, but if something goes 
wrong,
or power is cut from xHC during suspend then we recover by resetting xHC at 
resume.

I'll try to write some quick testpatches that:
- removes LPM and LTM disabling from usb_reset_and_verify_device
- zeroes udev->slot_id when slot is disabled and freed in xhci
- forces a disable/enable port after port reset failed a few times.

Thanks
Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-02-09 Thread Samuel Sadok
Thanks Mathias for looking into this.

2018-02-06 18:32 GMT+01:00 Mathias Nyman :
>
> Does reverting
> 37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3
> devices
> help?

Unfortunately not, the card reader is still missing after resume.

Here's the dmesg and usbmon (as previously, doing one suspend-resume cycle):
https://gist.github.com/anonymous/5aea8eddf97e1b964bffd75ed88793fd

For this log I also increased the usbmon buffer size as suggested by
Alan (to about 2MB). This (seemingly) resolved the issue with the log
gaps.

The reset resume of device "usb 2-4" (the device in question) happens
around [100.77].
In the usbmon log there is no activity at that exact time, only ~50ms
before and after. Can we infer from this that the issue is independent
from the actual device and must stem from some faulty state in the
kernel or USB controller?

Btw I also added/modified some debug lines for my own understanding,
those are tagged with "[CUSTOM LOG]".

> * check that root cause for failing USB3 device reset after resume is not
> that several
>   xhci slots are in Default state at the same time. xHC can't handle this.
>   In normal device enumeration usb core has a mutex protecting it, not sure
> it works here,
>   maybe usb core and xhci are out of sync after xHC reset?

I find this line in particular interesting:
[ 100.771536] xhci_hcd :00:14.0: [CUSTOM LOG] xHCI
xhci_urb_enqueue called with unaddressed device, slot_id = 1
This comes from xhci_check_args(). Since udev->slot_id == 1 is
non-zero this implies that xhci->devs[udev->slot_id] must be NULL for
this particular device (usb 2-4), which I guess is not good. So to me
this does indeed look like the usb core and xhci are out of sync.

However I'm not familiar with the code (for instance what is slot_id
is for, who uses it, should it always be 0 on resume?) but based on
the log and what you wrote I guess this sounds like a good point:

> * make sure xhci sets udev->slot_id to zero when xhci slots are disabled.
>   Slots are disabled and freed when xHC is reset at resume in this case.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-02-06 Thread Mathias Nyman

On 29.01.2018 16:24, Mathias Nyman wrote:

On 25.01.2018 17:51, Alan Stern wrote:

On Thu, 25 Jan 2018, Samuel Sadok wrote:


2018-01-23 17:31 GMT+01:00 Alan Stern :

On Tue, 23 Jan 2018, Samuel Sadok wrote:


Thanks Alan,

While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.

$ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
$ modprobe usbmon
$ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
$ # press power button

dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6

Both logs were collected at the same time so the timestamps should match.


In fact they do, quite closely.  But there is a noticeable gap in the
usbmon trace, between lines 197 and 198 (the timestamp jumps from
35923531 to 38925126) and there obviously was a lot of activity on bus
1 in between.


Yes you're right, that's inconvenient because the "failed to disable
LTM" message occurs exactly in this gap. Are gaps like this expected /
known? Any ideas on how to mitigate them? (I kinda want to avoid
soldering an external bus monitor to the mainboard :) )


The text interface to usbmon uses a fixed-size buffer, and the size is
a little on the low side if you're dealing with large latencies (which
of course are unavoidable during a suspend/resume scenario).  The size
is determined by EVENT_MAX, defined around line 40 in
drivers/usb/mon/mon_text.c:

#define EVENT_MAX  (4*PAGE_SIZE / sizeof(struct mon_event_text))

Just change the 4 to something considerably larger and you should be in
good shape.


Resume starts at dmesg line 1255.
As far as I can judge, the earliest indication of something going
wrong is line 1514:
[ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
[ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
[ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM


Yes, that's where the problem first seems to show up.  Apparently
usb_disable_ltm() fails because it can't communicate with the device,
and usb_reset_and_verify_device() then aborts because of that failure.
It shouldn't do this; after all, one very good reason for resetting the
device is because we're unable to communicate with it.

You could try editing the source code for usb_reset_and_verify_device()
in drivers/usb/core/hub.c, to see what happens if it doesn't give up
after usb_disable_ltm() fails.


For the following logs I modified the "Failed to disable LTM" message
and commented out the goto in the line below. Also I used kernel
version 4.15-rc9 this time. I omitted upgrading the out-of-tree
modules broadcom-wl and nouveau-fw, so don't mind errors related to
that.
The SD card reader is still missing after resume.

dmesg & usbmon:
https://gist.github.com/anonymous/fc597612d5ebbac40d7bef9f8f0b579a

`usb_reset_and_verify_device()` is executed around dmesg line 1524.
Again, the usbmon log has a major gap around that time.


It looks like the system is trying to carry out multiple warm resets,
or a single extremely long warm reset, when it really should give up
and do a cold reset instead.

It would be best to get Mathias's input on this.  I don't know what is
the right way to fix this problem.

Alan Stern



Just back form a week long sickleave, sorry about the delay.

First glance it looks like we end up in compliance mode a lot during
those warm resets.
I need to look at this in more detail


Does reverting
37be6676 usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices
help?

Reverting it sets the USB3 port to RxDetect via ss.disabled after failing
to reset a few times. kind of disabling/re-enabling the USB3 port.
 
Doing a USB2 type "cold" bus reset (SE0) instead of a USB3 inband Hot or Warm

reset could solve this as well, but I can't find a way to force a USB2 style 
cold
reset on USB3 ports with xHCI.
Setting the xhci port PORTSC register PR bit will issue a USB2 bus reset
for USB2 protocol ports and Hot reset for USB3 protocol ports.
The warm reset is a separate bit (WPR) in portsc register.

Other things that needs attention (todo list):
* hub_port_wait_reset() in hub.c checks for USB_PORT_STAT_RESET, returning 
-EBUSY
 before checking  USB_PORT_STAT_CONNECTION, returning -ENOTCONN. This causes 
extra
 delays when doing several 200ms waits for warm reset to finish when the port 
should
 just be logically disconnected.
 
* Fix the failing disable_lpm and disable_ltm issues in usb_reset_and_verify_device()

  that Alan pointed out.

* make sure xhci sets 

Re: [BUG] SD card reader disappears after suspend

2018-01-29 Thread Mathias Nyman

On 25.01.2018 17:51, Alan Stern wrote:

On Thu, 25 Jan 2018, Samuel Sadok wrote:


2018-01-23 17:31 GMT+01:00 Alan Stern :

On Tue, 23 Jan 2018, Samuel Sadok wrote:


Thanks Alan,

While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.

$ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
$ modprobe usbmon
$ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
$ # press power button

dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6

Both logs were collected at the same time so the timestamps should match.


In fact they do, quite closely.  But there is a noticeable gap in the
usbmon trace, between lines 197 and 198 (the timestamp jumps from
35923531 to 38925126) and there obviously was a lot of activity on bus
1 in between.


Yes you're right, that's inconvenient because the "failed to disable
LTM" message occurs exactly in this gap. Are gaps like this expected /
known? Any ideas on how to mitigate them? (I kinda want to avoid
soldering an external bus monitor to the mainboard :) )


The text interface to usbmon uses a fixed-size buffer, and the size is
a little on the low side if you're dealing with large latencies (which
of course are unavoidable during a suspend/resume scenario).  The size
is determined by EVENT_MAX, defined around line 40 in
drivers/usb/mon/mon_text.c:

#define EVENT_MAX  (4*PAGE_SIZE / sizeof(struct mon_event_text))

Just change the 4 to something considerably larger and you should be in
good shape.


Resume starts at dmesg line 1255.
As far as I can judge, the earliest indication of something going
wrong is line 1514:
[ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
[ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
[ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM


Yes, that's where the problem first seems to show up.  Apparently
usb_disable_ltm() fails because it can't communicate with the device,
and usb_reset_and_verify_device() then aborts because of that failure.
It shouldn't do this; after all, one very good reason for resetting the
device is because we're unable to communicate with it.

You could try editing the source code for usb_reset_and_verify_device()
in drivers/usb/core/hub.c, to see what happens if it doesn't give up
after usb_disable_ltm() fails.


For the following logs I modified the "Failed to disable LTM" message
and commented out the goto in the line below. Also I used kernel
version 4.15-rc9 this time. I omitted upgrading the out-of-tree
modules broadcom-wl and nouveau-fw, so don't mind errors related to
that.
The SD card reader is still missing after resume.

dmesg & usbmon:
https://gist.github.com/anonymous/fc597612d5ebbac40d7bef9f8f0b579a

`usb_reset_and_verify_device()` is executed around dmesg line 1524.
Again, the usbmon log has a major gap around that time.


It looks like the system is trying to carry out multiple warm resets,
or a single extremely long warm reset, when it really should give up
and do a cold reset instead.

It would be best to get Mathias's input on this.  I don't know what is
the right way to fix this problem.

Alan Stern



Just back form a week long sickleave, sorry about the delay.

First glance it looks like we end up in compliance mode a lot during
those warm resets.
I need to look at this in more detail

-Mathias  


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-25 Thread Alan Stern
On Thu, 25 Jan 2018, Samuel Sadok wrote:

> 2018-01-23 17:31 GMT+01:00 Alan Stern :
> > On Tue, 23 Jan 2018, Samuel Sadok wrote:
> >
> >> Thanks Alan,
> >>
> >> While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.
> >>
> >> $ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
> >> $ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
> >> $ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
> >> $ modprobe usbmon
> >> $ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
> >> $ # press power button
> >>
> >> dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
> >> usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6
> >>
> >> Both logs were collected at the same time so the timestamps should match.
> >
> > In fact they do, quite closely.  But there is a noticeable gap in the
> > usbmon trace, between lines 197 and 198 (the timestamp jumps from
> > 35923531 to 38925126) and there obviously was a lot of activity on bus
> > 1 in between.
> 
> Yes you're right, that's inconvenient because the "failed to disable
> LTM" message occurs exactly in this gap. Are gaps like this expected /
> known? Any ideas on how to mitigate them? (I kinda want to avoid
> soldering an external bus monitor to the mainboard :) )

The text interface to usbmon uses a fixed-size buffer, and the size is
a little on the low side if you're dealing with large latencies (which
of course are unavoidable during a suspend/resume scenario).  The size
is determined by EVENT_MAX, defined around line 40 in
drivers/usb/mon/mon_text.c:

#define EVENT_MAX  (4*PAGE_SIZE / sizeof(struct mon_event_text))

Just change the 4 to something considerably larger and you should be in 
good shape.

> >> Resume starts at dmesg line 1255.
> >> As far as I can judge, the earliest indication of something going
> >> wrong is line 1514:
> >> [ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> >> unaddressed device
> >> [ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
> >> [ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> >> unaddressed device
> >> [ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
> >> [ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> >> unaddressed device
> >> [ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM
> >
> > Yes, that's where the problem first seems to show up.  Apparently
> > usb_disable_ltm() fails because it can't communicate with the device,
> > and usb_reset_and_verify_device() then aborts because of that failure.
> > It shouldn't do this; after all, one very good reason for resetting the
> > device is because we're unable to communicate with it.
> >
> > You could try editing the source code for usb_reset_and_verify_device()
> > in drivers/usb/core/hub.c, to see what happens if it doesn't give up
> > after usb_disable_ltm() fails.
> 
> For the following logs I modified the "Failed to disable LTM" message
> and commented out the goto in the line below. Also I used kernel
> version 4.15-rc9 this time. I omitted upgrading the out-of-tree
> modules broadcom-wl and nouveau-fw, so don't mind errors related to
> that.
> The SD card reader is still missing after resume.
> 
> dmesg & usbmon:
> https://gist.github.com/anonymous/fc597612d5ebbac40d7bef9f8f0b579a
> 
> `usb_reset_and_verify_device()` is executed around dmesg line 1524.
> Again, the usbmon log has a major gap around that time.

It looks like the system is trying to carry out multiple warm resets, 
or a single extremely long warm reset, when it really should give up 
and do a cold reset instead.

It would be best to get Mathias's input on this.  I don't know what is 
the right way to fix this problem.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-25 Thread Samuel Sadok
2018-01-23 17:31 GMT+01:00 Alan Stern :
> On Tue, 23 Jan 2018, Samuel Sadok wrote:
>
>> Thanks Alan,
>>
>> While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.
>>
>> $ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
>> $ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
>> $ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
>> $ modprobe usbmon
>> $ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
>> $ # press power button
>>
>> dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
>> usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6
>>
>> Both logs were collected at the same time so the timestamps should match.
>
> In fact they do, quite closely.  But there is a noticeable gap in the
> usbmon trace, between lines 197 and 198 (the timestamp jumps from
> 35923531 to 38925126) and there obviously was a lot of activity on bus
> 1 in between.

Yes you're right, that's inconvenient because the "failed to disable
LTM" message occurs exactly in this gap. Are gaps like this expected /
known? Any ideas on how to mitigate them? (I kinda want to avoid
soldering an external bus monitor to the mainboard :) )

>
>> Resume starts at dmesg line 1255.
>> As far as I can judge, the earliest indication of something going
>> wrong is line 1514:
>> [ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
>> unaddressed device
>> [ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
>> [ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
>> unaddressed device
>> [ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
>> [ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
>> unaddressed device
>> [ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM
>
> Yes, that's where the problem first seems to show up.  Apparently
> usb_disable_ltm() fails because it can't communicate with the device,
> and usb_reset_and_verify_device() then aborts because of that failure.
> It shouldn't do this; after all, one very good reason for resetting the
> device is because we're unable to communicate with it.
>
> You could try editing the source code for usb_reset_and_verify_device()
> in drivers/usb/core/hub.c, to see what happens if it doesn't give up
> after usb_disable_ltm() fails.

For the following logs I modified the "Failed to disable LTM" message
and commented out the goto in the line below. Also I used kernel
version 4.15-rc9 this time. I omitted upgrading the out-of-tree
modules broadcom-wl and nouveau-fw, so don't mind errors related to
that.
The SD card reader is still missing after resume.

dmesg & usbmon:
https://gist.github.com/anonymous/fc597612d5ebbac40d7bef9f8f0b579a

`usb_reset_and_verify_device()` is executed around dmesg line 1524.
Again, the usbmon log has a major gap around that time.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-23 Thread Alan Stern
On Tue, 23 Jan 2018, Samuel Sadok wrote:

> Thanks Alan,
> 
> While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.
> 
> $ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
> $ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
> $ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
> $ modprobe usbmon
> $ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
> $ # press power button
> 
> dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
> usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6
> 
> Both logs were collected at the same time so the timestamps should match.

In fact they do, quite closely.  But there is a noticeable gap in the
usbmon trace, between lines 197 and 198 (the timestamp jumps from
35923531 to 38925126) and there obviously was a lot of activity on bus
1 in between.

> Resume starts at dmesg line 1255.
> As far as I can judge, the earliest indication of something going
> wrong is line 1514:
> [ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> unaddressed device
> [ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
> [ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> unaddressed device
> [ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
> [ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
> unaddressed device
> [ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM

Yes, that's where the problem first seems to show up.  Apparently 
usb_disable_ltm() fails because it can't communicate with the device, 
and usb_reset_and_verify_device() then aborts because of that failure.
It shouldn't do this; after all, one very good reason for resetting the 
device is because we're unable to communicate with it.

You could try editing the source code for usb_reset_and_verify_device() 
in drivers/usb/core/hub.c, to see what happens if it doesn't give up 
after usb_disable_ltm() fails.

However, this does not explain the errors that occur later, notably 
around line 2008 or a little earlier:

[   42.819016] usb usb2-port4: not enabled, trying warm reset again...
[   42.819020] usb usb2-port4: Cannot enable. Maybe the USB cable is bad?
[   42.819110] xhci_hcd :00:14.0: Cannot set link state.
[   42.819125] usb usb2-port4: cannot disable (err = -32)
[   42.819180] usb usb2-port4: status 0341, change 0010, 5.0 Gb/s
[   42.819185] usb 2-4: USB disconnect, device number 2

Perhaps Mathias can take a look at this, and the error messages that
follow, and figure out what's going wrong.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-23 Thread Samuel Sadok
Thanks Alan,

While I was at it I also enabled debug logs for xhci_hcd and xhci_pci.

$ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control
$ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control
$ modprobe usbmon
$ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log
$ # press power button

dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341
usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6

Both logs were collected at the same time so the timestamps should match.

Resume starts at dmesg line 1255.
As far as I can judge, the earliest indication of something going
wrong is line 1514:
[ 36.087176] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087180] usb 2-4: Disable of device-initiated U1 failed.
[ 36.087212] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087224] usb 2-4: Disable of device-initiated U2 failed.
[ 36.087226] xhci_hcd :00:14.0: xHCI xhci_urb_enqueue called with
unaddressed device
[ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM

2018-01-22 22:39 GMT+01:00 Alan Stern :
> On Mon, 22 Jan 2018, Samuel Sadok wrote:
>
>> As requested by Chen Yu on bugzilla, I added no_console_suspend=1 to
>> the kernel cmdline and updated to the (roughly) latest kernel,
>> 4.14.14-1-ARCH.
>>
>> Here's the new log:
>> https://pastebin.com/buhEMyKk
>>
>> From a quick comparison with the old log I don't see any significant
>> difference, except that these messages
>>
>> [   32.671107] xhci_hcd :00:14.0: Cannot set link state.
>> [   32.671112] usb usb2-port4: cannot disable (err = -32)
>> [   32.671153] xhci_hcd :00:14.0: Cannot set link state.
>> [   32.671157] usb usb2-port4: cannot disable (err = -32)
>>
>> don't repeat forever.
>
> It would be helpful to include debugging information in your log.
> Before starting the suspend, do:
>
> echo module usbcore =p >/sys/kernel/debug/dynamic_debug/control
>
> You could also collect a usbmon trace while running the test; it might
> help too.
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-22 Thread Alan Stern
On Mon, 22 Jan 2018, Samuel Sadok wrote:

> As requested by Chen Yu on bugzilla, I added no_console_suspend=1 to
> the kernel cmdline and updated to the (roughly) latest kernel,
> 4.14.14-1-ARCH.
> 
> Here's the new log:
> https://pastebin.com/buhEMyKk
> 
> From a quick comparison with the old log I don't see any significant
> difference, except that these messages
> 
> [   32.671107] xhci_hcd :00:14.0: Cannot set link state.
> [   32.671112] usb usb2-port4: cannot disable (err = -32)
> [   32.671153] xhci_hcd :00:14.0: Cannot set link state.
> [   32.671157] usb usb2-port4: cannot disable (err = -32)
> 
> don't repeat forever.

It would be helpful to include debugging information in your log.  
Before starting the suspend, do:

echo module usbcore =p >/sys/kernel/debug/dynamic_debug/control

You could also collect a usbmon trace while running the test; it might 
help too.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] SD card reader disappears after suspend

2018-01-22 Thread Samuel Sadok
As requested by Chen Yu on bugzilla, I added no_console_suspend=1 to
the kernel cmdline and updated to the (roughly) latest kernel,
4.14.14-1-ARCH.

Here's the new log:
https://pastebin.com/buhEMyKk

>From a quick comparison with the old log I don't see any significant
difference, except that these messages

[   32.671107] xhci_hcd :00:14.0: Cannot set link state.
[   32.671112] usb usb2-port4: cannot disable (err = -32)
[   32.671153] xhci_hcd :00:14.0: Cannot set link state.
[   32.671157] usb usb2-port4: cannot disable (err = -32)

don't repeat forever.

2018-01-19 10:52 GMT+01:00 Samuel Sadok :
> Hi!
> I just noticed I hit the wrong reply button.
> Here's what I replied to Alexander:
>
> I just tried `echo 0 > /sys/bus/usb/devices/2-4/power/persist`,
> however the card reader is still missing after suspend/resume.
>
> Here's the new kernel log:
> [ 2241.861114] pcieport :00:1c.3: Enabling MPC IRBNCE
> [ 2241.861120] pcieport :00:1c.3: Intel PCH root port ACS workaround 
> enabled
> [ 2241.915528] PM: suspend entry (deep)
> [ 2241.915529] PM: Syncing filesystems ... done.
> [ 2241.925562] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [ 2241.926816] OOM killer disabled.
> [ 2241.926817] Freezing remaining freezable tasks ... (elapsed 0.000
> seconds) done.
> [ 2241.927583] Suspending console(s) (use no_console_suspend to debug)
> [ 2241.927818] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> [ 2241.931991] sd 0:0:0:0: [sda] Stopping disk
> [ 2244.178848] thunderbolt :08:00.0: suspending...
> [ 2244.179533] thunderbolt :08:00.0: suspend finished
> [ 2244.179535] thunderbolt :08:00.0: stopping RX ring 0
> [ 2244.179541] thunderbolt :08:00.0: disabling interrupt at
> register 0x38200 bit 12 (0x1001 -> 0x1)
> [ 2244.179547] thunderbolt :08:00.0: stopping TX ring 0
> [ 2244.179552] thunderbolt :08:00.0: disabling interrupt at
> register 0x38200 bit 0 (0x1 -> 0x0)
> [ 2244.179555] thunderbolt :08:00.0: control channel stopped
> [ 2245.127874] ACPI: Preparing to enter system sleep state S3
> [ 2245.147745] ACPI: EC: event blocked
> [ 2245.147746] ACPI: EC: EC stopped
> [ 2245.147746] PM: Saving platform NVS memory
> [ 2245.147751] Disabling non-boot CPUs ...
> [ 2245.176484] smpboot: CPU 1 is now offline
> [ 2245.218761] smpboot: CPU 2 is now offline
> [ 2245.261380] smpboot: CPU 3 is now offline
> [ 2245.289072] smpboot: CPU 4 is now offline
> [ 2245.319017] smpboot: CPU 5 is now offline
> [ 2245.345797] smpboot: CPU 6 is now offline
> [ 2245.362552] smpboot: CPU 7 is now offline
> [ 2245.393870] ACPI: Low-level resume complete
> [ 2245.393931] ACPI: EC: EC started
> [ 2245.393932] PM: Restoring platform NVS memory
> [ 2245.396492] Enabling non-boot CPUs ...
> [ 2245.396546] x86: Booting SMP configuration:
> [ 2245.396547] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [ 2245.476842]  cache: parent cpu1 should not be sleeping
> [ 2245.660075] CPU1 is up
> [ 2245.660120] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [ 2245.733856]  cache: parent cpu2 should not be sleeping
> [ 2245.908088] CPU2 is up
> [ 2245.908132] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [ 2245.977616]  cache: parent cpu3 should not be sleeping
> [ 2246.147615] CPU3 is up
> [ 2246.147661] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [ 2246.154023]  cache: parent cpu4 should not be sleeping
> [ 2246.154213] CPU4 is up
> [ 2246.154236] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [ 2246.218683]  cache: parent cpu5 should not be sleeping
> [ 2246.408703] CPU5 is up
> [ 2246.408730] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [ 2246.474887]  cache: parent cpu6 should not be sleeping
> [ 2246.691669] CPU6 is up
> [ 2246.691698] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [ 2246.762074]  cache: parent cpu7 should not be sleeping
> [ 2247.013633] CPU7 is up
> [ 2247.091979] ACPI: Waking up from system sleep state S3
> [ 2247.152327] pcieport :00:1c.0: Enabling MPC IRBNCE
> [ 2247.152330] pcieport :00:1c.0: Intel PCH root port ACS workaround 
> enabled
> [ 2247.152552] pcieport :07:04.0: quirk: waiting for thunderbolt
> to reestablish PCI tunnels...
> [ 2247.152568] pcieport :07:03.0: quirk: waiting for thunderbolt
> to reestablish PCI tunnels...
> [ 2247.152570] pcieport :07:05.0: quirk: waiting for thunderbolt
> to reestablish PCI tunnels...
> [ 2247.152633] pcieport :07:06.0: quirk: waiting for thunderbolt
> to reestablish PCI tunnels...
> [ 2247.168603] pcieport :00:1c.3: Enabling MPC IRBNCE
> [ 2247.168609] pcieport :00:1c.2: Enabling MPC IRBNCE
> [ 2247.168610] pcieport :00:1c.3: Intel PCH root port ACS workaround 
> enabled
> [ 2247.168616] pcieport :00:1c.2: Intel PCH root port ACS workaround 
> enabled
> [ 2247.168730] thunderbolt :08:00.0: control channel starting...
> [ 2247.168733] thunderbolt :08:00.0: starting TX ring 0
> [ 2247.168748] thunderbolt :08:00.0: enabling interrupt at
> 

Re: [BUG] SD card reader disappears after suspend

2018-01-19 Thread Samuel Sadok
Hi!
I just noticed I hit the wrong reply button.
Here's what I replied to Alexander:

I just tried `echo 0 > /sys/bus/usb/devices/2-4/power/persist`,
however the card reader is still missing after suspend/resume.

Here's the new kernel log:
[ 2241.861114] pcieport :00:1c.3: Enabling MPC IRBNCE
[ 2241.861120] pcieport :00:1c.3: Intel PCH root port ACS workaround enabled
[ 2241.915528] PM: suspend entry (deep)
[ 2241.915529] PM: Syncing filesystems ... done.
[ 2241.925562] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 2241.926816] OOM killer disabled.
[ 2241.926817] Freezing remaining freezable tasks ... (elapsed 0.000
seconds) done.
[ 2241.927583] Suspending console(s) (use no_console_suspend to debug)
[ 2241.927818] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 2241.931991] sd 0:0:0:0: [sda] Stopping disk
[ 2244.178848] thunderbolt :08:00.0: suspending...
[ 2244.179533] thunderbolt :08:00.0: suspend finished
[ 2244.179535] thunderbolt :08:00.0: stopping RX ring 0
[ 2244.179541] thunderbolt :08:00.0: disabling interrupt at
register 0x38200 bit 12 (0x1001 -> 0x1)
[ 2244.179547] thunderbolt :08:00.0: stopping TX ring 0
[ 2244.179552] thunderbolt :08:00.0: disabling interrupt at
register 0x38200 bit 0 (0x1 -> 0x0)
[ 2244.179555] thunderbolt :08:00.0: control channel stopped
[ 2245.127874] ACPI: Preparing to enter system sleep state S3
[ 2245.147745] ACPI: EC: event blocked
[ 2245.147746] ACPI: EC: EC stopped
[ 2245.147746] PM: Saving platform NVS memory
[ 2245.147751] Disabling non-boot CPUs ...
[ 2245.176484] smpboot: CPU 1 is now offline
[ 2245.218761] smpboot: CPU 2 is now offline
[ 2245.261380] smpboot: CPU 3 is now offline
[ 2245.289072] smpboot: CPU 4 is now offline
[ 2245.319017] smpboot: CPU 5 is now offline
[ 2245.345797] smpboot: CPU 6 is now offline
[ 2245.362552] smpboot: CPU 7 is now offline
[ 2245.393870] ACPI: Low-level resume complete
[ 2245.393931] ACPI: EC: EC started
[ 2245.393932] PM: Restoring platform NVS memory
[ 2245.396492] Enabling non-boot CPUs ...
[ 2245.396546] x86: Booting SMP configuration:
[ 2245.396547] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 2245.476842]  cache: parent cpu1 should not be sleeping
[ 2245.660075] CPU1 is up
[ 2245.660120] smpboot: Booting Node 0 Processor 2 APIC 0x4
[ 2245.733856]  cache: parent cpu2 should not be sleeping
[ 2245.908088] CPU2 is up
[ 2245.908132] smpboot: Booting Node 0 Processor 3 APIC 0x6
[ 2245.977616]  cache: parent cpu3 should not be sleeping
[ 2246.147615] CPU3 is up
[ 2246.147661] smpboot: Booting Node 0 Processor 4 APIC 0x1
[ 2246.154023]  cache: parent cpu4 should not be sleeping
[ 2246.154213] CPU4 is up
[ 2246.154236] smpboot: Booting Node 0 Processor 5 APIC 0x3
[ 2246.218683]  cache: parent cpu5 should not be sleeping
[ 2246.408703] CPU5 is up
[ 2246.408730] smpboot: Booting Node 0 Processor 6 APIC 0x5
[ 2246.474887]  cache: parent cpu6 should not be sleeping
[ 2246.691669] CPU6 is up
[ 2246.691698] smpboot: Booting Node 0 Processor 7 APIC 0x7
[ 2246.762074]  cache: parent cpu7 should not be sleeping
[ 2247.013633] CPU7 is up
[ 2247.091979] ACPI: Waking up from system sleep state S3
[ 2247.152327] pcieport :00:1c.0: Enabling MPC IRBNCE
[ 2247.152330] pcieport :00:1c.0: Intel PCH root port ACS workaround enabled
[ 2247.152552] pcieport :07:04.0: quirk: waiting for thunderbolt
to reestablish PCI tunnels...
[ 2247.152568] pcieport :07:03.0: quirk: waiting for thunderbolt
to reestablish PCI tunnels...
[ 2247.152570] pcieport :07:05.0: quirk: waiting for thunderbolt
to reestablish PCI tunnels...
[ 2247.152633] pcieport :07:06.0: quirk: waiting for thunderbolt
to reestablish PCI tunnels...
[ 2247.168603] pcieport :00:1c.3: Enabling MPC IRBNCE
[ 2247.168609] pcieport :00:1c.2: Enabling MPC IRBNCE
[ 2247.168610] pcieport :00:1c.3: Intel PCH root port ACS workaround enabled
[ 2247.168616] pcieport :00:1c.2: Intel PCH root port ACS workaround enabled
[ 2247.168730] thunderbolt :08:00.0: control channel starting...
[ 2247.168733] thunderbolt :08:00.0: starting TX ring 0
[ 2247.168748] thunderbolt :08:00.0: enabling interrupt at
register 0x38200 bit 0 (0x0 -> 0x1)
[ 2247.168750] thunderbolt :08:00.0: starting RX ring 0
[ 2247.168766] thunderbolt :08:00.0: enabling interrupt at
register 0x38200 bit 12 (0x1 -> 0x1001)
[ 2247.168770] thunderbolt :08:00.0: resuming...
[ 2247.168771] thunderbolt :08:00.0: resetting switch at 0
[ 2247.176388] thunderbolt :08:00.0: 0: resuming switch
[ 2247.176883] thunderbolt :08:00.0: resume finished
[ 2247.228522] pcieport :00:1c.4: Enabling MPC IRBNCE
[ 2247.228525] pcieport :00:1c.4: Intel PCH root port ACS workaround enabled
[ 2247.229197] usb usb1: root hub lost power or was reset
[ 2247.229200] usb usb2: root hub lost power or was reset
[ 2247.229265] ACPI: EC: event unblocked
[ 2247.229322] ACPI: button: The lid device is not compliant to SW_LID.
[ 2247.239446] sd 

Re: [BUG] SD card reader disappears after suspend

2018-01-01 Thread Alexander Kappner
I've seen similar issues caused by USB persist. Try disabling it (echo 0 > 
/sys/bus/usb/devices//power/persist), and then resuming. 

On 01/01/2018 10:53 PM, Samuel Sadok wrote:
> Hello!
> 
> I'm observing a bug where the most prominent symptom is that the
> built-in SD card reader disappears after waking up from suspend.
> 
> Laptop: MacBookPro11,3
> USB controller: Intel Corporation 8 Series/C220 Series Chipset Family
> USB xHCI (rev 05) (prog-if 30 [XHCI])
> SD card reader: Apple Card Reader 05ac:8406
> 
> The bug was reported previously on bugzilla but never followed up on:
> https://bugzilla.kernel.org/show_bug.cgi?id=111201
> 
> 
> Steps to reproduce:
> $ lsusb
> Bus 002 Device 002: ID 05ac:8406 Apple, Inc.
> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> Bus 001 Device 006: ID 05ac:8289 Apple, Inc.
> Bus 001 Device 002: ID 0a5c:4500 Broadcom Corp. BCM2046B1 USB 2.0 Hub
> (part of BCM2046 Bluetooth)
> Bus 001 Device 003: ID 05ac:0263 Apple, Inc. Apple Internal Keyboard /
> Trackpad (MacBook Retina)
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> $ sudo systemctl suspend
> [Wait 10s, then wake the system up]
> $ lsusb
> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> Bus 001 Device 006: ID 05ac:8289 Apple, Inc.
> Bus 001 Device 002: ID 0a5c:4500 Broadcom Corp. BCM2046B1 USB 2.0 Hub
> (part of BCM2046 Bluetooth)
> Bus 001 Device 003: ID 05ac:0263 Apple, Inc. Apple Internal Keyboard /
> Trackpad (MacBook Retina)
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> $ lsusb -vvv
> 
> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
> [long hang, on the order of three minutes]
> can't get hub descriptor, LIBUSB_ERROR_TIMEOUT (Resource temporarily
> unavailable)
> [and a couple of more errors]
> 
> Expected result:
> The output of both lsusb commands should be equal.
> 
> Actual results:
>  - The SD card reader (05ac:8406) is missing from the second lsusb.
>  - Running lsusb -vvv hangs for a long time
>  - lots of error messages in kernel log
> 
> Logs:
> https://gist.github.com/samuelsadok/6d7b3e3015d3370a92ed4702e4d3c4b5
> 
> I'm happy to provide more data, do tests or patch around the kernel if it 
> helps.
> 
> Best,
> Samuel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[BUG] SD card reader disappears after suspend

2018-01-01 Thread Samuel Sadok
Hello!

I'm observing a bug where the most prominent symptom is that the
built-in SD card reader disappears after waking up from suspend.

Laptop: MacBookPro11,3
USB controller: Intel Corporation 8 Series/C220 Series Chipset Family
USB xHCI (rev 05) (prog-if 30 [XHCI])
SD card reader: Apple Card Reader 05ac:8406

The bug was reported previously on bugzilla but never followed up on:
https://bugzilla.kernel.org/show_bug.cgi?id=111201


Steps to reproduce:
$ lsusb
Bus 002 Device 002: ID 05ac:8406 Apple, Inc.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 006: ID 05ac:8289 Apple, Inc.
Bus 001 Device 002: ID 0a5c:4500 Broadcom Corp. BCM2046B1 USB 2.0 Hub
(part of BCM2046 Bluetooth)
Bus 001 Device 003: ID 05ac:0263 Apple, Inc. Apple Internal Keyboard /
Trackpad (MacBook Retina)
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
$ sudo systemctl suspend
[Wait 10s, then wake the system up]
$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 006: ID 05ac:8289 Apple, Inc.
Bus 001 Device 002: ID 0a5c:4500 Broadcom Corp. BCM2046B1 USB 2.0 Hub
(part of BCM2046 Bluetooth)
Bus 001 Device 003: ID 05ac:0263 Apple, Inc. Apple Internal Keyboard /
Trackpad (MacBook Retina)
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
$ lsusb -vvv

Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
[long hang, on the order of three minutes]
can't get hub descriptor, LIBUSB_ERROR_TIMEOUT (Resource temporarily
unavailable)
[and a couple of more errors]

Expected result:
The output of both lsusb commands should be equal.

Actual results:
 - The SD card reader (05ac:8406) is missing from the second lsusb.
 - Running lsusb -vvv hangs for a long time
 - lots of error messages in kernel log

Logs:
https://gist.github.com/samuelsadok/6d7b3e3015d3370a92ed4702e4d3c4b5

I'm happy to provide more data, do tests or patch around the kernel if it helps.

Best,
Samuel
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html