Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
Sure. Done.

On 05/03/2018 07:04 PM, Mika Westerberg wrote:
> Could you then attach full dmesg of the failure without revert to the
> bugzilla bug?

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On Thu, May 03, 2018 at 06:53:13PM +0200, Esokrates wrote:
> Hi,
>
> Thanks very much for pointing out that commit!
> Indeed, reverting makes the problem go away!
> Also, interestingly it also makes the errors in
> https://bugzilla.kernel.org/show_bug.cgi?id=199557
> go away, tested using 4.16.7!

OK, good. Could you then attach full dmesg of the failure without revert
to the bugzilla bug?
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
Hi,

Thanks very much for pointing out that commit!
Indeed, reverting makes the problem go away!
Also, interestingly it also makes the errors in
https://bugzilla.kernel.org/show_bug.cgi?id=199557
go away, tested using 4.16.7!

On 05/03/2018 05:18 PM, Mika Westerberg wrote:
> Could you try to revert:
>
>   13d3047c8150 ("ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()")
>
> and see if the problem goes away?
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On Thu, May 03, 2018 at 04:13:05PM +0300, Mathias Nyman wrote:
> On 03.05.2018 12:30, Esokrates wrote:
> > Hi,
> >
> > Beginning with Linux 4.16-rc7 (4.16-rc6 was NOT affected), I
> > regularly find the following in dmesg (often it does not happen during
> > boot, but after suspend to RAM / resume):
> >
>
> Hi
>
> Can you try 'git bisect' to find the patch that causes the issues?
> (Adding Mika to Cc)

Could you try to revert:

  13d3047c8150 ("ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()")

and see if the problem goes away?

> > [ 216.443309] pcieport :00:1c.0: AER: Corrected error received: id=00e0
> > [ 216.443951] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
> > [ 216.444607] pcieport :00:1c.0: device [8086:9d10] error status/mask=0001/2000
> > [ 216.445300] pcieport :00:1c.0: [ 0] Receiver Error (First)
> > [ 216.517886] xhci_hcd :39:00.0: remove, state 4
> > [ 216.518573] usb usb4: USB disconnect, device number 1
> > [ 216.519438] xhci_hcd :39:00.0: USB bus 4 deregistered
> > [ 216.520320] xhci_hcd :39:00.0: xHCI host controller not responding, assume dead
> > [ 216.521908] xhci_hcd :39:00.0: remove, state 4
> > [ 216.522950] usb usb3: USB disconnect, device number 1
> > [ 216.523891] xhci_hcd :39:00.0: Host halt failed, -19
> > [ 216.524994] xhci_hcd :39:00.0: Host not accessible, reset failed.
> > [ 216.526153] xhci_hcd :39:00.0: USB bus 3 deregistered
> >
> > Running 4.16.0 I also observed
> >
> > [ 31.509282] ACPI: Waking up from system sleep state S3
> > [ 31.809429] ACPI: EC: interrupt unblocked
> > [ 31.828849] pci_raw_set_power_state: 62 callbacks suppressed
> > [ 31.828852] pcieport :01:00.0: Refused to change power state, currently in D3
> > [ 31.830422] pcieport :02:01.0: Refused to change power state, currently in D3
> > [ 31.830423] pcieport :02:02.0: Refused to change power state, currently in D3
> > [ 31.848853] pcieport :02:00.0: Refused to change power state, currently in D3
> > [ 31.852520] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> > [ 31.872529] thunderbolt :03:00.0: Refused to change power state, currently in D3
> > [ 31.933970] thunderbolt :03:00.0: control channel starting...
> > [ 31.937403] ACPI: EC: event unblocked
> > [ 31.938385] sd 2:0:0:0: [sda] Starting disk
> > [ 31.938886] ACPI: button: The lid device is not compliant to SW_LID.
> > [ 31.956574] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> > [ 31.956624] xhci_hcd :39:00.0: WARN: xHC restore state timeout
> > [ 31.956631] xhci_hcd :39:00.0: PCI post-resume error -110!
> > [ 31.956656] xhci_hcd :39:00.0: HC died; cleaning up
> > [ 31.956658] xhci_hcd :39:00.0: HC died; cleaning up
> > [ 31.956664] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
> > [ 31.956668] PM: Device :39:00.0 failed to resume async: error -110
> >
> > Furthermore, sometimes I also get a bunch of these errors before the
> > errors above:
> >
> > May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> > May 03 10:58:51 debian kernel: usb 1-3: device not accepting address 2, error -71
> >
> > None of this ever happened before 4.16-rc7. All kernels since 4.16-rc7
> > are reproducibly affected (suspend/resume helps trigger it), including
> > 4.17-rc3.
> >
> > My hardware is an XPS 13 9360 Kabylake; lsusb and lspci output are
> > attached.
> >
> > I am not subscribed to the mailing list, so please CC me when replying
> > to the list.
> >
> > Thanks very much!
Re: [REGRESSION] xHCI host controller not responding, assume dead on Dell XPS 13 9360
On 03.05.2018 12:30, Esokrates wrote:
> Hi,
>
> Beginning with Linux 4.16-rc7 (4.16-rc6 was NOT affected), I
> regularly find the following in dmesg (often it does not happen during
> boot, but after suspend to RAM / resume):
>

Hi

Can you try 'git bisect' to find the patch that causes the issues?
(Adding Mika to Cc)

> [ 216.443309] pcieport :00:1c.0: AER: Corrected error received: id=00e0
> [ 216.443951] pcieport :00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
> [ 216.444607] pcieport :00:1c.0: device [8086:9d10] error status/mask=0001/2000
> [ 216.445300] pcieport :00:1c.0: [ 0] Receiver Error (First)
> [ 216.517886] xhci_hcd :39:00.0: remove, state 4
> [ 216.518573] usb usb4: USB disconnect, device number 1
> [ 216.519438] xhci_hcd :39:00.0: USB bus 4 deregistered
> [ 216.520320] xhci_hcd :39:00.0: xHCI host controller not responding, assume dead
> [ 216.521908] xhci_hcd :39:00.0: remove, state 4
> [ 216.522950] usb usb3: USB disconnect, device number 1
> [ 216.523891] xhci_hcd :39:00.0: Host halt failed, -19
> [ 216.524994] xhci_hcd :39:00.0: Host not accessible, reset failed.
> [ 216.526153] xhci_hcd :39:00.0: USB bus 3 deregistered
>
> Running 4.16.0 I also observed
>
> [ 31.509282] ACPI: Waking up from system sleep state S3
> [ 31.809429] ACPI: EC: interrupt unblocked
> [ 31.828849] pci_raw_set_power_state: 62 callbacks suppressed
> [ 31.828852] pcieport :01:00.0: Refused to change power state, currently in D3
> [ 31.830422] pcieport :02:01.0: Refused to change power state, currently in D3
> [ 31.830423] pcieport :02:02.0: Refused to change power state, currently in D3
> [ 31.848853] pcieport :02:00.0: Refused to change power state, currently in D3
> [ 31.852520] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> [ 31.872529] thunderbolt :03:00.0: Refused to change power state, currently in D3
> [ 31.933970] thunderbolt :03:00.0: control channel starting...
> [ 31.937403] ACPI: EC: event unblocked
> [ 31.938385] sd 2:0:0:0: [sda] Starting disk
> [ 31.938886] ACPI: button: The lid device is not compliant to SW_LID.
> [ 31.956574] xhci_hcd :39:00.0: Refused to change power state, currently in D3
> [ 31.956624] xhci_hcd :39:00.0: WARN: xHC restore state timeout
> [ 31.956631] xhci_hcd :39:00.0: PCI post-resume error -110!
> [ 31.956656] xhci_hcd :39:00.0: HC died; cleaning up
> [ 31.956658] xhci_hcd :39:00.0: HC died; cleaning up
> [ 31.956664] dpm_run_callback(): pci_pm_resume+0x0/0xb0 returns -110
> [ 31.956668] PM: Device :39:00.0 failed to resume async: error -110
>
> Furthermore, sometimes I also get a bunch of these errors before the
> errors above:
>
> May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:49 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:50 debian kernel: usb 1-3: device descriptor read/64, error -71
> May 03 10:58:51 debian kernel: usb 1-3: device not accepting address 2, error -71
>
> None of this ever happened before 4.16-rc7. All kernels since 4.16-rc7
> are reproducibly affected (suspend/resume helps trigger it), including
> 4.17-rc3.
>
> My hardware is an XPS 13 9360 Kabylake; lsusb and lspci output are
> attached.
>
> I am not subscribed to the mailing list, so please CC me when replying
> to the list.
>
> Thanks very much!
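
The negative numbers in the logs in this thread (-19, -71, -110) are Linux errno values. The following is a minimal Python sketch for decoding them when skimming a saved dmesg, assuming the generic Linux errno numbering used on x86. The helper name decode_kernel_errors, the regular expression, and the choice of which message suffixes to match are illustrative assumptions, not part of any kernel tool.

```python
import re

# Generic Linux errno values for the codes seen in this thread. They are
# hardcoded because Python's errno module reports the numbering of the
# machine running the script, which may not be Linux.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    71: ("EPROTO", "Protocol error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Assumed message shapes: "error -71", "returns -110", "failed, -19".
ERR_RE = re.compile(r"(?:error|returns|failed,)\s+(-\d+)")


def decode_kernel_errors(lines):
    """Yield (errno_name, description, line) for lines with a known code."""
    for line in lines:
        match = ERR_RE.search(line)
        if match is None:
            continue
        code = -int(match.group(1))
        if code in LINUX_ERRNO:
            name, description = LINUX_ERRNO[code]
            yield name, description, line.strip()


if __name__ == "__main__":
    sample = [
        "xhci_hcd :39:00.0: Host halt failed, -19",
        "xhci_hcd :39:00.0: PCI post-resume error -110!",
        "usb 1-3: device descriptor read/64, error -71",
    ]
    for name, description, line in decode_kernel_errors(sample):
        print(f"{name} ({description}): {line}")
```

Read this way, the resume failure is a timeout (ETIMEDOUT), the later halt failure reports that the device is gone (ENODEV), and the descriptor reads fail with a protocol error (EPROTO).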