On Mon, Aug 10, 2020 at 08:34:36PM +0000, Mikolaj Kucharski wrote:
> >Synopsis: LTE mini-PCIe modem not showing up after reboot
> >Category: kernel
> >Environment:
> System : OpenBSD 6.7
> Details : OpenBSD 6.7-current (GENERIC.MP) #15: Sun Aug 9 17:48:40
> MDT 2020
>
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> I have a PC Engines APU2 board with a Sierra Wireless MC7455 LTE modem on
> mini-PCIe.
> When I cold boot the system, the device is properly detected:
>
> uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices
> Hub" rev 2.00/0.18 addr 2
> umsm0 at uhub2 port 4 configuration 1 interface 0 "Sierra Wireless,
> Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev
> 2.10/0.06 addr 3
> ucom0 at umsm0
> umsm1 at uhub2 port 4 configuration 1 interface 2 "Sierra Wireless,
> Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev
> 2.10/0.06 addr 3
> ucom1 at umsm1
> umsm2 at uhub2 port 4 configuration 1 interface 3 "Sierra Wireless,
> Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev
> 2.10/0.06 addr 3
> ucom2 at umsm2
> umsm3 at uhub2 port 4 configuration 1 interface 8 "Sierra Wireless,
> Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev
> 2.10/0.06 addr 3
> ucom3 at umsm3
> umsm4 at uhub2 port 4 configuration 1 interface 10 "Sierra Wireless,
> Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev
> 2.10/0.06 addr 3
> ucom4 at umsm4
>
> However, when I reboot OpenBSD, the Sierra Wireless card is no longer
> detected by the OS and is no longer visible from the system:
>
> uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices
> Hub" rev 2.00/0.18 addr 2
>
> >How-To-Repeat:
> Cold boot the APU2 with a 1199:9071 Sierra Wireless card, then reboot the
> OS; the card is missing.
>
> >Fix:
> Unknown. I've looked around and found the following resources:
>
> - OpenBSD misc, broken EHCI USB on AMD chipset?
> https://marc.info/?t=151192838800001&r=1&w=2
>
> - PCI: Workaround AMD EHCI controller PME bug
> https://patchwork.kernel.org/patch/9797105/
>
> - usb: host: ehci: workaround PME bug on AMD EHCI controller
> https://patchwork.kernel.org/patch/9783041/
>
> - Linux kernel, pci_fixup_amd_ehci_pme() function
> https://github.com/torvalds/linux/blob/fc80c51fd4b23ec007e88d4c688f2cac1b8648e7/arch/x86/pci/fixup.c#L580-L593
>
> - SB700 Family Product Errata
> https://www.amd.com/system/files/TechDocs/46837.pdf
>
> - AMD SB700/710/750 Register Programming Requirements
> http://ftp.loongnix.org/doc/02data%20sheet/loongson3a/SB/42413_sb7xx_rpr_pub_1.00.pdf

After reading errata #11: Enabling EHCI Dynamic Clock Gating May Cause
Bug Code 0xFE System Error

Description

A system error has been observed during extended S4 Hibernation or
Reboot cycling using the MS PWRTST or other similar utility. The arbiter
in the Southbridge that controls the downstream memory traffic to the
USB controller does not fully support the EHCI clock gating feature. If
the clock gating feature in the EHCI controller is enabled, the arbiter
may transfer incorrect memory data to the EHCI controller and cause the
controller to not respond back correctly to the USB driver or the
device. In such cases, the USB driver may timeout and cause the
operating system to report the system error.

Potential Effect on System

The problem may present itself as a system halt with an operating system
stop error message with bug check code related to a USB driver failure.
The typical operating system error message is BUGCODE_USB_DRIVER bug
check value of 0x000000FE. The system error occurs mostly if there are
USB devices connected to the system. The failure is intermittent and the
failure rate may vary from one system to another. On most systems the
failure has been observed to occur after a very large number of reboot
cycles (typically more than 1000 cycles). On a small number of systems
the issue may be seen within two hundred reboot cycles.

Suggested Workaround

A BIOS workaround is described in section 6.17.1 of the SB7xx Register
Programming Requirements document (PID # 42413). The workaround involves
disabling the EHCI Dynamic Clock gating Power Management feature in the
USB EHCI controller. The feature, when disabled, impacts the total
Southbridge power consumption by less than 10 mW.

6.17 EHCI Dynamic Clock Gating Feature

  ASIC Rev:           All Revs SB7x0
  Register Settings:  EHCI_BAR 0xBC Bit[12] = 0
  Function/Comment:   For normal operation, the clock gating feature
                      must be disabled. At system reset, this bit is
                      set to "1". So, BIOS needs to program this bit
                      to "0".

EHCI clock gating setting must be programmed in both the EHCI host
controllers. Bus-0, dev-18 fun 2 and Bus 0 dev-19 fun-2

A BIOS workaround is required to disable the EHCI Dynamic clock gating
on resume from S5/S4.
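
The register setting quoted above is a plain read-modify-write that
clears one bit. As a minimal sketch only: the macro and function names
below are made up for illustration, and only the 0xBC offset and bit 12
come from the RPR text.

```c
#include <stdint.h>

/* Offset of the clock gating register within EHCI_BAR (RPR 6.17). */
#define CLKGATE_REG	0xbc
/* Bit[12]: set to 1 at system reset, must be 0 for normal operation. */
#define CLKGATE_ENABLE	(UINT32_C(1) << 12)

/*
 * Hypothetical helper: given the current 32-bit register value, return
 * the value to write back with dynamic clock gating disabled.  All
 * other bits are preserved.
 */
static uint32_t
clkgate_disable(uint32_t value)
{
	return value & ~CLKGATE_ENABLE;
}
```

In a driver the value would be read from, and written back to, offset
CLKGATE_REG of the EHCI memory BAR, once for each affected controller.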

Does this diff help?

Index: dev/pci/ehci_pci.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/ehci_pci.c,v
retrieving revision 1.31
diff -u -p -u -r1.31 ehci_pci.c
--- dev/pci/ehci_pci.c 2 May 2019 20:28:46 -0000 1.31
+++ dev/pci/ehci_pci.c 30 Aug 2020 00:22:00 -0000
@@ -67,6 +67,8 @@ struct ehci_pci_softc {
 int ehci_sb700_match(struct pci_attach_args *pa);
+#define EHCI_HUDSON2_CLKGATE_REG	0xbc
+#define EHCI_HUDSON2_CLKGATE_ENABLE	(1 << 12)
 #define EHCI_SBx00_WORKAROUND_REG	0x50
 #define EHCI_SBx00_WORKAROUND_ENABLE	(1 << 3)
 #define EHCI_VT6202_WORKAROUND_REG	0x48
@@ -131,6 +133,17 @@ ehci_pci_attach(struct device *parent, s
 	/* Handle quirks */
 	switch (PCI_VENDOR(pa->pa_id)) {
+	case PCI_VENDOR_AMD:
+		if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_HUDSON2_EHCI) {
+			pcireg_t value;
+
+			/* disable dynamic clock gating */
+			value = EREAD4(&sc->sc, EHCI_HUDSON2_CLKGATE_REG);
+			value &= ~EHCI_HUDSON2_CLKGATE_ENABLE;
+			EWRITE4(&sc->sc, EHCI_HUDSON2_CLKGATE_REG, value);
+		}
+		break;
+
 	case PCI_VENDOR_ATI:
 		if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_ATI_SB600_EHCI ||
 		    (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_ATI_SB700_EHCI &&