Flacusbigotis <[email protected]> writes:

> The kernel logs indicating issues in Bullseye include a warning of a "host
> failure" by xhci_hcd, and several write/read errors by the ax88179 ethernet
> driver/module for the card, as follows:
>
> Feb 22 17:22:53 server1 kernel: [ 1.380198] xhci_hcd 0000:1c:00.0: xHCI
> Host Controller
> Feb 22 17:22:53 server1 kernel: [ 1.380205] xhci_hcd 0000:1c:00.0: new
> USB bus registered, assigned bus number 5
> Feb 22 17:22:53 server1 kernel: [ 1.380209] xhci_hcd 0000:1c:00.0: Host
> supports USB 3.0 SuperSpeed
> Feb 22 17:22:53 server1 kernel: [ 1.380260] usb usb5: New USB device
> found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.10
> Feb 22 17:22:53 server1 kernel: [ 1.380261] usb usb5: New USB device
> strings: Mfr=3, Product=2, SerialNumber=1
> Feb 22 17:22:53 server1 kernel: [ 1.380263] usb usb5: Product: xHCI Host
> Controller
> Feb 22 17:22:53 server1 kernel: [ 1.380264] usb usb5: Manufacturer:
> Linux 5.10.0-11-amd64 xhci-hcd
> Feb 22 17:22:53 server1 kernel: [ 1.380265] usb usb5: SerialNumber:
> 0000:1c:00.0
> Feb 22 17:22:53 server1 kernel: [ 1.380396] hub 5-0:1.0: USB hub found
> Feb 22 17:22:53 server1 kernel: [ 1.380411] hub 5-0:1.0: 4 ports detected
> Feb 22 17:22:53 server1 kernel: [ 5.508457] ax88179_178a 5-1:1.0 eth0:
> register 'ax88179_178a' at usb-0000:1c:00.0-1, ASIX AX88179 USB 3.0 Gigabit
> Ethernet, 00:11:22:33:44:55
> Feb 22 17:23:25 server1 kernel: [ 39.576966] xhci_hcd 0000:1c:00.0:
> WARNING: Host System Error
> Feb 22 17:26:00 server1 kernel: [ 194.596335] ax88179_178a 5-1:1.0
> enx001122334455: Failed to read reg index 0x0002: -22

I am guessing that the random MAC address is a symptom of a failure to
read the permanent MAC from the USB ethernet controller, which in turn is
probably caused by one or more of these read errors. But I believe those
are only symptoms, and that the real error is that unspecified "Host
System Error".

I wonder if this could be related to some of the quirks that have been
added for this xhci controller since v4.19. There have been a few, since
the VL805 is used in the RPi4. Some of these might very well be
misunderstood and RPi related only. There is also an odd code path in
drivers/usb/host/pci-quirks.c where we select a different path on RPi
than on other systems because "things are taken care of by the board's
co-processor". I find that very suspicious.

And I must admit that my interest in this bug is because I'm worried that
the quirk I recently pushed could have unexpected side effects...

I have no clue, but the most likely cause is some power management issue.
Try disabling ASPM, e.g. by adding

  pcie_aspm=off

to the kernel command line. Or try disabling USB autosuspend, e.g. by
adding

  usbcore.autosuspend=-1

to the kernel command line. I do NOT suggest that you run with those
settings by default. They are only for testing, to try to narrow down the
problem.

It would also be interesting to know if removing the XHCI_LPM_SUPPORT
quirk would make a difference, since this was added to the VL805 between
v4.19 and v5.10 without anyone really knowing if it works. But I can't
figure out how to disable a device specific quirk like that without
patching the kernel. Anyone?


Bjørn
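
PS: after rebooting with either of the test settings mentioned above, it
may help to confirm that the kernel actually picked them up before drawing
conclusions. The following is only a rough sketch, not a polished tool. It
assumes a Linux system with /proc/cmdline readable and the usual module
parameter files under /sys/module. The helper names (parse_cmdline,
check_test_settings, read_text) are made up for this illustration, and the
parser just splits on whitespace, so it ignores quoted parameters
containing spaces.

```python
from pathlib import Path

# Parameters suggested for testing only, not for permanent use.
WANTED = {"pcie_aspm": "off", "usbcore.autosuspend": "-1"}


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {name: value} dict.

    Simplification: splits on whitespace only, so quoted values
    containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def check_test_settings(cmdline: str) -> dict:
    """Return {name: bool} telling whether each test parameter is set as suggested."""
    params = parse_cmdline(cmdline)
    return {name: params.get(name) == value for name, value in WANTED.items()}


def read_text(path: str):
    """Read a procfs/sysfs file, or return None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    for name, active in check_test_settings(cmdline).items():
        print(f"{name}: {'set' if active else 'not set'}")
    # Runtime values as reported by the kernel, if the files exist.
    print("ASPM policy:", read_text("/sys/module/pcie_aspm/parameters/policy"))
    print("USB autosuspend:", read_text("/sys/module/usbcore/parameters/autosuspend"))
```

Run it once on the unmodified boot and once after adding a parameter, and
compare the output together with the kernel log for the "Host System
Error" warning.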

