I agree with the initial analysis: nothing in device assignment limits
a guest to 32 devices, except where downstream distros have
intentionally added a limit for support purposes. The issue here is
that the host hit a PCIe Downstream Port Containment (DPC)
uncorrectable error, apparently taking at least a sub-hierarchy of the
PCIe topology offline. This looks more likely to be a hardware issue
than a software one. It may be possible to mask the issue by unbinding
the interconnect devices in the affected sub-hierarchy from the dpc
driver.
It might also be interesting to test with a subset of devices to
understand whether specific devices trigger the spurious DPC errors. It
may be only a subset, or a single device, that triggers them, or
perhaps it is the succession of bus resets for GPU assignment that
triggers the fault. The system firmware logs may provide additional
information regarding the source(s) of the fault.
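
If it helps, the subset testing could be organized as a simple
bisection. The sketch below is only an illustration under stated
assumptions: find_culprit and the triggers_fault callback are
hypothetical names, and triggers_fault stands in for whatever manual or
scripted step starts the guest with the given devices assigned and
reports whether a DPC error was seen. It assumes a single misbehaving
device and a reproducible fault; if the fault instead comes from the
number of bus resets, no single device reproduces it and the function
returns None.

```python
from typing import Callable, List, Optional, Sequence


def find_culprit(
    devices: Sequence[str],
    triggers_fault: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect `devices` to find one device whose assignment triggers the fault.

    `triggers_fault` is a hypothetical, caller-supplied test: it should
    assign the given devices to a guest and return True if a DPC error
    was observed. Returns None when no single device reproduces it.
    """
    candidates: List[str] = list(devices)
    # Nothing to bisect unless the full set reproduces the fault.
    if not candidates or not triggers_fault(candidates):
        return None
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the fault.
        if triggers_fault(first_half):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    # Confirm that the remaining device reproduces the fault on its own.
    if not triggers_fault(candidates):
        return None
    return candidates[0]
```

A None result with the full set still failing would point away from a
single bad device and toward the cumulative effect of the resets, or
toward an interaction between several devices.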
Not able to passthrough > 32 PCIe devices to a KVM Guest