On Sun, Jan 25, 2026 at 10:25:51AM +0100, Lukas Wunner wrote:
> Correctable and Uncorrectable Error Status Registers on reporting agents
> are cleared upon PCI device enumeration in pci_aer_init() to flush past
> events.  They're cleared again when an error is handled by the AER driver.

Do you think pci_aer_init() is the right time to clear the error
status bits?  Most of those bits are sticky, so they're not cleared by
reset.

I'm thinking about the scenario where a PCIe error occurs and is captured
in the AER error status registers, but the system reboots before the
AER driver can log the error.  Since the bits are sticky, the new
kernel might have a chance to find and log the error that happened
with the previous kernel.

So I wonder if pci_aer_init() should just find the Capability and
alloc its buffers, and aer_probe() should look for existing errors and
log them before clearing them.

Of course enumeration will cause some errors (probably mostly
Unsupported Requests), and we wouldn't want to log all those.
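
To make the ordering concrete, here is a rough userspace model of
"log what is left over, then clear".  It is only a sketch, not kernel
code: struct fake_aer, fake_w1c() and log_and_clear_stale() are
made-up names, the register is a plain variable rather than config
space, and masking only Unsupported Request is an assumption about
what enumeration noise we would want to ignore.  The bit values are
the ones from include/uapi/linux/pci_regs.h.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit definitions as in include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNC_COMP_TIME	0x00004000	/* Completion Timeout */
#define PCI_ERR_UNC_UNSUP	0x00100000	/* Unsupported Request */
#define PCI_ERR_COR_RCVR	0x00000001	/* Receiver Error */

/* Hypothetical stand-in for one device's AER status registers */
struct fake_aer {
	uint32_t uncor_status;
	uint32_t cor_status;
};

/* Model write-1-to-clear semantics: each 1 bit written clears that bit */
static void fake_w1c(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

/*
 * Log errors left over from before this kernel booted, then clear
 * everything.  Returns how many status registers had something worth
 * logging.  Assumption: Unsupported Request is treated as enumeration
 * noise and cleared without being logged.
 */
static int log_and_clear_stale(struct fake_aer *aer)
{
	uint32_t uncor = aer->uncor_status;
	uint32_t cor = aer->cor_status;
	uint32_t uncor_report = uncor & ~(uint32_t)PCI_ERR_UNC_UNSUP;
	int logged = 0;

	if (uncor_report) {
		printf("stale uncorrectable errors: %#010x\n",
		       (unsigned int)uncor_report);
		logged++;
	}
	if (cor) {
		printf("stale correctable errors: %#010x\n",
		       (unsigned int)cor);
		logged++;
	}

	/* Clear exactly the bits we read, including the filtered ones */
	fake_w1c(&aer->uncor_status, uncor);
	fake_w1c(&aer->cor_status, cor);

	return logged;
}
```

With uncor_status = PCI_ERR_UNC_UNSUP | PCI_ERR_UNC_COMP_TIME and
cor_status = PCI_ERR_COR_RCVR, this logs two lines (the Completion
Timeout and the Receiver Error) and leaves both registers at zero.
With only PCI_ERR_UNC_UNSUP set, it logs nothing and still clears the
bit.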

Bjorn
