On Mon, May 19, 2025 at 09:59:29PM -0700, Sathyanarayanan Kuppuswamy wrote:
> On 5/19/25 2:35 PM, Bjorn Helgaas wrote:
> > From: Jon Pan-Doh <pan...@google.com>
> >
> > Spammy devices can flood kernel logs with AER errors and slow/stall
> > execution. Add per-device ratelimits for AER correctable and uncorrectable
> > errors that use the kernel defaults (10 per 5s).
> >
> > There are two AER logging entry points:
> >
> >   - aer_print_error() is used by DPC and native AER
> >
> >   - pci_print_aer() is used by GHES and CXL
> >
> > The native AER aer_print_error() case includes a loop that may log details
> > from multiple devices. This is ratelimited by the union of ratelimits for
> > these devices, set by add_error_device(), which collects the devices. If
> > no such device is found, the Error Source message is ratelimited by the
> > Root Port or RCEC that received the ERR_* message.
> >
> > The DPC aer_print_error() case is currently not ratelimited.
>
> Can we also not rate limit fatal errors in AER driver?

In other words, only rate limit AER_CORRECTABLE and AER_NONFATAL for
AER? Seems plausible to me.
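
To make sure we mean the same thing, here is a minimal userspace sketch of
that policy. It is not the code from the patch and does not use the kernel's
ratelimit API: struct window_limit, struct model_dev, window_limit_allow()
and model_should_log() are hypothetical names made up for illustration, the
enum is local to the sketch, and the fixed-window counter is only an
approximation of "10 per 5s". The point is just that fatal errors bypass the
per-device limits while correctable and non-fatal errors each consume their
own budget.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local to this sketch; names follow the severities discussed above. */
enum aer_severity { AER_NONFATAL, AER_FATAL, AER_CORRECTABLE };

/* Hypothetical fixed-window limiter: at most 'burst' messages per interval. */
struct window_limit {
	uint64_t interval_ms;     /* window length, e.g. 5000 for 5s */
	unsigned int burst;       /* messages allowed per window, e.g. 10 */
	uint64_t window_start_ms; /* start of the current window */
	unsigned int printed;     /* messages allowed so far in this window */
};

/* Hypothetical per-device state: one limiter per rate-limited severity. */
struct model_dev {
	struct window_limit correctable;
	struct window_limit nonfatal;
};

/* Return true if one more message fits in the current window. */
static bool window_limit_allow(struct window_limit *rl, uint64_t now_ms)
{
	if (now_ms - rl->window_start_ms >= rl->interval_ms) {
		/* Window expired: start a new one with a fresh budget. */
		rl->window_start_ms = now_ms;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}

/* Fatal errors are always logged; the other severities are rate limited. */
static bool model_should_log(struct model_dev *dev, enum aer_severity sev,
			     uint64_t now_ms)
{
	switch (sev) {
	case AER_FATAL:
		return true;
	case AER_CORRECTABLE:
		return window_limit_allow(&dev->correctable, now_ms);
	default:
		return window_limit_allow(&dev->nonfatal, now_ms);
	}
}
```

With both limiters set to 10 per 5000 ms, a burst of 15 correctable errors
in the same window would get 10 through, and a fatal error arriving in that
window would still be logged.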