On 8/4/2025 5:17 PM, Breno Leitao wrote:
Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
does not rate limit, given this is fatal.
This prevents a kernel crash triggered by dereferencing a NULL pointer
in aer_ratelimit(), ensuring safer handling of PCI devices that lack
AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
which already performs this NULL check.
The enqueue side has lock to protect the ring, but the dequeue side no
lock held.
The kfifo_get in
static void aer_recover_work_func(struct work_struct *work)
{
...
while (kfifo_get(&aer_recover_ring, &entry)) {
...
}
should be replaced by
kfifo_out_spinlocked()
as
static void aer_recover_work_func(struct work_struct *work)
{
...
while (kfifo_out_spinlocked(&aer_recover_ring,
&entry,1`,&aer_recover_ring_lock )) {
...
}
Thanks,
Ethan
Signed-off-by: Breno Leitao <lei...@debian.org>
Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error
logging")
---
drivers/pci/pcie/aer.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 70ac661883672..b5f96fde4dcda 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev
*pdev,
static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
{
+ if (!dev->aer_info)
+ return 1;
+
switch (severity) {
case AER_NONFATAL:
return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
---
base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
change-id: 20250801-aer_crash_2-b21cc2ef0d00
Best regards,
--
Breno Leitao <lei...@debian.org>