From: Vernon Yang <[email protected]>

[ Upstream commit 0a27bdb14b028fed30a10cec2f945c38cb5ca4fa ]

kzalloc(GFP_KERNEL) may return NULL, in which case any access to
aer_info->xxx results in a kernel panic. Check for allocation failure
and return early.

Signed-off-by: Vernon Yang <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---

LLM Generated explanations, may be completely bogus:

YES

**Why It Matters**
- Prevents a NULL pointer dereference and kernel panic during device
  enumeration when `kzalloc(GFP_KERNEL)` fails in AER initialization.
  This is a real bug users can hit under memory pressure and affects any
  kernel with `CONFIG_PCIEAER` enabled.

**Change Details**
- Adds a NULL check after allocating `dev->aer_info` and returns early
  on failure, resetting `dev->aer_cap` to keep state consistent:
  - drivers/pci/pcie/aer.c:395
  - drivers/pci/pcie/aer.c:396
  - drivers/pci/pcie/aer.c:397
- The dereferences that would otherwise panic immediately follow the
  allocation (ratelimit initialization), so without this guard a failed
  allocation leads to an immediate NULL pointer dereference:
  - drivers/pci/pcie/aer.c:401
  - drivers/pci/pcie/aer.c:403
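
The check described above can be modelled in userspace. This is only a
minimal sketch: `mock_pci_dev`, `mock_aer_info`, `mock_zalloc()` and the
`mock_fail_alloc` switch are hypothetical stand-ins invented for
illustration, not the kernel's structures or allocator, and the
ratelimit values are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mock_aer_info {
	int ratelimit_interval;
	int ratelimit_burst;
};

struct mock_pci_dev {
	unsigned short aer_cap;		/* 0 means "no AER capability" */
	struct mock_aer_info *aer_info;
};

/* When true, mock_zalloc() simulates an allocation failure. */
static bool mock_fail_alloc;

static void *mock_zalloc(size_t size)
{
	return mock_fail_alloc ? NULL : calloc(1, size);
}

/* Same shape as the patched init path: bail out before any dereference. */
static void mock_aer_init(struct mock_pci_dev *dev, unsigned short cap_offset)
{
	assert(dev != NULL);

	dev->aer_cap = cap_offset;
	if (!dev->aer_cap)
		return;

	dev->aer_info = mock_zalloc(sizeof(*dev->aer_info));
	if (!dev->aer_info) {
		dev->aer_cap = 0;	/* treat AER as unavailable */
		return;
	}

	/* Safe: aer_info is known to be non-NULL here (placeholder values). */
	dev->aer_info->ratelimit_interval = 5;
	dev->aer_info->ratelimit_burst = 10;
}
```

With `mock_fail_alloc` set, `mock_aer_init()` leaves `aer_cap == 0` and
`aer_info == NULL` instead of writing through a NULL pointer.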

**Consistency With AER Flows**
- Resetting `dev->aer_cap` to 0 on allocation failure is correct and
  keeps all AER-related code paths coherent:
  - Save/restore explicitly no-op when `aer_cap == 0`, avoiding config
    space accesses:
    - drivers/pci/pcie/aer.c:349
    - drivers/pci/pcie/aer.c:371
  - AER enablement and ECRC setup get skipped because AER is treated as
    unavailable:
    - drivers/pci/pcie/aer.c:417 (enable reporting)
    - drivers/pci/pcie/aer.c:420 (ECRC)
    - ECRC helpers themselves also gate on `aer_cap`:
      - drivers/pci/pcie/aer.c:164
      - drivers/pci/pcie/aer.c:188
- Sysfs attributes that unconditionally dereference `pdev->aer_info` are
  already hidden when `aer_info == NULL`:
  - Visibility gating for stats attrs checks `pdev->aer_info`:
    - drivers/pci/pcie/aer.c:632
  - Visibility gating for ratelimit attrs checks `pdev->aer_info`:
    - drivers/pci/pcie/aer.c:769
- AER initialization is called during capability setup for every device;
  avoiding a panic here is critical:
  - drivers/pci/probe.c:2671
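
The gating described in this section can be sketched the same way. This
is a userspace model under stated assumptions: `gate_pci_dev`,
`gate_save_state()` and `gate_attr_visible()` are hypothetical names
that only mimic the two kinds of guard listed above (`aer_cap == 0` and
`aer_info == NULL`); they are not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; field names are illustrative only. */
struct gate_aer_info {
	unsigned long total_cor_errs;
};

struct gate_pci_dev {
	unsigned short aer_cap;
	struct gate_aer_info *aer_info;
	int config_accesses;	/* counts simulated config space accesses */
};

/* Models save/restore: no config space access when aer_cap == 0. */
static void gate_save_state(struct gate_pci_dev *dev)
{
	assert(dev != NULL);

	if (!dev->aer_cap)
		return;

	dev->config_accesses++;
}

/* Models an is_visible-style callback: hide attrs when aer_info is NULL. */
static unsigned int gate_attr_visible(const struct gate_pci_dev *dev,
				      unsigned int mode)
{
	if (!dev->aer_info)
		return 0;

	return mode;
}
```

A device left with `aer_cap == 0` and `aer_info == NULL` by a failed
allocation is skipped by both guards in this model, so nothing
downstream dereferences the missing structure.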

**Risk and Side Effects**
- Impact is limited and defensive:
  - On allocation failure, AER features are disabled for that device
    (graceful degradation) instead of panicking.
  - No architectural changes; no ABI changes; minimal lines touched.
  - All later AER users already handle `aer_info == NULL` and/or
    `aer_cap == 0` via existing guards.
- Side effects are intentional and safe:
  - Port driver IRQ message number programming for AER is skipped if
    `aer_cap == 0`, consistent with AER being unavailable:
    - drivers/pci/pcie/portdrv.c:81
    - drivers/pci/pcie/portdrv.c:242

**Stable Criteria**
- Fixes a real crash bug that can affect users (OOM during enumeration
  or hotplug).
- Small, contained change in a single function.
- No new features or interfaces; no architectural churn.
- Very low regression risk due to consistent gating on
  `aer_cap`/`aer_info`.

Given the clear correctness and robustness benefits with minimal risk,
this is a strong candidate for backporting to stable trees.

 drivers/pci/pcie/aer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9d23294ceb2f6..3dba9c0c6ae11 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -383,6 +383,10 @@ void pci_aer_init(struct pci_dev *dev)
                return;
 
        dev->aer_info = kzalloc(sizeof(*dev->aer_info), GFP_KERNEL);
+       if (!dev->aer_info) {
+               dev->aer_cap = 0;
+               return;
+       }
 
        ratelimit_state_init(&dev->aer_info->correctable_ratelimit,
                             DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
-- 
2.51.0

