Public bug reported:

(This report was AI generated. I read it and it looks reasonable to me,
but I have no expertise in kernel debugging.)

== Summary ==

Kernel 6.17.0-1021-oem introduced a regression in the nouveau driver that
causes a NULL pointer dereference during boot on the NVIDIA GB207
(Blackwell) GPU (PCI ID 10de:2db9). The previous kernel 6.17.0-1020-oem is
unaffected.
The regression persists through 6.17.0-1025-oem (latest as of 2026-06-09).

Observed severity varies by kernel version. In the worst case, memory
corruption from the initial oops cascades into a full system freeze:

- 6.17.0-1023-oem: kernel panic on shutdown, system failed to boot on
  subsequent attempts
- 6.17.0-1024-oem: recoverable kernel oops (no panic), system boots via
  Intel iGPU but NVIDIA dGPU fails to initialize; oops messages flood all
  TTYs and terminal emulators (including tmux sessions)
- 6.17.0-1025-oem: initial oops cascades into full system freeze requiring
  hard reset. Observed sequence:
    1. NULL pointer dereference in bit_entry (same as previous kernels)
    2. Second oops (Oops #2) in idempotent_init_module — kernel already
       tainted G+D (DIE), memory corrupted by first oops
    3. CPU#12 soft lockup: modprobe stuck for 26s+ on
       native_queued_spin_lock_slowpath (spinlock deadlock)
    4. System completely unresponsive, required hard reset

Note: 6.17.0-1024 and 6.17.0-1025 contain identical kernel patches
(1025 is a packaging resync only). The difference in observed severity
between boots appears to be timing-dependent.

Workaround: reverted to 6.17.0-1020-oem, which runs stably.

== Hardware ==

Ubuntu 24.04.4 LTS (Noble Numbat)
Dell Inc. Dell Pro Max 16 MC16250 (BIOS 1.13.1 03/31/2026)
NVIDIA GPU: 10de:2db9 (GB207, Blackwell)

== Affected kernels ==

6.17.0-1021-oem through 6.17.0-1025-oem

== Working kernel ==

6.17.0-1020-oem (currently running as workaround)

Note: on 6.17.0-1020-oem, nouveau also fails to initialize (GSP RPC timeout,
-ETIMEDOUT in r535_gsp_msgq_wait), but the failure is non-fatal — only a
KERNEL WARNING is emitted, no oops occurs, and the system falls back to i915
(Intel iGPU). This appears to be a pre-existing issue unrelated to the fwsec
regression.
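
The version range above can be checked against a running system. The
following is a minimal Python sketch, not part of the original report: the
function name is_affected and the constants are hypothetical, and the upper
bound only reflects the latest release tested here (1025), so later releases
are reported as "not in reported range" rather than as fixed.

```python
import platform
import re

# Range taken from this report: ABI numbers 1021 through 1025 of 6.17.0-*-oem.
AFFECTED_FIRST = 1021
AFFECTED_LAST = 1025  # latest release tested in this report; later ones are unknown


def is_affected(release: str) -> bool:
    """Return True if `release` (as printed by `uname -r`) is in the reported range."""
    m = re.fullmatch(r"6\.17\.0-(\d+)-oem", release)
    if m is None:
        # Not a linux-oem-6.17 kernel; this report makes no claim about it.
        return False
    return AFFECTED_FIRST <= int(m.group(1)) <= AFFECTED_LAST


if __name__ == "__main__":
    release = platform.release()
    status = "affected" if is_affected(release) else "not in reported range"
    print(release, status)
```

For example, is_affected("6.17.0-1020-oem") is False and
is_affected("6.17.0-1024-oem") is True.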

== Kernel Oops (6.17.0-1024-oem) ==

BUG: kernel NULL pointer dereference, address: 00000000000000cc
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 2 UID: 0 PID: 1021 Comm: (udev-worker) Not tainted 6.17.0-1024-oem #24-Ubuntu PREEMPT(voluntary)
Hardware name: Dell Inc. Dell Pro Max 16 MC16250/0CFP7M, BIOS 1.13.1 03/31/2026
RIP: 0010:bit_entry+0x15/0x110 [nouveau]
Call Trace:
 nvbios_pmuTe+0x4e/0x100 [nouveau]
 nvbios_pmuEp+0x49/0xd0 [nouveau]
 nvkm_gsp_fwsec_init+0x70/0x2c0 [nouveau]
 nvkm_gsp_fwsec_sb_ctor+0x21/0x30 [nouveau]
 r535_gsp_rm_boot_ctor+0x25/0x110 [nouveau]
 r535_gsp_oneinit+0x264/0x320 [nouveau]
 gh100_gsp_oneinit+0x2cf/0x440 [nouveau]
 nvkm_gsp_oneinit+0x1f/0x40 [nouveau]
 ...
 nouveau_drm_probe+0xc6/0x220 [nouveau]
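
To compare logs from different boots, the fault address, the faulting symbol
and the call-trace symbols can be pulled out of an oops like the one above.
The following is a minimal Python sketch, not part of the original report:
parse_oops, FRAME_RE and SAMPLE are hypothetical names, and the regular
expression only assumes the "symbol+0xOFF/0xSIZE [module]" layout seen in
this trace.

```python
import re

# Matches "symbol+0xOFF/0xSIZE [module]", e.g. "bit_entry+0x15/0x110 [nouveau]".
FRAME_RE = re.compile(
    r"(?P<sym>[A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_oops(text: str) -> dict:
    """Extract the fault address, faulting symbol, and call-trace symbols."""
    result = {"address": None, "rip": None, "frames": []}
    in_trace = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("BUG:") and "address:" in line:
            # The address is printed as bare hex digits after "address:".
            result["address"] = int(line.rsplit("address:", 1)[1].strip(), 16)
        elif line.startswith("RIP:"):
            m = FRAME_RE.search(line)
            if m:
                result["rip"] = m.group("sym")
        elif line.startswith("Call Trace:"):
            in_trace = True
        elif in_trace:
            m = FRAME_RE.search(line)
            if m:
                result["frames"].append(m.group("sym"))
    return result


# Excerpt copied from the oops in this report.
SAMPLE = """\
BUG: kernel NULL pointer dereference, address: 00000000000000cc
RIP: 0010:bit_entry+0x15/0x110 [nouveau]
Call Trace:
 nvbios_pmuTe+0x4e/0x100 [nouveau]
 nvbios_pmuEp+0x49/0xd0 [nouveau]
 nvkm_gsp_fwsec_init+0x70/0x2c0 [nouveau]
"""

info = parse_oops(SAMPLE)
print(hex(info["address"]), info["rip"], info["frames"])
```

An oops whose faulting symbol is bit_entry and whose frames include
nvkm_gsp_fwsec_init matches the signature reported here. The non-fatal GSP
RPC timeout seen on 6.17.0-1020-oem goes through r535_gsp_msgq_wait instead.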

== Root cause ==

Upstream commit da67179e5538 ("drm/nouveau/gsp: Allocate fwsec-sb at boot")
was cherry-picked into linux-oem-6.17 in version 6.17.0-1021. This commit
makes fwsec-sb allocation unconditional, but Ada Lovelace and newer platforms
have a NULL nvkm_bios pointer at that point, triggering the crash.

== Fix available ==

The fix is already upstream: "drm/nouveau: don't attempt fwsec on sb on newer
platforms" by Dave Airlie, merged into drm-misc-fixes on 2026-01-07, tagged
Cc: [email protected] # v6.16+

Upstream mailing list:
https://www.mail-archive.com/[email protected]/msg50916.html

The fix is not present in linux-oem-6.17 as of 6.17.0-1025.25.
Please cherry-pick this fix into linux-oem-6.17.

** Affects: linux-oem-6.17 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: kernel-bug

https://bugs.launchpad.net/bugs/2156019

Title:
  linux-oem-6.17 6.17.0-1021+: nouveau kernel NULL pointer dereference
  on boot (regression from 6.17.0-1020)
