Public bug reported:

Repeatable kernel panic: a NULL-pointer WRITE in __run_timers, raised from the
timer softirq while expiring a wheel timer. The faulting pointer is an
already-unlinked list node: RAX/R12 = 0xdead000000000122 (LIST_POISON2) and
CR2 = 0, i.e. the timer wheel walks a poisoned/freed hlist entry. This is a
use-after-free / double-unlink in the timer wheel. The crash always lands in
the idle task (swapper/N), in <IRQ> from the APIC timer.
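
One way the register state above could arise, sketched as a toy Python model
rather than kernel code. It assumes the timer-wheel unlink behaves as in
mainline kernel/time/timer.c (after a detach, entry->pprev = NULL and
entry->next = LIST_POISON2); this has not been checked against the Ubuntu
7.0.0 source, and the class and function names are illustrative only.

```python
# Toy model of an hlist unlink. "pprev" is modelled as (owner, field), i.e. the
# location of the pointer that points at this node; None stands for NULL.
LIST_POISON2 = 0xdead000000000122


class NullWrite(Exception):
    """Raised where real code would execute '*pprev = next' with pprev == NULL."""

    def __init__(self, value):
        super().__init__("store through NULL pprev")
        self.value = value  # the value that would have been stored


class Head:
    def __init__(self):
        self.first = None


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.pprev = None


def hlist_add_head(node, head):
    first = head.first
    node.next = first
    if first is not None:
        first.pprev = (node, "next")
    head.first = node
    node.pprev = (head, "first")


def detach_timer(node):
    nxt, pprev = node.next, node.pprev
    if pprev is None:
        raise NullWrite(nxt)          # *pprev = next, with pprev == NULL
    owner, field = pprev
    setattr(owner, field, nxt)        # *pprev = next
    if isinstance(nxt, Node):
        nxt.pprev = pprev             # next->pprev = pprev
    node.pprev = None                 # assumed: pprev cleared on detach
    node.next = LIST_POISON2          # assumed: next poisoned on detach
```

In this model a second detach of the same node attempts to store LIST_POISON2
through a NULL pprev, which lines up with "mov %rax,(%rdx)" with
RAX = LIST_POISON2 and RDX = 0. It illustrates the double-unlink reading only;
it does not identify which code path unlinks the timer twice.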

It is 100% reproducible on this machine (steps below), usually within seconds
to a few minutes. It has occurred 8 times between 2026-06-14 18:56 and
2026-06-16 00:03.

Environment:
- Ubuntu 26.04 LTS
- Kernel 7.0.0-22-generic #22-Ubuntu (package 7.0.0-22.22)
- ALSO reproduces on 7.0.0-15-generic (7.0.0-15.15) — not a recent regression
- AMD Ryzen 9 5950X (16C/32T); Gigabyte X570S AORUS MASTER, BIOS F8i (2025-09-03)
- NVIDIA RTX 3090, open kernel module 595.71.05 (taints O; NOT in the fault path)

Signature:
- BUG: kernel NULL pointer dereference, address: 0000000000000000
- Oops: 0002 (supervisor write to a not-present page)
- RIP: __run_timers+0x1d8/0x2c0
- RAX = R12 = dead000000000122 (LIST_POISON2); CR2 = 0
- faulting insn mov %rax,(%rdx) with RDX = 0
- Comm: swapper/N (idle)
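
The signature fields can be cross-checked with a few lines of Python. This is
a small sketch, assuming the standard x86 page-fault error-code bits and the
x86_64 poison offset 0xdead000000000000 (CONFIG_ILLEGAL_POINTER_VALUE); the
helper names are illustrative.

```python
# Assumed x86_64 value of POISON_POINTER_DELTA.
POISON_POINTER_DELTA = 0xdead000000000000
LIST_POISON1 = 0x100 + POISON_POINTER_DELTA
LIST_POISON2 = 0x122 + POISON_POINTER_DELTA


def decode_pf_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x01),            # 0 = page not present
        "write": bool(code & 0x02),              # 1 = write access
        "user": bool(code & 0x04),               # 0 = supervisor mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


def classify_pointer(value):
    """Name a register value if it is NULL or a list poison constant."""
    if value == 0:
        return "NULL"
    if value == LIST_POISON1:
        return "LIST_POISON1"
    if value == LIST_POISON2:
        return "LIST_POISON2"
    return "other"
```

For this report, decode_pf_error(0x0002) gives a supervisor write to a
not-present page, and classify_pointer(0xdead000000000122) gives LIST_POISON2.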

Two faulting paths, same root corruption:

(a) Default (tickless idle), remote/global expiry via timer-migration hierarchy:
    <IRQ>
    timer_expire_remote+0x52/0x90
    tmigr_handle_remote_cpu+0x10e/0x270
    tmigr_handle_remote_up+0x115/0x160
    tmigr_handle_remote+0xd5/0x140
    run_timer_softirq+0xeb/0x100
    handle_softirqs+0xe1/0x360

(b) With nohz=off, the remote/tmigr frames are gone but it STILL faults locally:
    <IRQ>
    call_timer_fn+0x30/0x170
    __run_timers+0x1af/0x2c0
    run_timer_softirq+0x8a/0x100
    handle_softirqs+0xe1/0x360
  => the wheel data is corrupted independently of the expiry path; removing the
     remote path just moves the fault to the local __run_timers -> call_timer_fn
     walk.
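
To compare the two backtraces mechanically, a sketch like the following can
parse "symbol+0xoff/0xsize" frames and list the symbols both traces share.
The trace strings are copied from paths (a) and (b) above; the function names
are illustrative.

```python
import re

# Matches frames such as "run_timer_softirq+0xeb/0x100".
FRAME_RE = re.compile(r"^\s*([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

TRACE_A = """
    <IRQ>
    timer_expire_remote+0x52/0x90
    tmigr_handle_remote_cpu+0x10e/0x270
    tmigr_handle_remote_up+0x115/0x160
    tmigr_handle_remote+0xd5/0x140
    run_timer_softirq+0xeb/0x100
    handle_softirqs+0xe1/0x360
"""

TRACE_B = """
    <IRQ>
    call_timer_fn+0x30/0x170
    __run_timers+0x1af/0x2c0
    run_timer_softirq+0x8a/0x100
    handle_softirqs+0xe1/0x360
"""


def parse_frames(text):
    """Return a list of (symbol, offset, size) tuples, in trace order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((m.group(1), int(m.group(2), 16), int(m.group(3), 16)))
    return frames


def common_symbols(frames_a, frames_b):
    """Symbols present in both traces, in the order they appear in frames_a."""
    in_b = {sym for sym, _, _ in frames_b}
    return [sym for sym, _, _ in frames_a if sym in in_b]
```

On these two traces the shared frames are run_timer_softirq and
handle_softirqs, reached at different offsets inside run_timer_softirq
(0xeb vs 0x8a), which fits the remote and local expiry branches.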

Reproducer:
- Firefox is the snap package; downloaded files are handed to a viewer through
  the xdg-document-portal FUSE mount (/run/user/<uid>/doc, fuse.portal).
- Download a file in Firefox, then Open it from the Firefox UI (PDF and ZIP
  both trigger).
- Within seconds to minutes a CPU's timer softirq hits the poisoned wheel entry
  and panics.
- Opening the same file via a non-sandboxed path (Files / xdg-open), which does
  NOT use the document-portal FUSE export, has NOT triggered it.
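
To confirm which route a given open takes, a sketch like this can map a file
path to its mount point and filesystem type from /proc/self/mounts-style
text. The sample mount table is hypothetical (device names, uid 1000 and
options are made up), and the parser ignores the \040 escaping that
/proc/mounts uses for spaces in mount points.

```python
# Hypothetical mount table for illustration only.
SAMPLE_MOUNTS = """\
/dev/nvme0n1p2 / ext4 rw,relatime 0 0
tmpfs /run/user/1000 tmpfs rw,nosuid,nodev 0 0
portal /run/user/1000/doc fuse.portal rw,nosuid,nodev 0 0
"""


def find_mount(path, mounts_text):
    """Return (mountpoint, fstype) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        prefix = mountpoint.rstrip("/") + "/"
        if path == mountpoint or path.startswith(prefix):
            if best is None or len(mountpoint) > len(best[0]):
                best = (mountpoint, fstype)
    return best


def uses_document_portal(path, mounts_text):
    """True if path resolves to a fuse.portal mount in the given mount table."""
    found = find_mount(path, mounts_text)
    return found is not None and found[1] == "fuse.portal"
```

On a live system the second argument would be the contents of
/proc/self/mounts and the path the one the viewer actually receives (for
example from /proc/<pid>/cmdline of the viewer process).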

Mitigations tried, all ineffective:
- sysctl kernel.timer_migration=0 — still crashes via tmigr_handle_remote.
- Boot 7.0.0-15-generic — crashes identically.
- Boot 7.0.0-22-generic with nohz=off — still crashes (local path, (b) above).
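
When recording which mitigation was active for a given crash, a small helper
along these lines can summarise the state from the contents of /proc/cmdline
and /proc/sys/kernel/timer_migration. It is a sketch only; the function names
are illustrative and the inputs are passed in as text so it can be run
against saved copies from a crash dump.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value or None}."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def mitigation_state(cmdline, timer_migration_text):
    """Summarise the two settings tried in this report."""
    params = parse_cmdline(cmdline)
    return {
        "nohz_off": params.get("nohz") == "off",
        "timer_migration": int(timer_migration_text.strip()),
    }
```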

Attachments: apport kernel .crash reports (embedded VmCoreDmesg), per-dump
panic dmesgs, and system info; see attached tmigr_crash_evidence_*.tar.gz.

Note: kernel is tainted W O (W = unrelated early-boot udev WARN, not in panic
path; O = out-of-tree NVIDIA module, not in the backtrace). Can attempt a repro
without the NVIDIA module on request.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "Crash evidence"
   https://bugs.launchpad.net/bugs/2156855/+attachment/5977685/+files/tmigr_crash_evidence_20260616_003135.tar.gz

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2156855

Title:
  linux 7.0.0-22: timer-wheel use-after-free — NULL write in
  __run_timers (RAX=LIST_POISON2) from timer softirq, panics in idle
  task

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2156855/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
