Another data point — this time the failure mode was different, but I
think it's the same underlying issue, now with a better-quantified root
cause.

I added a 1-second-resolution watcher that tracks Shmem from
`/proc/meminfo` and, on any jump of more than 300MB, takes a snapshot of
each process's `drm-total-gtt` and exported DMA-BUF fd count/size from
`/proc/<pid>/fdinfo/*` (xe driver fields, available without root since
kernel 5.19).
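
For anyone wanting to reproduce the measurement: the following is not the actual watcher script, only a minimal Python sketch of the parsing it implies. The function names (`shmem_kb`, `is_jump`, `drm_total_gtt_bytes`, `watch`) are made up for illustration, and the fdinfo value format (`<number> [KiB|MiB]`) is assumed from the kernel's DRM usage-stats documentation.

```python
import re
import time

JUMP_THRESHOLD_KB = 300 * 1024  # the >300MB trigger described above

_UNITS = {"": 1, "KiB": 1024, "MiB": 1024 ** 2}


def shmem_kb(meminfo_text):
    """Return the Shmem value (in kB) from the contents of /proc/meminfo."""
    m = re.search(r"^Shmem:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if m is None:
        raise ValueError("no Shmem line found")
    return int(m.group(1))


def is_jump(prev_kb, cur_kb, threshold_kb=JUMP_THRESHOLD_KB):
    """True if Shmem grew by more than the threshold between two samples."""
    return cur_kb - prev_kb > threshold_kb


def drm_total_gtt_bytes(fdinfo_text):
    """Return drm-total-gtt in bytes from one fdinfo file, or None if absent."""
    m = re.search(r"^drm-total-gtt:\s+(\d+)(?:\s+(KiB|MiB))?\s*$",
                  fdinfo_text, re.MULTILINE)
    if m is None:
        return None
    return int(m.group(1)) * _UNITS[m.group(2) or ""]


def watch(interval_s=1.0, iterations=None):
    """Poll /proc/meminfo and report each Shmem jump above the threshold."""
    prev = None
    n = 0
    while iterations is None or n < iterations:
        with open("/proc/meminfo") as f:
            cur = shmem_kb(f.read())
        if prev is not None and is_jump(prev, cur):
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} Shmem jump: {prev} kB -> {cur} kB")
        prev = cur
        n += 1
        time.sleep(interval_s)
```

Note that several fds of one process can refer to the same DRM client, so a real snapshot would probably need to de-duplicate per client before summing; the sketch deliberately parses a single fdinfo file only.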

Today (2026-06-15) gnome-shell did not SIGABRT — instead it hung
completely and was SIGKILLed by systemd after a timeout. Timeline:

- 10:10:22 - 10:14:57 (4m35s): gnome-shell's own `drm-total-gtt` grew from 
**113.8GB to 145.2GB** — about **6.85 GB/min**, continuously — while its 
exported DMA-BUF fd count grew from 303 to 365 (+62, ~13.5/min, each apparently 
a 32MB buffer). System-wide Shmem was only ~1-5GB during this window, well 
below OOM territory, so this growth is in GPU-driver-tracked buffer accounting, 
not yet visible as a Shmem spike.
- ~10:15:22 onward: `gsd-xsettings` started repeatedly failing with "Failed to get 
current display configuration state: Timeout was reached" every ~30s, which 
suggests gnome-shell's D-Bus interface was becoming unresponsive.
- 10:16:40: Shmem spiked 66MB -> 11.7GB (the same spike signature as my 
previous reports) and the kernel OOM killer fired (killed 
`evolution-source-registry`). At the same moment, gnome-shell's `drm-total-gtt` 
dropped from 145.2GB to 93.4GB — some of the buffer set was torn down/reclaimed 
under pressure.
- For the next ~108 seconds, gnome-shell produced zero journal output — fully 
hung.
- 10:18:29-34: `org.gnome.SettingsDaemon.Color.service` and then 
`[email protected]` failed with 'timeout', and systemd SIGKILLed 
gnome-shell and Xwayland. GDM restarted the session. The new gnome-shell's 
DMA-BUF fd count started at 5 (vs 365+ before), confirming the buffer 
accumulation resets on restart.
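
The per-minute rates in the first timeline entry can be re-derived from the start/end values quoted there. This is a small Python sketch of that arithmetic; the helper name `per_minute` is hypothetical, and the 32MB-per-fd figure is the rough estimate from the timeline, not a measured value.

```python
from datetime import datetime


def per_minute(start, end, t0, t1, fmt="%H:%M:%S"):
    """Average growth per minute between two same-day timestamps."""
    elapsed = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    return (end - start) / (elapsed.total_seconds() / 60)


# drm-total-gtt of gnome-shell, in GB
gtt_rate = per_minute(113.8, 145.2, "10:10:22", "10:14:57")

# exported DMA-BUF fd count
fd_rate = per_minute(303, 365, "10:10:22", "10:14:57")

# size of the newly exported buffers, assuming 32MB each (an estimate)
fd_growth_gb = (365 - 303) * 32 / 1024

print(f"GTT growth: {gtt_rate:.2f} GB/min")
print(f"fd growth:  {fd_rate:.1f} per min")
print(f"new DMA-BUF fds at 32MB each: {fd_growth_gb:.2f} GB")
```

If the 32MB estimate holds, the 62 new fds cover only about 1.9GB of the 31.4GB GTT growth, so the two counters are apparently not measuring the same set of buffers.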

I think this confirms the working theory I noted earlier: this is
fundamentally a Mutter/xe GEM-buffer-reference leak (gnome-shell
continuously accumulates GTT-bound buffer objects without bound, at
multi-GB/minute rates during normal use). Once the accumulation is large
enough it appears to trigger the system-wide Shmem spike + OOM, and
depending on timing gnome-shell either SIGABRTs in the gallium
fence/buffer-swap path (my earlier reports in this bug) or hangs
entirely and gets SIGKILLed by systemd (today). I'd guess both are
downstream symptoms of the same buffer-accounting leak in Mutter's
xe-based renderer.

This is a much faster leak rate than I'd previously measured: an April
investigation found ~10.7GB of leaked DMA-BUF objects accumulating over
8 days of uptime, whereas today's growth of >30GB in under 5 minutes is
nearly four orders of magnitude faster. Possibly the leak rate is bursty
and depends heavily on what's being rendered/composited. Happy to
provide the full 1s-resolution Shmem log and per-process GTT/DMA-BUF
snapshots if useful.
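
To put a number on the difference between the two rates, here is a rough Python sketch using only the figures quoted in this comment. It assumes the April growth was spread evenly over the 8 days, and it compares two different counters (DMA-BUF object size in April, `drm-total-gtt` today), so the result is indicative at best.

```python
import math

# April: ~10.7GB accumulated over 8 days of uptime, averaged per minute
april_gb_per_min = 10.7 / (8 * 24 * 60)

# Today: 113.8GB -> 145.2GB over 4m35s (275 seconds)
today_gb_per_min = (145.2 - 113.8) / (275 / 60)

ratio = today_gb_per_min / april_gb_per_min
orders_of_magnitude = math.log10(ratio)

print(f"April: {april_gb_per_min * 1024:.2f} MB/min")
print(f"Today: {today_gb_per_min:.2f} GB/min")
print(f"Ratio: about {ratio:.0f}x ({orders_of_magnitude:.1f} orders of magnitude)")
```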


** Tags added: dma-buf memory-leak

** Also affects: mutter
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2156739

Title:
  gnome-shell crashes with SIGABRT in dri_create_fence_fd()
  (mesa-libgallium 25.2.8) on Arrow Lake-U / Xe2 iGPU during
  cogl_onscreen_swap_buffers_with_damage
