Public bug reported:

# Hard system freeze on ThinkPad X1 Carbon Gen 13 with Intel Lunar Lake xe driver

## Summary

Complete hard system freeze (mouse cursor frozen, no SysRq, requires
power button reset) occurring every 10–50 hours of active use on a
Lenovo ThinkPad X1 Carbon Gen 13 with Intel Lunar Lake (Core Ultra 7
258V) and the xe GPU driver. Affects kernels 6.17.0-1017-oem,
6.17.0-1020-oem, and likely 6.17.0-1023-oem. No kernel oops, panic, or
drm/xe error messages are logged before the freeze — the system locks up
silently.

## Hardware

- **Machine**: Lenovo ThinkPad X1 Carbon Gen 13 (21NS000YGE)
- **CPU**: Intel Core Ultra 7 258V (family:6, model:189, stepping:1)
- **GPU**: Intel Lunar Lake [Intel Graphics] [8086:64a0] rev 04
- **GPU Driver**: xe (kernel module)
- **Display Server**: Wayland (GNOME Shell 46.0 / GDM)

## Software

- **OS**: Ubuntu 24.04.4 LTS (Noble Numbat)
- **Kernel**: 6.17.0-1020-oem (6.17.0-1020.20), also occurred on 
6.17.0-1017-oem (6.17.0-1017.17)
- **linux-firmware**: 20240318.git3b128b60-0ubuntu2.26
- **Mesa**: 25.2.8-0ubuntu0.24.04.1
- **GuC firmware**: xe/lnl_guc_70.bin v70.44.1 (kernel recommends v70.45.2 but 
it is not shipped in the linux-firmware package)
- **DMC firmware**: i915/xe2lpd_dmc.bin v2.18
- **HuC firmware**: xe/lnl_huc.bin v9.4.13
- **GSC firmware**: xe/lnl_gsc_1.bin v104.0.0.1161

## Kernel boot parameters

```
quiet splash audit=0 nmi_watchdog=1 xe.enable_psr=0 xe.enable_dc=0 xe.wedged_mode=2 vt.handoff=1
```

Note: `xe.enable_psr=0`, `xe.enable_dc=0`, and `xe.wedged_mode=2` were
added as mitigation attempts — **none of them prevent the freeze**.
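
To confirm that the mitigation parameters really are on the running kernel's command line, the string can be split into name/value pairs. This is a minimal sketch: the helper name `parse_cmdline` is made up for illustration, and the sample string is the boot line quoted above.

```python
import shlex
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Boot parameters as quoted in this report.
CMDLINE = (
    "quiet splash audit=0 nmi_watchdog=1 xe.enable_psr=0 "
    "xe.enable_dc=0 xe.wedged_mode=2 vt.handoff=1"
)

params = parse_cmdline(CMDLINE)
# Keep only the options addressed to the xe module.
xe_params = {k: v for k, v in params.items() if k.startswith("xe.")}
print(xe_params)
```

On a live system the same function could be fed the contents of `/proc/cmdline` instead of the hard-coded sample.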

## Crash Pattern

System freezes during active desktop use (never during idle/sleep). The
entire system becomes completely unresponsive — mouse cursor frozen, no
VT switching, no SysRq response. Only a power button reset recovers the
system. Temperature at time of freeze is always well below thermal
limits (38–50°C).

### Boot history with durations (from journalctl --list-boots):

```
Boot -14: 273h33m  (2026-04-07 → 2026-04-18)  6.17.0-1017-oem  (includes suspend time)
Boot -13:  49h55m  (2026-04-19 → 2026-04-21)  6.17.0-1017-oem  CRASH
Boot -12:  37h07m  (2026-04-21 → 2026-04-23)  6.17.0-1017-oem  CRASH
Boot -11:  84h57m  (2026-04-23 → 2026-04-26)  6.17.0-1017-oem  CRASH
Boot -10:  14h34m  (2026-04-26 → 2026-04-27)  6.17.0-1017-oem  CRASH
Boot  -9:  48h01m  (2026-04-27 → 2026-04-29)  6.17.0-1017-oem  CRASH
Boot  -8:   9h14m  (2026-04-29 → 2026-04-29)  6.17.0-1017-oem  CRASH
Boot  -7: 191h08m  (2026-04-29 → 2026-05-07)  6.17.0-1017-oem  (includes suspend time)
Boot  -6:  43h27m  (2026-05-07 → 2026-05-09)  6.17.0-1020-oem  CRASH
Boot  -5:  21h42m  (2026-05-09 → 2026-05-10)  6.17.0-1020-oem  CRASH
Boot  -4:  46h20m  (2026-05-10 → 2026-05-12)  6.17.0-1020-oem  CRASH
Boot  -3:  24h50m  (2026-05-12 → 2026-05-13)  6.17.0-1020-oem  CRASH
Boot  -2:  48h55m  (2026-05-13 → 2026-05-15)  6.17.0-1020-oem  clean shutdown
Boot  -1:  10h57m  (2026-05-16 → 2026-05-16)  6.17.0-1020-oem  CRASH (freeze at 20:25:41, temps 38-45°C)
```

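Time between freezes: approximately 10–50 hours of active use. Wall-clock
uptime of the 11 boots marked CRASH ranges from 9h14m to 84h57m, with a
mean of about 35.6 hours.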
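
The spread of uptimes can be checked directly from the boot table. The sketch below hard-codes the durations of the boots marked CRASH; these are wall-clock uptimes as listed by `journalctl --list-boots`, not measured hours of active use.

```python
from statistics import mean, median

# Uptime of each boot marked CRASH in the table above, as (hours, minutes).
CRASH_UPTIMES = [
    (49, 55), (37, 7), (84, 57), (14, 34), (48, 1), (9, 14),  # 6.17.0-1017-oem
    (43, 27), (21, 42), (46, 20), (24, 50), (10, 57),         # 6.17.0-1020-oem
]

# Convert to fractional hours so the values can be averaged.
hours = [h + m / 60 for h, m in CRASH_UPTIMES]

print(f"crashed boots: {len(hours)}")
print(f"shortest:      {min(hours):.1f} h")
print(f"longest:       {max(hours):.1f} h")
print(f"mean:          {mean(hours):.1f} h")
print(f"median:        {median(hours):.1f} h")
```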

## Key Diagnostic Observations

### 1. No kernel/drm/xe errors before freeze

The journal shows **zero** xe, drm, or GPU-related error messages before
any crash. The last kernel message in boot -1 was a benign `perf:
interrupt took too long` at 18:47:03 — almost 2 hours before the freeze
at 20:25:41. The freeze is completely silent from the kernel's
perspective.

### 2. GuC firmware version mismatch

At boot the kernel reports:
```
xe 0000:00:02.0: [drm] GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.44.1
xe 0000:00:02.0: [drm] GuC firmware (70.45.2) is recommended, but only (70.44.1) was found in xe/lnl_guc_70.bin
xe 0000:00:02.0: [drm] Consider updating your linux-firmware pkg
```

The installed linux-firmware package (20240318.git3b128b60-0ubuntu2.26)
ships GuC 70.44.1 but the kernel expects 70.45.2. This mismatch may
contribute to instability.
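
A small script can flag this mismatch automatically, for example after a linux-firmware update. The regular expression below is modelled only on the message quoted above, so it is an assumption that other kernel versions word the message the same way; `find_guc_mismatch` and `version_tuple` are illustrative helper names.

```python
import re

# Pattern modelled on the message quoted in this report; other kernels may
# word it differently.
MISMATCH_RE = re.compile(
    r"GuC firmware \((?P<want>[\d.]+)\) is recommended, but only\s+"
    r"\((?P<have>[\d.]+)\) was found"
)


def version_tuple(text):
    """Turn '70.45.2' into (70, 45, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_guc_mismatch(log_text):
    """Return (recommended, found) version tuples, or None if nothing matches."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return version_tuple(match["want"]), version_tuple(match["have"])


SAMPLE = (
    "xe 0000:00:02.0: [drm] GuC firmware (70.45.2) is recommended, but only "
    "(70.44.1) was found in xe/lnl_guc_70.bin"
)

result = find_guc_mismatch(SAMPLE)
print(result)
```

In practice the input would be the kernel log of the current boot rather than the hard-coded sample line.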

### 3. Compressed framebuffer warning

At boot:
```
xe 0000:00:02.0: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
```

### 4. Mitigations that do NOT help

- `xe.enable_psr=0` — disabling Panel Self Refresh: no effect
- `xe.enable_dc=0` — disabling display C-states: no effect
- `xe.wedged_mode=2` — setting GPU wedge recovery mode: no effect
- `nmi_watchdog=1` — NMI watchdog enabled but never fires (the NMI watchdog
targets CPU hard lockups, so this suggests a bus/MMIO hang that blocks NMI
delivery rather than a CPU lockup the watchdog can detect)

### 5. Temperature is not a factor

Thinkfan logs from moments before the boot -1 freeze:
```
20:24:11 Temperatures(bias): 46(13), 40(0) -> Fans: level 3
20:24:16 Temperatures(bias): 39(0), 43(4) -> Fans: level 2
20:25:14 Temperatures(bias): 45(13), 39(0) -> Fans: level 3
20:25:17 Temperatures(bias): 38(0), 39(0) -> Fans: level 2
```
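
The readings can be pulled out of these lines to confirm the range. This sketch assumes the line format shown above and takes the number in front of each parenthesis as the temperature in °C; it does not try to interpret the bias value. `parse_thinkfan` is an illustrative helper name.

```python
import re

# One thinkfan line: "<time> Temperatures(bias): T(b), T(b) -> Fans: level N"
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) Temperatures\(bias\): "
    r"(?P<temps>.*?) -> Fans: level (?P<level>\d+)$"
)
READING_RE = re.compile(r"(\d+)\((\d+)\)")


def parse_thinkfan(lines):
    """Return a list of (time, [temperatures], fan_level) tuples."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        temps = [int(temp) for temp, _bias in READING_RE.findall(match["temps"])]
        rows.append((match["time"], temps, int(match["level"])))
    return rows


LOG = [
    "20:24:11 Temperatures(bias): 46(13), 40(0) -> Fans: level 3",
    "20:24:16 Temperatures(bias): 39(0), 43(4) -> Fans: level 2",
    "20:25:14 Temperatures(bias): 45(13), 39(0) -> Fans: level 3",
    "20:25:17 Temperatures(bias): 38(0), 39(0) -> Fans: level 2",
]

rows = parse_thinkfan(LOG)
all_temps = [temp for _time, temps, _level in rows for temp in temps]
print(f"{len(rows)} samples, {min(all_temps)}-{max(all_temps)} °C")
```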

## xe driver init log (boot -1)

```
xe 0000:00:02.0: vgaarb: deactivate vga console
xe 0000:00:02.0: [drm] Found lunarlake (device ID 64a0) integrated display version 20.00 stepping B0
xe 0000:00:02.0: [drm] Finished loading DMC firmware i915/xe2lpd_dmc.bin (v2.18)
xe 0000:00:02.0: [drm] GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.44.1
xe 0000:00:02.0: [drm] GuC firmware (70.45.2) is recommended, but only (70.44.1) was found
xe 0000:00:02.0: [drm] GT0: ccs1 fused off
xe 0000:00:02.0: [drm] GT0: ccs2 fused off
xe 0000:00:02.0: [drm] GT0: ccs3 fused off
xe 0000:00:02.0: [drm] GT1: Using GuC firmware from xe/lnl_guc_70.bin version 70.44.1
xe 0000:00:02.0: [drm] GuC firmware (70.45.2) is recommended, but only (70.44.1) was found
xe 0000:00:02.0: [drm] GT1: Using HuC firmware from xe/lnl_huc.bin version 9.4.13
xe 0000:00:02.0: [drm] GT1: Using GSC firmware from xe/lnl_gsc_1.bin version 104.0.0.1161
xe 0000:00:02.0: [drm] GT1: vcs1-vcs7 fused off
xe 0000:00:02.0: [drm] GT1: vecs1-vecs3 fused off
xe 0000:00:02.0: [drm] Registered 3 planes with drm panic
xe 0000:00:02.0: [drm] fb0: xedrmfb frame buffer device
xe 0000:00:02.0: [drm] GT1: found GSC cv104.1.0
xe 0000:00:02.0: [drm] Reducing the compressed framebuffer size.
```

## lspci output

```
00:02.0 VGA compatible controller [0300]: Intel Corporation Lunar Lake [Intel Graphics] [8086:64a0] (rev 04)
        Subsystem: Lenovo Device [17aa:2339]
        Flags: bus master, fast devsel, latency 0, IRQ 182
        Memory at 204e000000 (64-bit, prefetchable) [size=16M]
        Memory at 2000000000 (64-bit, prefetchable) [size=256M]
        Kernel driver in use: xe
```

## Steps to Reproduce

1. Install Ubuntu 24.04.4 LTS on Lenovo ThinkPad X1 Carbon Gen 13 (Lunar Lake)
2. Boot with kernel 6.17.0-1017-oem or 6.17.0-1020-oem
3. Use the system normally (browser, IDE, desktop apps) under Wayland/GNOME
4. After 10–50 hours of active use, the system will hard-freeze

## Expected behavior

The system should not freeze during normal desktop use.

## Actual behavior

Complete hard freeze requiring power button reset. No kernel messages,
oops, or panics logged. The freeze appears to be at the hardware/bus
level, preventing even NMI watchdog from triggering.

## Potentially related upstream fixes (not yet in 6.17-oem)

The upstream kernel (7.x) has received multiple xe driver fixes in
April/May 2026 that may be relevant:

- "drm/xe/dma-buf: fix UAF with retry loop" (May 11, 2026)
- "drm/xe/dma-buf: handle empty bo and UAF races" (May 11, 2026)
- "drm/xe: Fix potential NULL deref in 
xe_exec_queue_tlb_inval_last_fence_put_unlocked" (Apr 29, 2026)
- "drm/xe: Fix dma-buf attachment leak in xe_gem_prime_import()" (Apr 29, 2026)
- "drm/xe: Fix bo leak in xe_dma_buf_init_obj() on allocation failure" (Apr 29, 
2026)
- Multiple BO leak and TLB fence fixes (Apr 29, 2026)

These fixes address use-after-free conditions and memory leaks in the xe
driver that could potentially cause the kind of silent hard lockup
observed here.

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: linux-oem-6.17 (not installed)
ProcVersionSignature: Ubuntu 6.17.0-1020.20-oem 6.17.13
Uname: Linux 6.17.0-1020-oem x86_64
NonfreeKernelModules: zfs
ApportVersion: 2.28.1-0ubuntu3.8
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC0:  sebastian   4609 F.... pipewire
                      sebastian   4614 F.... wireplumber
 /dev/snd/seq:        sebastian   4609 F.... pipewire
CRDA: N/A
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Sat May 16 20:46:18 2026
InstallationDate: Installed on 2025-09-03 (255 days ago)
InstallationMedia: Ubuntu 24.04.3 LTS "Noble Numbat" - Release amd64 
(20250805.1)
MachineType: LENOVO 21NS000YGE
ProcFB: 0 xedrmfb
ProcKernelCmdLine: BOOT_IMAGE=/BOOT/ubuntu_znejf6@/vmlinuz-6.17.0-1020-oem 
root=ZFS=rpool/ROOT/ubuntu_znejf6 ro quiet splash audit=0 nmi_watchdog=1 
xe.enable_psr=0 xe.enable_dc=0 xe.wedged_mode=2 vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-6.17.0-1020-oem N/A
 linux-backports-modules-6.17.0-1020-oem  N/A
 linux-firmware                           20240318.git3b128b60-0ubuntu2.27
SourcePackage: linux-oem-6.17
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/19/2024
dmi.bios.release: 1.9
dmi.bios.vendor: LENOVO
dmi.bios.version: N4BET37W (1.09 )
dmi.board.asset.tag: Not Available
dmi.board.name: 21NS000YGE
dmi.board.vendor: LENOVO
dmi.board.version: SDK0T76576 WIN
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.13
dmi.modalias: 
dmi:bvnLENOVO:bvrN4BET37W(1.09):bd12/19/2024:br1.9:efr1.13:svnLENOVO:pn21NS000YGE:pvrThinkPadX1CarbonGen13:rvnLENOVO:rn21NS000YGE:rvrSDK0T76576WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_21NS_BU_Think_FM_ThinkPadX1CarbonGen13:
dmi.product.family: ThinkPad X1 Carbon Gen 13
dmi.product.name: 21NS000YGE
dmi.product.sku: LENOVO_MT_21NS_BU_Think_FM_ThinkPad X1 Carbon Gen 13
dmi.product.version: ThinkPad X1 Carbon Gen 13
dmi.sys.vendor: LENOVO

** Affects: linux-oem-6.17 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug noble wayland-session

https://bugs.launchpad.net/bugs/2152803
