Public bug reported:

I would be truly grateful to anyone who can help.

The system experiences intermittent, unrecoverable full-system freezes under
amdgpu on a Lenovo ThinkPad T16 Gen 4 with an AMD Ryzen AI 7 PRO 350 (Strix
Point) and Radeon 860M (DCN 3.5). The freeze cannot be reproduced on demand,
even under sustained heavy load; it occurs unpredictably during light to
moderate use (browsing, video playback). A hard power-button reset is required
each time. The keyboard becomes unresponsive during the freeze; the mouse
cursor sometimes stays active briefly before it also freezes or resets.

Initial investigation pointed to a MES ring buffer overflow as the primary
cause. Disabling MES (amdgpu.mes=0) eliminated the MES errors, but the freeze
persists, which now points to a separate DCN 3.5 / display scheduler or power
state transition bug. Runtime PM is reported as unavailable for this GPU.
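
For anyone trying to confirm whether MES or other amdgpu errors precede a
freeze, the following is a minimal sketch that filters the previous boot's
kernel log. It assumes a persistent systemd journal; the keyword list and
function names are illustrative assumptions, and the last messages before a
hard freeze may never have reached disk.

```python
import subprocess

# Illustrative keywords only; not an exhaustive list of relevant messages.
KEYWORDS = ("amdgpu", "drm")


def filter_gpu_lines(lines, keywords=KEYWORDS):
    """Return the lines that mention any keyword, case-insensitively."""
    lowered = tuple(k.lower() for k in keywords)
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def previous_boot_kernel_log():
    """Kernel messages from the previous boot, or [] if journalctl is missing."""
    try:
        out = subprocess.run(
            ["journalctl", "-k", "-b", "-1", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return []
    return out.stdout.splitlines()


if __name__ == "__main__":
    for line in filter_gpu_lines(previous_boot_kernel_log()):
        print(line)
```

Running this right after a forced reboot prints whatever GPU-related kernel
messages were persisted from the boot that froze.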

Device                  Lenovo ThinkPad T16 Gen 4
CPU                     AMD Ryzen AI 7 PRO 350 (Strix Point)
GPU                     AMD Radeon 860M (DCN 3.5, RDNA 3.5)
RAM                     32 GB + 8 GB swap
OS                      Zorin OS 18.1 Core
Kernel                  6.17.0-1017-oem
Desktop Environment     GNOME 46.0 on X11
Session Type            X11 (not Wayland)
Display Setup           Laptop screen (eDP) + 2x 2560x1440 via DisplayPort MST
                        through USB dock
Dock Connection         USB-C dock - Realtek RTL8153 Ethernet, Genesys Logic
                        USB hubs, MST display chain
Current Kernel Params   amdgpu.mes=0 amdgpu.sg_display=0
DMUB Firmware Version   0x09002C01
VCN Firmware            ENC: 1.24 DEC: 9 VEP: 0 Revision: 13
amdgpu DRM Version      3.64.0

    • Full system freeze - no input response from keyboard or mouse
    • Mouse cursor occasionally remains active briefly before freezing or
      resetting
    • Keyboard completely unresponsive - Ctrl+Alt+F2 (TTY) and Magic SysRq do
      not work
    • Freeze is intermittent - NOT reproducible under sustained heavy GPU load
    • Occurs during light/moderate use: web browsing, video playback,
      switching tabs
    • System was stable for 1-2 hours before freezing - pattern suggests a
      power state transition trigger
    • Hard power button reset required every time
    • No automatic GPU recovery occurs despite lockup timeout parameters being
      set
    • amdgpu Runtime PM reported as unavailable on this hardware
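
Because the TTY switch and Magic SysRq are unavailable during the freeze, one
way to test the power state transition theory is to record GPU power state
periodically and force each record to disk, so the last entry survives the
hard reset. The sketch below assumes the GPU is card0 (it may be card1 on some
systems); the log file name and interval are arbitrary choices, and the sysfs
attributes read here may be absent on a given kernel, in which case "n/a" is
logged.

```python
import os
import time
from pathlib import Path

# Assumption: the GPU is card0; adjust if it enumerates differently.
DEVICE = Path("/sys/class/drm/card0/device")
ATTRS = (
    "power/runtime_status",               # generic device runtime-PM state
    "power_dpm_force_performance_level",  # amdgpu DPM performance level
)


def read_attr(device, name):
    """Return the stripped attribute value, or 'n/a' if it cannot be read."""
    try:
        return (Path(device) / name).read_text().strip()
    except OSError:
        return "n/a"


def snapshot(device=DEVICE, attrs=ATTRS):
    """Build one log line: UNIX timestamp followed by name=value pairs."""
    fields = [f"{name}={read_attr(device, name)}" for name in attrs]
    return f"{time.time():.0f} " + " ".join(fields)


def append_durably(path, line):
    """Append a line and fsync it so it survives a hard power-off."""
    with open(path, "a") as fh:
        fh.write(line + "\n")
        fh.flush()
        os.fsync(fh.fileno())


def run(log_path="gpu_state.log", interval_s=5.0):
    """Log a snapshot every interval_s seconds until interrupted."""
    while True:
        append_durably(log_path, snapshot())
        time.sleep(interval_s)
```

Calling run() in a terminal before normal use leaves a timeline in
gpu_state.log; the final lines show the last recorded state before a freeze.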

amdgpu.mes=0                        Resolved MES ring buffer overflow. No MES
                                    errors on subsequent boots. Freeze
                                    persists.
amdgpu.sg_display=0                 Added as precaution for scatter-gather
                                    display issues on Strix Point. No
                                    observable effect on freeze.
amdgpu.gpu_recovery=1               Set previously. GPU reset does not
                                    trigger; system hard freezes before
                                    watchdog fires.
amdgpu.lockup_timeout=1000          Set previously. Timeout does not recover
                                    system.
Brave hardware accel                Disabled in browser settings
                                    (brave://settings/system). Did not
                                    prevent freeze.
X11 instead of Wayland              Already on X11. Wayland ruled out as
                                    cause.
power_dpm_force_performance_level   pp_power_profile_mode not available
                                    (likely due to mes=0). DPM tuning limited.
linux-firmware                      Confirmed up to date. Firmware loads
                                    cleanly with no errors.
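
Since some parameters above were only set previously, it can help to confirm
which amdgpu options the running kernel was actually booted with. A minimal
sketch that parses the kernel command line follows; the function names are
illustrative, and options set through modprobe configuration instead of the
command line would not show up here.

```python
def amdgpu_params(cmdline):
    """Map 'amdgpu.<name>=<value>' tokens on a kernel command line to a dict."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            key, value = token[len("amdgpu."):].split("=", 1)
            params[key] = value
    return params


def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read()
```

On the affected system, amdgpu_params(read_cmdline()) would be expected to
list only mes and sg_display, matching the current kernel parameters reported
above.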

Screen 0: current 5120 x 1440
eDP-1:       connected — 1920x1200 (laptop internal, currently inactive)
DisplayPort-7: connected — 2560x1440+2560+0 @ 59.95Hz (via MST dock)
DisplayPort-8: connected primary — 2560x1440+0+0 @ 59.95Hz (via MST dock)
Note: The two external displays are driven through an MST chain over 
DisplayPort via a USB-C dock. The laptop internal display (eDP-1) appears 
connected in xrandr but is not active (lid closed during docked use).
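
To correlate freezes with dock or MST connector changes, connector state can
also be read from sysfs without going through X11. This is a rough sketch: the
function name is illustrative, and kernel connector names under /sys/class/drm
may differ from the names xrandr prints.

```python
from pathlib import Path


def _read(path):
    """Return stripped file contents, or 'unknown' if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return "unknown"


def connector_states(drm_root="/sys/class/drm"):
    """Map each connector directory name to a (status, enabled) tuple."""
    states = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        connector = status_file.parent
        states[connector.name] = (
            _read(status_file),            # "connected" / "disconnected"
            _read(connector / "enabled"),  # "enabled" / "disabled"
        )
    return states
```

Logging the result of connector_states() at intervals would show whether an
external display dropped or re-enumerated shortly before a freeze.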

CPU (k10temp Tctl)      +50.4°C
GPU edge (amdgpu)       +48.0°C
GPU PPT                 7.02 W
NVMe Composite          +39.9°C
Fan 1 / Fan 2           0 RPM (auto mode - normal at this temp)
ACPI CPU                +50.0°C
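
The readings above are a single snapshot. To rule out a thermal cause more
firmly, temperatures could be sampled over time from the hwmon sysfs
interface. A minimal sketch, assuming the standard temp*_input files
(millidegrees Celsius); the function name and key format are illustrative.

```python
from pathlib import Path


def hwmon_temps(root="/sys/class/hwmon"):
    """Return {'<hwmonN>:<chip>/<sensor file>': degrees Celsius} for all chips."""
    readings = {}
    for hw in sorted(Path(root).glob("hwmon*")):
        try:
            chip = (hw / "name").read_text().strip()
        except OSError:
            continue  # no readable chip name; skip this entry
        for sensor in sorted(hw.glob("temp*_input")):
            try:
                # hwmon reports temperatures in millidegrees Celsius.
                readings[f"{hw.name}:{chip}/{sensor.name}"] = (
                    int(sensor.read_text()) / 1000.0
                )
            except (OSError, ValueError):
                continue
    return readings
```

Appending these readings to a log at intervals would show whether any sensor
climbs in the minutes before a freeze.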

** Affects: linux-oem-6.11 (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "amdgpu_full_report.txt"
   
https://bugs.launchpad.net/bugs/2148266/+attachment/5961039/+files/amdgpu_full_report.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2148266

Title:
  amdgpu System Freeze

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-oem-6.11/+bug/2148266/+subscriptions


