Hi,

On Wed, Jun 18, 2025 at 06:26:53PM -0500, luis wrote:
> Package: src:linux
> Version: 6.12.30-1~bpo12+1
> Severity: important
> File: 6.12.30+bpo-amd64
> X-Debbugs-Cc: g76joe...@mozmail.com
> 
> Dear Maintainer,
> 
> ### What led up to the situation?
> - Upgraded kernel from `6.12.27+bpo-amd64` to `6.12.30+bpo-amd64` (Debian 12
> "Bookworm" backports).
> 
> ### What exactly happened?
> - After updating the kernel, the system experiences **random and complete
> freezes**.
> - When a freeze occurs, the screen eventually turns black, and there is **no
> access to TTYs** or any other system response.
> - A hard reboot is required to regain control.
> - The issue **does not occur** when booting with the previous kernel version,
> `6.12.27+bpo-amd64`.
> 
> ### Expected outcome:
> - A stable system without freezes, similar to the behavior with kernel
> `6.12.27+bpo-amd64`.
> 
> ### Hardware:
> - Motherboard: ASUS B350M-A bios v6232
> - CPU: AMD Ryzen 5 3600
> - GPU: AMD Radeon RX 6600
> - RAM: 16GB
> - Disk: KINGSTON SA400M8480G
> 
> ### Error logs (journalctl -b -1 -p 0..3):
> jun 18 11:14:41 r5-3600 kernel: ldm_parse_tocblock(): Cannot find TOCBLOCK,
> database may be corrupt.
> jun 18 11:14:41 r5-3600 kernel: ldm_parse_tocblock(): Cannot find TOCBLOCK,
> database may be corrupt.
> jun 18 11:14:41 r5-3600 kernel: usb 5-3: unable to read config index 0
> descriptor/all
> jun 18 11:14:41 r5-3600 kernel: usb 5-3: can't read configurations, error -110
> jun 18 11:14:41 r5-3600 kernel: hid-generic 0003:0D8C:0005.0001: No inputs
> registered, leaving
> jun 18 11:14:42 r5-3600 bluetoothd[933]: src/plugin.c:plugin_init() Failed to
> init vcp plugin
> jun 18 11:14:42 r5-3600 bluetoothd[933]: src/plugin.c:plugin_init() Failed to
> init mcp plugin
> jun 18 11:14:42 r5-3600 bluetoothd[933]: src/plugin.c:plugin_init() Failed to
> init bap plugin
> jun 18 11:14:42 r5-3600 bluetoothd[933]:
> profiles/sap/server.c:sap_server_register() Sap driver initialization failed.
> jun 18 11:14:42 r5-3600 bluetoothd[933]: sap-server: Operation not permitted
> (1)
> jun 18 11:14:43 r5-3600 bluetoothd[933]: Failed to set mode: Failed (0x03)
> jun 18 11:14:44 r5-3600 smartd[939]: Device: /dev/sda [SAT], 8 Currently
> unreadable (pending) sectors
> jun 18 11:14:50 r5-3600 sddm-helper[1548]: gkr-pam: unable to locate daemon
> control file
> jun 18 11:14:50 r5-3600 systemd[1565]: Failed to start fluidsynth.service -
> FluidSynth Daemon.
> jun 18 11:14:54 r5-3600 bluetoothd[933]: Failed to set mode: Failed (0x03)
> jun 18 11:22:22 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:22 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:26 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:31 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:31 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:36 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:40 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:40 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:45 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:50 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:50 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:50 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:50 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:50 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:54 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:59 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:22:59 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:22:59 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.0.0
> timeout, signaled seq=113209, emitted seq=113211
> jun 18 11:22:59 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Process
> information: process plasmashell pid 1813 thread plasmashel:cs0 pid 1828
> jun 18 11:23:04 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:23:08 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:23:08 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> gfxoff!
> jun 18 11:23:09 r5-3600 kernel: amdgpu 0000:0a:00.0: [drm] *ERROR*
> dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
> jun 18 11:23:09 r5-3600 kernel: amdgpu 0000:0a:00.0: [drm] *ERROR*
> dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
> jun 18 11:23:12 r5-3600 kernel: amdgpu 0000:0a:00.0: [drm] *ERROR*
> dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
> jun 18 11:23:21 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:23:21 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Failed to disable
> smu features.
> jun 18 11:23:27 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: SMU: I'm not done
> with your previous command: SMN_C2PMSG_66:0x00000029 SMN_C2PMSG_82:0x00000000
> jun 18 11:23:27 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: GPU mode1 reset
> failed
> jun 18 11:23:27 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: ASIC reset failed
> with error, -62 for drm dev, 0000:0a:00.0
> jun 18 11:23:27 r5-3600 kernel: snd_hda_intel 0000:0a:00.1: CORB reset
> timeout#2, CORBRP = 65535
> jun 18 11:23:27 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: GPU Recovery
> Failed: -62
> jun 18 11:23:37 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.1.0
> timeout, signaled seq=25194, emitted seq=25196
> jun 18 11:23:37 r5-3600 kernel: amdgpu 0000:0a:00.0: amdgpu: Process
> information: process kwin_wayland pid 1669 thread kwin_wayla:cs0 pid 1685
> jun 18 11:24:36 r5-3600 kernel: INFO: task kworker/u49:12:333 blocked for more
> than 120 seconds.
> jun 18 11:24:36 r5-3600 kernel:       Tainted: G           OE
> 6.12.30+bpo-amd64 #1 Debian 6.12.30-1~bpo12+1
> jun 18 11:24:36 r5-3600 kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> jun 18 11:24:36 r5-3600 kernel: INFO: task kworker/u52:4:1552 blocked for more
> than 120 seconds.
> jun 18 11:24:36 r5-3600 kernel:       Tainted: G           OE
> 6.12.30+bpo-amd64 #1 Debian 6.12.30-1~bpo12+1
> jun 18 11:24:36 r5-3600 kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> jun 18 11:24:36 r5-3600 kernel: INFO: task kworker/u52:9:1557 blocked for more
> than 120 seconds.
> jun 18 11:24:36 r5-3600 kernel:       Tainted: G           OE
> 6.12.30+bpo-amd64 #1 Debian 6.12.30-1~bpo12+1
> jun 18 11:24:36 r5-3600 kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> jun 18 11:24:36 r5-3600 kernel: INFO: task kworker/u52:15:1563 blocked for more
> than 120 seconds.
> jun 18 11:24:36 r5-3600 kernel:       Tainted: G           OE
> 6.12.30+bpo-amd64 #1 Debian 6.12.30-1~bpo12+1
> jun 18 11:24:36 r5-3600 kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> 
> 
> ### Error Analysis
> The logs consistently show critical errors related to the `amdgpu` driver:
> - Repeated `amdgpu: SMU: I'm not done with your previous command` errors,
> indicating issues with the System Management Unit.
> - Multiple `amdgpu: Failed to disable gfxoff!` messages, suggesting power
> management state failures.
> - GPU timeouts (`amdgpu: ring gfx_0.0.0 timeout` and `gfx_0.1.0 timeout`)
> pointing to the graphics command processor becoming unresponsive.
> - ASIC reset failures (`amdgpu: GPU mode1 reset failed`, `amdgpu: ASIC reset
> failed with error, -62`), which are critical hardware-level failures.
> - DMCUB errors (`[drm] *ERROR* dc_dmub_srv_log_diagnostic_data`), indicating
> issues with the Display Core Microcontroller Unit.
> - `INFO: task kworker/uXX:X:XXXX blocked for more than 120 seconds` messages,
> which are a symptom of the system being hung due to the GPU issues.

This looks like a duplicate of #1106268, so I am merging both reports.
A bpo version of 6.12.32 will be available soonish.
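
In case it helps to compare a boot on 6.12.27 with one on 6.12.30, here is a
minimal sketch that tallies the amdgpu failure signatures listed in the report
from saved `journalctl -b -1 -p 0..3` output. The labels and the function name
are made up for illustration, the patterns are just substrings taken from the
log excerpt quoted above, and it assumes one journal entry per line (that is,
without the line wrapping this mail introduces).

```python
import re
from collections import Counter

# Hypothetical labels; each pattern is a substring copied from the quoted log.
SIGNATURES = {
    "smu_busy": re.compile(r"SMU: I'm not done with your previous command"),
    "gfxoff": re.compile(r"Failed to disable gfxoff"),
    "ring_timeout": re.compile(r"ring gfx_\S+ timeout"),
    "reset_failed": re.compile(r"GPU mode1 reset failed|ASIC reset failed"),
    "dmcub": re.compile(r"DMCUB error"),
    "hung_task": re.compile(r"blocked for more than \d+ seconds"),
}


def tally_signatures(log_text):
    """Count how many log lines match each failure signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Reading the saved log into a string and passing it to `tally_signatures()`
gives a per-signature count; a boot that never froze would be expected to show
zero for all of them, but that is an assumption and not something verified here.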

Regards,
Salvatore
