I had the same super annoying issue.

Hardware:
- 21HF0021US (ThinkPad P14s Gen 4)
- BIOS N3QET51W (1.51)

When I got the machine, I wiped it and installed Ubuntu 24.04 LTS, then later upgraded to Ubuntu 25.10. After that, I started noticing that the machine was rebooting randomly. There were no logs or kernel messages at all. It looked exactly like someone pulled the power plug from the device.

At first, I thought it might be a kernel issue. So I tried Fedora 43, and surprisingly the resets dropped significantly, maybe once every two weeks. Then I switched to Arch Linux (rolling). I expected the latest kernel would fix the issue, but sadly the resets came back with a vengeance, happening more than 6 times a day.

The battery is healthy (95%), and I am using a 100W charger (I upgraded from 65W hoping it would help), but the resets still happened.

Then I checked the throttling status and found that the CPU was getting throttled. I replaced the thermal paste with PTM7950, and the throttling issue is now completely gone. I can stress the CPU for more than 30 minutes with no throttle events.

The weird part is that the resets never happen during heavy compiling or other CPU-intensive tasks. They usually happen while watching YouTube. I thought it might be the browser. I switched from Firefox to Zen (same engine), then tried Chrome, but the resets still happened.

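The post does not say how the throttling status was checked. One possible way, sketched here under the assumption of an Intel CPU on which the kernel exposes `thermal_throttle/core_throttle_count` under `/sys/devices/system/cpu/cpu*/`, is to sum those per-core counters before and after a stress run. The function name `throttle_total` is made up for this example.

```shell
#!/bin/sh
# Sum the per-core thermal throttle counters exposed in sysfs.
# Assumption: the counters live at
#   <base>/cpu*/thermal_throttle/core_throttle_count
# The base directory is a parameter so another location can be used.
throttle_total() {
    base="${1:-/sys/devices/system/cpu}"
    total=0
    for f in "$base"/cpu*/thermal_throttle/core_throttle_count; do
        # Skip silently if the file is missing (unmatched glob or other CPU).
        [ -r "$f" ] || continue
        count=$(cat "$f")
        total=$((total + count))
    done
    echo "$total"
}

echo "core throttle events since boot: $(throttle_total)"
```

If the number does not grow during a long stress run, the CPU was not thermally throttled in that period.
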
Then I tried the LTS kernel (6.12.73-1-lts) and noticed these logs:

```
10:14:37 PM kernel: i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: CT: Failed to process CT message (-ENOKEY) 01 00 36 9d 00 01 00 e0
10:14:37 PM kernel: i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: CT: Failed to process CT message (-ENOKEY) 01 00 36 9d 00 01 00 e0
10:14:37 PM kernel: i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: CT: Failed to handle HXG message (-ENOKEY) 00 01 00 e0
10:14:37 PM kernel: i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: CT: Unsolicited response message: len 1, data 0xe0000100 (fence 40246, last 40249)
```

I had never seen these before because the latest/zen kernels (6.18.9) would just reboot instantly with no time to write logs to disk. The LTS kernel (6.12.73) luckily left the logs without crashing.

After searching the internet and asking an AI, it appears to be a synchronization failure between the i915 driver and the Intel GuC firmware (a CT protocol mismatch or lost state).

After that, I tried disabling GuC submission by adding `i915.enable_guc=2` (HuC firmware loading only, no GuC submission) to my systemd-boot options for the latest kernel. However, the reboots still happened.

Then I found that PSR power saving can trigger the same race condition, so I also added `i915.enable_psr=0`. After rebooting, I had zero resets for a while using the latest kernel.

However, after about a week of use, new releases of `linux-firmware` and the LTS kernel were installed (`20260221-1` and `6.18.16` respectively), and the reboots came back again.

Since all the issues seemed related to the `i915` driver, I thought that maybe switching to the `xe` driver would help. So I blocked `i915` and forced `xe` to take the device by adding:

`i915.force_probe=!7d55 xe.force_probe=0xa7a0`

And it worked. No resets for more than a week now.

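To confirm that the kernel parameters took effect and that `xe` rather than `i915` owns the GPU, something like the following could be used. This is only a sketch: the function names are invented for illustration, and the PCI address `0000:00:02.0` is taken from the log lines above and may differ on other machines.

```shell
#!/bin/sh
# Print the i915/xe options found on the kernel command line, one per line.
# Reads /proc/cmdline by default; another file can be passed instead.
gpu_cmdline_opts() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -E '^(i915|xe)\.' || true
}

# Print the name of the kernel driver bound to a PCI device, or "none".
# $1: PCI address (default taken from the logs), $2: sysfs devices directory.
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/${1:-0000:00:02.0}/driver"
    [ -e "$link" ] || { echo "none"; return 0; }
    basename "$(readlink -f "$link")"
}

# Count GuC CT error lines in kernel log text read from stdin.
count_guc_ct_errors() {
    grep -c 'GUC: CT:' || true
}

gpu_cmdline_opts
echo "driver bound to GPU: $(bound_driver)"
```

On a systemd machine with a persistent journal, `journalctl -k -b -1 | count_guc_ct_errors` would count the GuC CT errors logged during the previous boot, assuming they reached the disk before the reset.
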
After that, I completely blocked `i915` by adding it to `/etc/modprobe.d/blacklist.conf`:

```
blacklist nouveau
blacklist i915
install i915 /bin/false
```

Then I regenerated the initramfs with `sudo mkinitcpio -P` and rebooted.

I hope this helps anyone experiencing the same issue.

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2084190

Title:
  Ubuntu 24.04 crashes occasionally on Thinkpad P14s

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/2084190/+subscriptions

--
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
