Public bug reported:

BugLink: https://bugs.launchpad.net/bugs/2156858

[ Impact ]
We found that setting power_dpm_force_performance_level to high and running a heavy workload on Strix Halo platforms causes the system to reboot. This happens because the throttling limits in amdgpu are not properly aligned with the power management firmware.

[ Fix ]
Cherry-pick the following commit:
- 03b70e0d8aa26bab (drm/amd/pm: smu_v14_0_0: use SoftMin for gfxclk in set_soft_freq_limited_range)
which has been accepted since mainline v7.1

[ Test ]
1. Boot the kernel.
2. Set power_dpm_force_performance_level to high and run the same heavy compute workload on a Strix Halo system; it should not reboot.

[ Where problems could occur ]
If the driver throttles too aggressively, this may impact performance, but that is safer than letting Strix Halo burn.

[ Additional Information ]
https://github.com/torvalds/linux/commit/03b70e0d8aa26bab89a0f1394c1c80a871925e42

** Affects: linux-oem-6.17 (Ubuntu)
     Importance: Undecided
     Assignee: Leo Lin (0xff07)
         Status: New

** Summary changed:

- [SRU] Setting performance level to high on Strix Halo leads to system reboot
+ [SRU] Fix GPU throttling on Strix Halo

** Description changed:

+ BugLink: https://bugs.launchpad.net/bugs/2156858

** Changed in: linux-oem-6.17 (Ubuntu)
     Assignee: (unassigned) => Leo Lin (0xff07)
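
For anyone reproducing the [ Test ] steps, the following is a minimal sketch of how the performance level might be forced before starting the workload. It assumes the GPU is exposed as card0 (the index can differ between systems) and that the script runs as root; the helper name force_performance_level is hypothetical and not part of the fix.

```python
from pathlib import Path

# Assumption: the Strix Halo GPU is card0. Check /sys/class/drm/ on the
# target machine, since the card index is not guaranteed.
DEFAULT_NODE = Path(
    "/sys/class/drm/card0/device/power_dpm_force_performance_level"
)


def force_performance_level(level: str, node: Path = DEFAULT_NODE) -> str:
    """Write `level` to the sysfs node and return the value read back.

    Hypothetical helper for illustration only. Writing to the real node
    requires root privileges.
    """
    node.write_text(level + "\n")
    # Read the node back so the caller can confirm the requested level.
    return node.read_text().strip()


# Example (as root, on the affected system):
#   force_performance_level("high")
# then start the heavy compute workload and watch for a reboot.
```

The node path is passed as a parameter so the helper can be pointed at a different card index, or at a scratch file when trying it on a machine without the hardware.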
