Public bug reported:

[SRU Justification]

[Impact]

When running PyTorch with models whose compute jobs take a long time,
hangs can be observed on Strix Halo.

The hangs occur in the MES scheduler. A workaround has been developed
for this situation in the MES scheduler and in the GPU driver.

[Fix]
The MES scheduler change is in MES 0x7f for GFX11 products and 0x82 for
GFX12 products.

The kernel change is:
https://lore.kernel.org/amd-gfx/[email protected]/

The kernel change will only be enabled if new enough MES scheduler
microcode is installed.
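
The version gate described above can be sketched roughly as follows. This
is only an illustration in Python, not the kernel code: the helper names
(parse_mes_version, workaround_enabled) are hypothetical, the 0xfff mask
and the "MES feature version: ..." line format of the amdgpu_firmware_info
debugfs file are assumptions, and ">=" is one reading of "new enough".

```python
import re

# Minimum MES scheduler firmware versions named in this report,
# keyed by GFX major generation.
MIN_MES_VERSION = {11: 0x7F, 12: 0x82}

# Assumption: only the low 12 bits of the reported value carry the version.
MES_VERSION_MASK = 0xFFF

# Assumption: the scheduler firmware is reported on a line of this shape.
# Anchoring on "MES feature" skips the separate "MES_KIQ feature" line.
_MES_LINE = re.compile(
    r"^MES feature version: \d+, firmware version: (0x[0-9a-fA-F]+)",
    re.MULTILINE,
)


def parse_mes_version(firmware_info):
    """Return the MES scheduler firmware version found in the text, or None."""
    match = _MES_LINE.search(firmware_info)
    if match is None:
        return None
    return int(match.group(1), 16) & MES_VERSION_MASK


def workaround_enabled(gfx_major, mes_version):
    """Return True if the firmware is new enough for the given GFX generation."""
    minimum = MIN_MES_VERSION.get(gfx_major)
    # Generations other than GFX11 and GFX12 never get the workaround.
    return minimum is not None and mes_version >= minimum
```

On a test machine, the text passed to parse_mes_version would typically be
read as root from /sys/kernel/debug/dri/0/amdgpu_firmware_info (the DRI
index may differ).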

[Test Case]
Run PyTorch and ensure that the system doesn't hang.
Run some games in Steam and ensure that the system doesn't hang.
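
For the PyTorch step, a generic long-running GPU load could look like the
sketch below. It is not the reporter's reproducer and is not guaranteed to
trigger the hang; the function name run_matmul_stress and its default
duration, matrix size and batch count are arbitrary choices made for
illustration.

```python
import time


def run_matmul_stress(duration_s=600.0, size=4096, batch=50):
    """Keep the GPU busy with matrix multiplies for roughly duration_s seconds.

    Returns the number of completed batches.
    """
    # Imported lazily so this file can be loaded on machines without PyTorch.
    import torch

    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
    if not torch.cuda.is_available():
        raise RuntimeError("PyTorch does not see a GPU")

    a = torch.randn(size, size, device="cuda")
    b = torch.randn(size, size, device="cuda")

    start = time.monotonic()
    batches = 0
    while time.monotonic() - start < duration_s:
        # Queue several multiplies before synchronizing so the GPU has a
        # longer stretch of uninterrupted work.
        for _ in range(batch):
            torch.matmul(a, b)
        torch.cuda.synchronize()
        batches += 1
        elapsed = time.monotonic() - start
        print(f"batch {batches} done after {elapsed:.1f}s", flush=True)
    return batches


if __name__ == "__main__":
    run_matmul_stress()
```

If the progress lines stop appearing and the desktop freezes, the system
has likely hung; a run that reaches the end of the duration counts as a
pass for this step.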

[Where problems can go wrong]
The workaround applies to all jobs sent to the MES scheduler. It is limited
to GFX11 and GFX12 machines.

If this change caused a problem, it could manifest as a system hang.

** Affects: linux-oem-6.14 (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2125201

Title:
  [noble] Fix system hang observed with comfy-ui

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-oem-6.14/+bug/2125201/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
