** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.14 (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.14 (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Noble)
       Status: New => Won't Fix

** Changed in: linux-oem-6.14 (Ubuntu Questing)
       Status: New => Invalid

** Description changed:

  [SRU Justification]
  
  [Impact]
  
  When running PyTorch with some models whose compute takes a long time,
  hangs can be observed on Strix Halo.
  
  The hangs occur in the MES scheduler. A workaround has been developed
  for this situation in the MES scheduler and in the GPU driver.
  
  [Fix]
  The MES scheduler change is in MES 0x7f for GFX11 products and 0x82 for
  GFX12 products.
  
- The kernel change is:
+ * The kernel change is:
  https://lore.kernel.org/amd-gfx/20250919004800.125555-1-supe...@kernel.org/
+ * For OEM 6.14 this also has a dependency on https://git.kernel.org/torvalds/c/15d8c92f107c1 to cleanly backport.
  
  The kernel change will only be enabled if new enough MES scheduler
  microcode is installed.
  
  [Test Case]
  Run PyTorch and ensure that the system doesn't hang.
  Run some games in Steam and ensure that the system doesn't hang.
  
  [Where problems can go wrong]
  The workaround applies to all jobs sent to the MES scheduler. It will be
  localized to GFX11 and GFX12 machines.
  
  If this change caused a problem, it could manifest as a system hang.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2125201

Title:
  [noble] Fix system hang observed with comfy-ui

Status in linux package in Ubuntu:
  New
Status in linux-oem-6.14 package in Ubuntu:
  Invalid
Status in linux source package in Noble:
  Won't Fix
Status in linux-oem-6.14 source package in Noble:
  New
Status in linux source package in Questing:
  New
Status in linux-oem-6.14 source package in Questing:
  Invalid

Bug description:
  [SRU Justification]

  [Impact]

  When running PyTorch with some models whose compute takes a long time,
  hangs can be observed on Strix Halo.

  The hangs occur in the MES scheduler. A workaround has been developed
  for this situation in the MES scheduler and in the GPU driver.

  [Fix]
  The MES scheduler change is in MES 0x7f for GFX11 products and 0x82 for
  GFX12 products.

  * The kernel change is:
    https://lore.kernel.org/amd-gfx/20250919004800.125555-1-supe...@kernel.org/
  * For OEM 6.14 this also has a dependency on
    https://git.kernel.org/torvalds/c/15d8c92f107c1 to cleanly backport.

  The kernel change will only be enabled if new enough MES scheduler
  microcode is installed.
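
  As an illustration only, the version gating described above could be
  sketched as follows. The names (MIN_MES_VERSION, workaround_enabled, the
  "gfx11"/"gfx12" keys) are hypothetical and are not taken from the amdgpu
  driver; the thresholds are the MES versions quoted in [Fix], and treating
  "new enough" as "greater than or equal to" is an assumption.

```python
# Hypothetical sketch: these names are illustrative, not the amdgpu driver's.
# Thresholds are the MES scheduler versions quoted in the [Fix] section.
MIN_MES_VERSION = {
    "gfx11": 0x7F,
    "gfx12": 0x82,
}


def workaround_enabled(gfx_family: str, mes_version: int) -> bool:
    """Return True if the workaround would be enabled for this firmware."""
    minimum = MIN_MES_VERSION.get(gfx_family)
    if minimum is None:
        # The change is localized to GFX11 and GFX12; other families skip it.
        return False
    # Assumption: "new enough" means at least the quoted version.
    return mes_version >= minimum


if __name__ == "__main__":
    samples = [("gfx11", 0x7E), ("gfx11", 0x7F), ("gfx12", 0x82), ("gfx10", 0xFF)]
    for family, version in samples:
        print(family, hex(version), workaround_enabled(family, version))
```

  With older microcode the function returns False, which mirrors the
  statement that the kernel change stays disabled in that case.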

  [Test Case]
  Run PyTorch and ensure that the system doesn't hang.
  Run some games in Steam and ensure that the system doesn't hang.

  [Where problems can go wrong]
  The workaround applies to all jobs sent to the MES scheduler. It will be
  localized to GFX11 and GFX12 machines.

  If this change caused a problem, it could manifest as a system hang.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2125201/+subscriptions

