Public bug reported:

This is a tracking bug for inclusion of the upstream Linux kernel commit
"drm/amdkfd: relax checks for over allocation of save area" into [Q/R]
linux packages.

SRU Justification:

[Impact]

This fixes hangs seen on certain AMD Strix Halo APUs when they are
paired with particular versions of AMD ROCm.

[Fix]

Include commit d15deafab5d722afb9e2f83c5edcdef9d9d98bd1 from the
upstream mainline Linux kernel in our generic kernels.
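
One way to confirm that a given kernel tree already carries the patch is to
search its history for the commit subject. The sketch below is illustrative
only: the helper name has_patch and the tree path in the usage comment are
hypothetical, and it assumes the backport keeps the upstream subject line
unchanged. It searches by subject rather than by hash because a cherry-picked
commit gets a new hash in the target tree.

```shell
#!/bin/sh
# Subject of the upstream commit named in this bug.
FIX_SUBJECT="drm/amdkfd: relax checks for over allocation of save area"

# has_patch TREE SUBJECT  (hypothetical helper, not part of any tool)
# Exit status 0 if a commit reachable from HEAD in TREE has a message
# containing SUBJECT as a literal string, non-zero otherwise.
has_patch() {
    [ -n "$(git -C "$1" log --format=%h -n 1 --fixed-strings --grep="$2" 2>/dev/null)" ]
}

# Usage (the path is an assumption, adjust to your checkout):
#   has_patch ~/src/questing-linux "$FIX_SUBJECT" && echo "fix present"
```

A match only shows that a commit with that subject is reachable from HEAD. It
does not prove the running kernel was built from that tree.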

[Test Plan]

The reproducer script below should be sufficient to confirm the bug is
no longer present:

export PYTORCH_ROCM_ARCH=gfx1151
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
uv init --python 3.13
uv add --index therock_nightly=https://rocm.nightlies.amd.com/v2/gfx1151/ \
  --index-strategy unsafe-best-match --prerelease allow \
  "rocm-sdk-core==7.10.0a20251015" "rocm[libraries]==7.10.0a20251015" \
  "torch==2.10.0a0+rocm7.10.0a20251015" \
  "torchvision==0.25.0a0+rocm7.10.0a20251015" \
  "torchaudio==2.8.0a0+rocm7.10.0a20251015" \
  "pytorch-triton-rocm==3.5.0+gitb0cf18f2.rocm7.10.0a20251015"
uv add git+https://github.com/huggingface/diffusers.git
uv add git+https://github.com/ivanfioravanti/qwen-image-mps.git
uv run qwen-image-mps edit -i mushroom1.png mouse1.png \
  -p "The mouse is under the mushroom."

(source:
https://github.com/ROCm/ROCm/issues/5590#issuecomment-3481580910)


We don't have any Strix Halo hardware to test this fix on, so the reproducer
above has not been run yet. I am investigating some Kraken Point APUs that may
be suitable, provided I can provision them with a Questing kernel. At present
they install the OEM kernel, so they always come up on Noble. WIP.

[Where problems could occur]

This could introduce regressions for folks utilizing Strix Halo APUs
with ROCm. They may experience issues with version desync, as this patch
is intended to be paired with a user-space update of ROCm. (See
https://github.com/ROCm/rocm-systems/commit/770f30bc4c72d763742e39932e2c0583813d531f
for details on the exact version.)

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Benjamin Wheeler (benjaminwheeler)
         Status: New

** Affects: linux (Ubuntu Questing)
     Importance: Undecided
     Assignee: Benjamin Wheeler (benjaminwheeler)
         Status: New

** Affects: linux (Ubuntu Resolute)
     Importance: Undecided
     Assignee: Benjamin Wheeler (benjaminwheeler)
         Status: New

** Also affects: linux (Ubuntu Questing)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Resolute)
   Importance: Undecided
     Assignee: Benjamin Wheeler (benjaminwheeler)
       Status: New

** Changed in: linux (Ubuntu Questing)
     Assignee: (unassigned) => Benjamin Wheeler (benjaminwheeler)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2133740

Title:
  drm/amdkfd: relax checks for over allocation of save area
