[Kernel-packages] [Bug 2120454] Re: Pytorch reports incorrect GPU memory causing "HIP Out of Memory" errors

Mario Limonciello Wed, 13 Aug 2025 07:51:02 -0700

Questing is already on 6.15, so I guess that task can be closed now.

** Changed in: linux (Ubuntu Questing)
       Status: New => Fix Released


-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2120454

Title:
  Pytorch reports incorrect GPU memory causing "HIP Out of Memory"
  errors

Status in linux package in Ubuntu:
  Fix Released
Status in linux-oem-6.14 package in Ubuntu:
  Invalid
Status in linux source package in Noble:
  New
Status in linux-oem-6.14 source package in Noble:
  New
Status in linux source package in Plucky:
  New
Status in linux-oem-6.14 source package in Plucky:
  Invalid
Status in linux source package in Questing:
  Fix Released
Status in linux-oem-6.14 source package in Questing:
  Invalid

Bug description:
  When PyTorch runs on an APU, it reports the wrong amount of GPU memory
  and models fail to run:

  torch.OutOfMemoryError: HIP out of memory. Tried to allocate 18.00
  MiB. GPU 0 has a total capacity of 15.60 GiB of which 8.09 MiB is
  free. Of the allocated memory 15.10 GiB is allocated by PyTorch, and
  195.37 MiB is reserved by PyTorch but unallocated. If reserved but
  unallocated memory is large try setting
  PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid
  fragmentation.  See documentation for Memory Management
  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
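
A quick way to see the capacity PyTorch believes the device has is to query it directly. The following is only a minimal sketch: the helper names `to_gib` and `reported_gpu_memory` are made up for illustration, and it assumes a ROCm build of PyTorch, which exposes HIP devices through the `torch.cuda` namespace.

```python
def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024 ** 3


def reported_gpu_memory(device=0):
    """Return (free_bytes, total_bytes) as PyTorch sees them, or None if unavailable."""
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns a (free, total) tuple in bytes for the given device.
    return torch.cuda.mem_get_info(device)


if __name__ == "__main__":
    info = reported_gpu_memory()
    if info is None:
        print("PyTorch or a supported GPU is not available")
    else:
        free, total = info
        print(f"free {to_gib(free):.2f} GiB of {to_gib(total):.2f} GiB total")
```

The total printed here can be compared with the "total capacity" figure in the error message above.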

  
  These two commits need to be backported into amdkfd to fix it.

  commit 8b0d068e7dd1 ("drm/amdkfd: add a new flag to manage where VRAM allocations go")
  commit 759e764f7d58 ("drm/amdkfd: use GTT for VRAM on APUs only if GTT is larger")
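
The title of the second commit describes a size comparison between the GTT and VRAM pools. As a rough user-space illustration (not the kernel code itself), the sketch below reads the amdgpu sysfs attributes `mem_info_vram_total` and `mem_info_gtt_total` and reports which pool is larger. The `card0` path is an assumption, since the card index varies between systems, and the helper names `read_bytes` and `larger_pool` are hypothetical.

```python
from pathlib import Path

# Assumed location of the amdgpu memory attributes; the card index varies by system.
DEFAULT_DEVICE_DIR = Path("/sys/class/drm/card0/device")


def read_bytes(device_dir, name):
    """Return the integer byte count stored in a sysfs attribute file."""
    return int((Path(device_dir) / name).read_text().strip())


def larger_pool(device_dir=DEFAULT_DEVICE_DIR):
    """Compare GTT and VRAM sizes, mirroring the rule named in the commit title."""
    vram = read_bytes(device_dir, "mem_info_vram_total")
    gtt = read_bytes(device_dir, "mem_info_gtt_total")
    pool = "GTT" if gtt > vram else "VRAM"
    return pool, vram, gtt


if __name__ == "__main__":
    try:
        pool, vram, gtt = larger_pool()
    except (OSError, ValueError) as exc:
        print(f"could not read amdgpu memory info: {exc}")
    else:
        gib = 1024 ** 3
        print(f"VRAM {vram / gib:.2f} GiB, GTT {gtt / gib:.2f} GiB, larger pool: {pool}")
```

On an APU with a small VRAM carve-out and a large GTT size, this would be expected to report GTT as the larger pool.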

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2120454/+subscriptions


