Hi,

On 30/1/26 22:10, Jordan Niethe wrote:
The existing design of device private memory imposes limitations which
render it non-functional on certain systems and configurations where
the physical address space is limited.

Device private memory is implemented by first reserving a region of the
physical address space. This is a problem: the physical address space is
not a resource that is directly under the kernel's control. Suitable
physical address space is constrained by the underlying hardware and
firmware and may not always be available.

Device private memory assumes that it will be able to reserve a device
memory sized chunk of physical address space. However, nothing
guarantees that this will succeed, and there are a number of factors
that increase the likelihood of failure. We need to consider what else
may exist in the physical address space. Certain VM configurations have
been observed to place very large PCI windows immediately after RAM,
large enough that no physical address space remains for device private
memory at all. This is more likely to occur on systems with a 43-bit
physical address width, which have less physical address space to begin
with.

Instead of using the physical address space, introduce a device private
address space and allocate devices regions from there to represent the
device private pages.

Introduce a new interface memremap_device_private_pagemap() that
allocates a requested amount of device private address space and creates
the necessary device private pages.
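
As a concrete sketch of the allocation path (illustrative only: the
exact signature and return convention are in the patch; drvdata,
device_size and drv_devmem_ops are made-up names, and an int-returning
variant is assumed here):

```c
/* Sketch only - usage inferred from the description above,
 * not copied from the patch. */
struct dev_pagemap *pgmap = &drvdata->pagemap;
int ret;

pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_pages = device_size >> PAGE_SHIFT;    /* in: pages requested */
pgmap->ops = &drv_devmem_ops;
pgmap->owner = drvdata;

ret = memremap_device_private_pagemap(pgmap);
if (ret)
        return ret;

/* pgmap->range now describes the reserved device private range, and
 * pgmap->pages holds the newly created device private pages. */
```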

To support this new interface, struct dev_pagemap needs some changes:

   - Add a new dev_pagemap::nr_pages field as an input parameter.
   - Add a new dev_pagemap::pages array to store the device
     private pages.

When using memremap_device_private_pagemap(), rather than the caller
passing in dev_pagemap::nr_ranges entries of physical address space in
dev_pagemap::ranges to be remapped, dev_pagemap::nr_ranges will always
be 1 and the device private range that is reserved is returned in
dev_pagemap::range.
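
Put together, the struct changes described above amount to roughly the
following (a sketch with unrelated fields elided; the exact types and
layout are in the patch):

```c
struct dev_pagemap {
        ...
        unsigned long nr_pages;  /* in: number of device private pages
                                  * requested (new field) */
        struct page *pages;      /* out: the created device private
                                  * pages (new field) */
        int nr_ranges;           /* always 1 for device private */
        union {
                struct range range;  /* out: the reserved device
                                      * private range */
                ...
        };
};
```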

Forbid calling memremap_pages() with dev_pagemap::ranges::type =
MEMORY_DEVICE_PRIVATE.

Represent this device private address space using a new
device_private_pgmap_tree maple tree. This tree maps a given device
private address to a struct dev_pagemap, where a specific device private
page may then be looked up in that dev_pagemap::pages array.
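
In other words, resolving a device private offset becomes a two-step
lookup, something like (illustrative sketch only, assuming the tree is
keyed by device private offset):

```c
/* Sketch: find the pagemap covering the offset in the global maple
 * tree, then index into that pagemap's pages array. */
static struct page *lookup_sketch(unsigned long dp_offset)
{
        struct dev_pagemap *pgmap;

        pgmap = mtree_load(&device_private_pgmap_tree, dp_offset);
        if (!pgmap)
                return NULL;

        /* index relative to the start of this pagemap's range */
        return &pgmap->pages[(dp_offset - pgmap->range.start) >> PAGE_SHIFT];
}
```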

Device private address space can be reclaimed and the associated device
private pages freed using the corresponding new
memunmap_device_private_pagemap() interface.
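
So a driver's teardown path simply pairs the two calls, e.g. (sketch;
drv_remove and the surrounding driver structure are illustrative):

```c
static void drv_remove(struct drv_device *d)
{
        /* Frees the device private pages and returns the device
         * private address space for reuse. */
        memunmap_device_private_pagemap(&d->pagemap);
}
```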

Because the device private pages now live outside the physical address
space, they no longer have a normal PFN. This means that page_to_pfn(),
et al. are no longer meaningful.

Introduce helpers:

   - device_private_page_to_offset()
   - device_private_folio_to_offset()

to take a given device private page / folio and return its offset within
the device private address space.

Update the places where we previously converted a device private page to
a PFN to use these new helpers. When we encounter a device private
offset, look its page up with device_private_offset_to_page() rather
than via the pagemap.
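
A before/after of the conversion pattern, as I read it (sketch only):

```c
/* Before: a device private page had a PFN in the reserved
 * physical region. */
unsigned long pfn = page_to_pfn(dpage);
struct page *dpage = pfn_to_page(pfn);

/* After: device private pages are identified by their offset in
 * the device private address space. */
unsigned long offset = device_private_page_to_offset(dpage);
struct page *dpage = device_private_offset_to_page(offset);
```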

Update the existing users:

  - lib/test_hmm.c
  - ppc ultravisor
  - drm/amd/amdkfd
  - gpu/drm/xe
  - gpu/drm/nouveau

to use the new memremap_device_private_pagemap() interface.

Acked-by: Felix Kuehling <[email protected]>
Reviewed-by: Zi Yan <[email protected]> # for MM changes
Signed-off-by: Jordan Niethe <[email protected]>
Signed-off-by: Alistair Popple <[email protected]>

---

Hi Matthew - the latest intel-xe CI run is passing: 
https://patchwork.freedesktop.org/series/159738/#rev8

Would you mind reviewing the gpu/drm/xe changes here and in:
- [PATCH v5 01/13] mm/migrate_device: Introduce migrate_pfn_from_page() helper
- [PATCH v5 03/13] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
- [PATCH v5 04/13] mm/migrate_device: Add migrate PFN flag to track device private pages

Thanks very much,
Jordan.
