gpusvm: split MM and device state across

Huang, Honglei Wed, 10 Jun 2026 01:49:50 -0700



On 6/10/2026 11:44 AM, Matthew Brost wrote:

On Wed, Jun 03, 2026 at 02:56:15PM +0800, Honglei Huang wrote:

From: Honglei Huang <[email protected]>

The intent of this series is to make drm_gpusvm more flexible and
give drivers more freedom over how they assemble the MM related and device
side operations.

This RFC implements the direction Matt suggested in [1]:

   - Move struct drm_gpusvm_pages out of struct drm_gpusvm_range.
   - Embed either a struct device or a struct drm_device in struct
     drm_gpusvm_pages.
   - Drop struct drm_device from struct drm_gpusvm.
   - Have the driver's range structure embed one or more struct
     drm_gpusvm_pages in addition to struct drm_gpusvm_range.
   - Refactor a few range-based helpers (drm_gpusvm_range_pages_valid,
     drm_gpusvm_range_get_pages, drm_gpusvm_range_unmap_pages), or
     simply drop them entirely and update drivers to use the
     drm_gpusvm_pages helpers instead.


Overall this looks good - thanks doing this.

In essence the series does only two abstractions, plus the xe
adaptation that follows from them:

   - range vs pages: split drm_gpusvm_range (MM / VA range state) from
     drm_gpusvm_pages (device physical related), so the two
     sides can have independent lifetimes and ownership.
   - drm_gpusvm vs drm_device: make drm_gpusvm pure MM level and push
     the device side down onto drm_gpusvm_pages, which is where DMA
     actually happens.
   - xe is updated to fit the modifications, no functional change intended.

If such changes are acceptable in terms of direction, I have a few questions:

  - Drivers now own drm_gpusvm_pages unmap / free and notifier_seq init.
    OK to push this fully to drivers, or should some new mechanisms need to add
    to ensure functions can be completed by the framework?


I'm looking at the diff of xe_svm.c before / after and I see
drm_gpusvm_free_pages moved to xe_svm_range_free. That looks fine to me.

I see in xe_svm_range_alloc() this:

range->pages.notifier_seq = LONG_MAX;

Can we make help like drm_gpusvm_init_pages which does this? I think it
is better to encapsulate the pages init into normalized helper even
though it is very simple. Maybe an inline since this just a single line
of code?


Got it will add helper drm_gpusvm_init_pages in next version.

  - This series drops the three drm_gpusvm_range_* helpers and changes
    drm_gpusvm_get_pages() / drm_gpusvm_init() signatures.
    Do we need to keep thin wrappers for backward compatibility.


It should be safe to drop these helpers.


Got it.

  - drm_gpusvm_get_pages() mixes HMM fault and device DMA map. Multi device 
under
    one SVM calls would repeat the HMM fault. Does it need to modified to Split
    into MM level fault + per pages DMA map?


Hmm, this might get a little tricky because of how the allocation/retry
loop is implemented in get_pages(). Maybe we could change the function
to accept an array of pages plus a count? I’m not sure what the best
approach is here, but I’m open to ideas. That said, I’d rather avoid
having the driver open-code a retry loop if it could live in common
code.

Side note: another modification we need in get_pages() is to make the
DMA-mapping step optional. I suggested that AMDXDNA use GPU SVM for
userptr, and I don’t believe that device requires DMA mapping.

Got it, will try to change the get_pages() to accept array of pages anda count, and see if it will work well.

For the no DMA mapping condition, maybe a flag like no_dma_mapping canbe added in drm_gpusvm_ctx or someplace else.


Regards,
Honglei

Patch overview:

   1/5 gpusvm: split MM state flags onto drm_gpusvm_range_flags.
   2/5 gpusvm: embed drm_device into drm_gpusvm_pages; DMA goes
       through it.
   3/5 xe: xe_svm_range owns its drm_gpusvm_pages and its lifecycle.
   4/5 gpusvm: drop pages from drm_gpusvm_range and the range-level
       wrappers.
   5/5 gpusvm: drop drm_device from drm_gpusvm.

tests:
AMDGPU:
   based on amdgpu adaptation patch in [2], but still SVM:DRM = 1:1,
   1:n is on going needs many modifications and testings.

   Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on/off:
   - KFD test: 95%+ passed.
   - ROCR test: all passed.
   - HIP catch test: gfx943 (MI300X): 96% passed.
                     gfx906 (MI60): 99% passed.
INTEL XE:
   TODO: We bought some Intel Arc A380, but it seems like this cards
   don't support hardware fault / SVM, waiting for the new
   cards B580/B570 to arrive.


Please send patches that modify GPU SVM or Xe to the Xe mailing list. We
have public CI, which I believe can be triggered by any AMD email
address.

I just pulled the code, encountered a compile error, and noticed a bug
around unmapping related to that error. I put together some quick fixes
on top of the series here [3], and locally all of our tests seem to be
passing.

I’ll reply in detail to the patches shortly, explaining some of the
reasoning behind these changes.

Matt

[3] 
https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-svn-perf-6-15-2025/-/commit/623f6a50c037d9e44f6c9fbe6859a0ba7ad50177

links:
[1] 
https://lore.kernel.org/amd-gfx/acRgr7QwdULsn6G2@gsse-cloud1/#:~:text=I%20think%20roughly,drm_gpusvm_pages%0A%20%20helpers%20instead.
[2] 
https://lore.kernel.org/amd-gfx/[email protected]/
Honglei Huang (5):
   drm/gpusvm: split MM state flags out of drm_gpusvm_pages_flags
   drm/gpusvm: embed struct drm_device into drm_gpusvm_pages
   drm/xe: have xe_svm_range embed one drm_gpusvm_pages
   drm/gpusvm: move struct drm_gpusvm_pages out of struct
     drm_gpusvm_range
   drm/gpusvm: let the drm_gpusvm core context purely MM level

  drivers/gpu/drm/drm_gpusvm.c    | 128 +++++++++-----------------------
  drivers/gpu/drm/xe/xe_pt.c      |   2 +-
  drivers/gpu/drm/xe/xe_svm.c     |  37 +++++----
  drivers/gpu/drm/xe/xe_svm.h     |  11 ++-
  drivers/gpu/drm/xe/xe_userptr.c |   1 +
  include/drm/drm_gpusvm.h        |  49 ++++++------
  6 files changed, 95 insertions(+), 133 deletions(-)

--
2.34.1

Re: [RFC 0/5] drm/gpusvm: split MM and device state across

Reply via email to