From: Honglei Huang <[email protected]>

V5 of the SVM patch series for amdgpu based on the drm_gpusvm framework.
This revision addresses feedback from the meeting: keep attributes on
unmap, integrate amdgpu SVM into the amdgpu VM structure, drop
GPU_ALWAYS_MAPPED, add a RESET_ATTR operation, and add lock wrappers.

This patch series implements SVM support with the following design:

  1. Attributes separated from physical page management:

    - Attribute layer (amdgpu_svm_attr_tree): a driver-side interval
      tree storing per-range SVM attributes. Managed through SET_ATTR
      ioctl and preserved across range lifecycle events.

    - Physical page layer (drm_gpusvm ranges): managed by the
      drm_gpusvm framework, representing HMM-backed DMA mappings
      and GPU page table entries.

  2. GPU fault driven mapping (XNACK on):

    The core mapping path is driven by GPU page faults instead of ioctls.
    amdgpu_svm_handle_fault() looks up SVM by PASID, runs GC,
    resolves attributes, then maps via find_or_insert -> get_pages
    -> GPU PTE update. For unregistered addresses, default
    attributes are derived from VMA properties automatically.

  3. MMU notifier invalidation:

    Two-phase callback: event_begin() zaps GPU PTEs and flushes the
    TLB, event_end() unmaps DMA pages. UNMAP events queue ranges
    to GC for deferred cleanup. Non-UNMAP events (eviction) rely
    on GPU faults to remap.

  4. Garbage collector:

    The GC workqueue processes unmapped ranges by removing them
    from drm_gpusvm. Attributes are preserved across unmap
    and persist until explicitly reset by userspace via
    RESET_ATTR. There is no rebuild or restore logic; GPU faults
    handle recreation.
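
The interaction between the four design points above can be sketched as a
small userspace model. This is an illustration only, not driver code: every
demo_* name is hypothetical, a range is modeled as covering the whole
attribute interval, and unregistered addresses are rejected rather than
given VMA-derived defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the series. */
#define DEMO_MAX 8

struct demo_attr  { unsigned long start, last; int access; bool used; };
struct demo_range { unsigned long start, last; bool gpu_mapped, dma_mapped, in_gc, used; };

struct demo_svm {
	struct demo_attr attrs[DEMO_MAX];	/* attribute layer */
	struct demo_range ranges[DEMO_MAX];	/* physical page layer */
};

/* SET_ATTR analogue: record attributes for [start, last]. */
int demo_set_attr(struct demo_svm *svm, unsigned long start,
		  unsigned long last, int access)
{
	for (size_t i = 0; i < DEMO_MAX; i++) {
		if (!svm->attrs[i].used) {
			svm->attrs[i] = (struct demo_attr){ start, last, access, true };
			return 0;
		}
	}
	return -1;
}

struct demo_attr *demo_find_attr(struct demo_svm *svm, unsigned long addr)
{
	for (size_t i = 0; i < DEMO_MAX; i++) {
		struct demo_attr *a = &svm->attrs[i];

		if (a->used && addr >= a->start && addr <= a->last)
			return a;
	}
	return NULL;
}

/* GC is discard-only: it drops queued ranges and never touches attrs. */
static void demo_gc(struct demo_svm *svm)
{
	for (size_t i = 0; i < DEMO_MAX; i++)
		if (svm->ranges[i].used && svm->ranges[i].in_gc)
			svm->ranges[i] = (struct demo_range){ 0 };
}

/* Phase 1 of an invalidation: zap GPU PTEs (TLB flush not modeled). */
void demo_event_begin(struct demo_svm *svm, unsigned long start,
		      unsigned long last)
{
	for (size_t i = 0; i < DEMO_MAX; i++) {
		struct demo_range *r = &svm->ranges[i];

		if (r->used && r->start <= last && r->last >= start)
			r->gpu_mapped = false;
	}
}

/* Phase 2: unmap DMA pages; only UNMAP events queue the range to GC. */
void demo_event_end(struct demo_svm *svm, unsigned long start,
		    unsigned long last, bool is_unmap)
{
	for (size_t i = 0; i < DEMO_MAX; i++) {
		struct demo_range *r = &svm->ranges[i];

		if (r->used && r->start <= last && r->last >= start) {
			r->dma_mapped = false;
			if (is_unmap)
				r->in_gc = true;
		}
	}
}

/* Fault path: GC -> resolve attrs -> find or insert -> get pages -> map. */
int demo_handle_fault(struct demo_svm *svm, unsigned long addr)
{
	struct demo_range *r = NULL, *free_slot = NULL;
	struct demo_attr *a;

	demo_gc(svm);
	a = demo_find_attr(svm, addr);
	if (!a)
		return -1;	/* the series derives defaults from the VMA instead */
	for (size_t i = 0; i < DEMO_MAX; i++) {
		struct demo_range *c = &svm->ranges[i];

		if (c->used && addr >= c->start && addr <= c->last)
			r = c;
		else if (!c->used && !free_slot)
			free_slot = c;
	}
	if (!r) {
		if (!free_slot)
			return -1;
		r = free_slot;
		*r = (struct demo_range){ .start = a->start, .last = a->last, .used = true };
	}
	r->dma_mapped = true;	/* stands in for get_pages */
	r->gpu_mapped = true;	/* stands in for the GPU PTE update */
	return 0;
}

size_t demo_count_ranges(const struct demo_svm *svm)
{
	size_t n = 0;

	for (size_t i = 0; i < DEMO_MAX; i++)
		n += svm->ranges[i].used;
	return n;
}
```

In this model, an UNMAP invalidation followed by a new fault on the same
address drops the old range in GC and recreates it from the attribute that
was left in place, which is the behaviour the V5 attribute preservation is
meant to provide.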

Changes since V4:
  - Preserve attributes on unmap; GC only removes GPU ranges.
  - struct amdgpu_vm now holds a pointer to struct amdgpu_svm.
  - UAPI: Remove AMDGPU_SVM_ATTR_GPU_ALWAYS_MAPPED.
  - UAPI: Add AMDGPU_SVM_OP_RESET_ATTR to reset attributes to defaults.
  - Add amdgpu_svm_lock()/unlock()/assert_locked() inline wrappers.
  - Add amdgpu_svm_attr_check_vm_bo() for SVM/BO overlap detection.
  - Refactor attr change model.
  - Remove ATTR_ONLY/RANGE_SPLIT triggers.

Changes since V3:
  - UAPI: Merge ACCESS/ACCESS_IN_PLACE/NO_ACCESS into a single
    AMDGPU_SVM_ATTR_ACCESS attribute with enum amdgpu_ioctl_svm_access
    (INACCESSIBLE/IN_PLACE/ALLOW_MIGRATE).
  - UAPI: Replaced SET_FLAGS/CLR_FLAGS with per-flag boolean attribute
    types: HOST_ACCESS, COHERENT, EXT_COHERENT, HIVE_LOCAL, GPU_RO,
    GPU_EXEC, GPU_READ_MOSTLY, GPU_ALWAYS_MAPPED.
  - UAPI: Replaced all #define constants with C enums.
  - Add UAPI documentation with kerneldoc comments in amdgpu_drm.h.
  - Moved flag bits from UAPI AMDGPU_SVM_FLAG_* to internal
    AMDGPU_SVM_ATTR_BIT_* bitmap. Added attr_flag_type_to_bit() helper.
  - Removed AMDGPU_SVM_VALID_FLAG_MASK and flags_or.
  - Condensed commit messages and cover letter content.
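
A minimal sketch of the per-flag attribute handling listed above, assuming a
helper in the spirit of attr_flag_type_to_bit(). The enum names and values
below are hypothetical and do not reflect the actual UAPI or the internal
AMDGPU_SVM_ATTR_BIT_* layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for UAPI attribute types. */
enum demo_attr_type {
	DEMO_ATTR_ACCESS,	/* not a boolean flag: carries an enum value */
	DEMO_ATTR_HOST_ACCESS,
	DEMO_ATTR_COHERENT,
	DEMO_ATTR_GPU_RO,
	DEMO_ATTR_GPU_EXEC,
};

/* Hypothetical internal bit positions, kept out of the UAPI. */
enum demo_attr_bit {
	DEMO_BIT_HOST_ACCESS,
	DEMO_BIT_COHERENT,
	DEMO_BIT_GPU_RO,
	DEMO_BIT_GPU_EXEC,
};

/* Map a boolean attribute type to its bitmap bit, or -1 if it is not a flag. */
int demo_flag_type_to_bit(enum demo_attr_type type)
{
	switch (type) {
	case DEMO_ATTR_HOST_ACCESS:
		return DEMO_BIT_HOST_ACCESS;
	case DEMO_ATTR_COHERENT:
		return DEMO_BIT_COHERENT;
	case DEMO_ATTR_GPU_RO:
		return DEMO_BIT_GPU_RO;
	case DEMO_ATTR_GPU_EXEC:
		return DEMO_BIT_GPU_EXEC;
	default:
		return -1;
	}
}

/* Set or clear one flag in the internal bitmap from a boolean attribute. */
int demo_apply_flag(uint32_t *flags, enum demo_attr_type type, uint32_t value)
{
	int bit = demo_flag_type_to_bit(type);

	if (bit < 0)
		return -1;
	if (value)
		*flags |= UINT32_C(1) << bit;
	else
		*flags &= ~(UINT32_C(1) << bit);
	return 0;
}
```

With this shape, userspace only ever names a flag and a boolean value, so the
internal bit assignment can change without affecting the UAPI.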

Changes since V2:
  - Add version title in commit message.
  - Fix some content mistakes.

Changes since V1:
  - Added GPU fault handler (amdgpu_svm_handle_fault) with PASID-based
    SVM lookup: GC -> find_or_insert -> get_pages -> GPU map.
  - Removed restore worker queue; GPU fault recreates ranges on demand.
    GC simplified to discard-only, no rebuild/restore logic.
  - Reworked MMU notifier to two-phase model: event_begin() zaps PTEs
    and flushes TLB, event_end() unmaps DMA and queues UNMAP to GC.
    Removed begin_restore/end_restore and NOTIFIER flag dispatch.
  - Added invalidate_interval() for attribute boundary realignment
    when sub-region attribute changes cross existing GPU ranges.
  - On MMU_NOTIFY_UNMAP, discard all affected ranges entirely;
    attribute layer preserves valid attrs, fault recreates on demand.
  - Added unregistered address attribute derivation from VMA properties
    for ROCm compatibility (kfd/rocr/hip tests).
  - Dropped XNACK off support; returns -EOPNOTSUPP when XNACK is disabled.
    Removed kgd2kfd_quiesce_mm()/resume_mm() dependency.
  - Added TRIGGER_RANGE_SPLIT, TRIGGER_PREFETCH change triggers.
  - Added helpers: find_locked, get_bounds_locked, set_default.

TODO:
  - Add multi GPU support.
  - Add XNACK off mode.

Related work:
  - SVM migration and prefetch support is being developed in:
    https://lore.kernel.org/amd-gfx/[email protected]/
  - ROCm UMD interface adaptation for the new drm SVM API is being developed in:
    https://github.com/ROCm/rocm-systems/pull/4364


Test results:
  Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
  - KFD test: 95%+ passed.
  - ROCR test: all passed.
  - HIP catch test: gfx943 (MI300X): 96% passed.
                    gfx906 (MI60): 99% passed.

Patch overview:

  01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl, enum-based SVM operations
        (SET_ATTR/GET_ATTR/RESET_ATTR), access modes, location values,
        attribute types with kerneldoc in amdgpu_drm.h.

  02/12 Core header: amdgpu_svm wrapping drm_gpusvm with refcount,
        attr_tree, GC struct, locks, VM integration hooks, and
        lock/unlock/assert_locked inline helpers.

  03/12 Attribute types: amdgpu_svm_attrs, attr_range (interval tree
        node), attr_tree, internal ATTR_BIT flag bitmap, change
        triggers, inline helpers (attr_start/end/has_access).

  04/12 Attribute tree ops: interval tree lookup, insert, remove,
        find_locked, get_bounds_locked, set_default, and lifecycle.

  05/12 Attribute set/get/clear/reset: validate UAPI attributes,
        apply to tree with head/tail splitting, BO overlap check,
        change propagation via apply_attr_change, and query.

  06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range
        with gpu_mapped state, pending ops, work queue linkage,
        and op_ctx for batch processing.

  07/12 Range GPU mapping: PTE flags computation, GPU page table
        update, range mapping loop, map_attrs public API.

  08/12 Notifier and GC helpers: two-phase notifier events, range
        removal, GC enqueue/add, dequeue helpers, invalidate_interval.

  09/12 Notifier invalidate callback: drm_gpusvm_ops.invalidate
        dispatch with TLB flush batching, checkpoint timestamp.

  10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init
        with chunk sizes (2M/64K/4K), XNACK detection, GC init,
        PASID lookup, TLB flush, attr_change_trigger computation,
        centralized apply_attr_change, and init/close/fini lifecycle.

  11/12 Ioctl entry and fault handler: ioctl dispatcher
        (op_set_attr/op_get_attr/op_reset_attr), GC worker, and
        amdgpu_svm_fault.c with full fault path including BO overlap
        narrowing, unregistered attribute derivation, and retry logic.

  12/12 Build integration: Kconfig (CONFIG_DRM_AMDGPU_SVM), Makefile
        rules, ioctl registration, and amdgpu_vm fault dispatch.
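
As an illustration of how the 2M/64K/4K chunk sizes from patch 10/12 might be
used, the sketch below picks the largest chunk that, once aligned around the
faulting address, still fits inside the allowed bounds. It is a simplified
userspace model with hypothetical names; the real selection in drm_gpusvm
applies additional checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk sizes in descending order: 2M, 64K, 4K. */
static const uint64_t demo_chunk_sizes[] = {
	UINT64_C(2) << 20, UINT64_C(64) << 10, UINT64_C(4) << 10,
};

/*
 * Return the largest chunk size whose aligned window around addr lies
 * within [lo, hi), storing the window start in *start. Returns 0 if even
 * the smallest chunk does not fit.
 */
uint64_t demo_pick_chunk(uint64_t addr, uint64_t lo, uint64_t hi,
			 uint64_t *start)
{
	size_t n = sizeof(demo_chunk_sizes) / sizeof(demo_chunk_sizes[0]);

	for (size_t i = 0; i < n; i++) {
		uint64_t size = demo_chunk_sizes[i];
		uint64_t s = addr & ~(size - 1);	/* align down */

		if (s >= lo && s + size <= hi) {
			*start = s;
			return size;
		}
	}
	return 0;
}
```

Trying sizes from largest to smallest favours large mappings where the bounds
allow them and falls back to 4K pages near the edges of an interval.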

Honglei Huang (12):
  drm/amdgpu: add SVM ioctl UAPI definitions
  drm/amdgpu: add SVM core header and VM integration
  drm/amdgpu: add SVM attribute subsystem types
  drm/amdgpu: implement SVM attribute tree and helper functions
  drm/amdgpu: implement SVM attribute set/get/clear operations
  drm/amdgpu: add SVM range types and work queue interface
  drm/amdgpu: implement SVM range GPU mapping core
  drm/amdgpu: implement SVM range notifier and GC helpers
  drm/amdgpu: add SVM notifier invalidate callback and checkpoint
  drm/amdgpu: implement SVM initialization and lifecycle
  drm/amdgpu: add SVM ioctl entry and fault handler module
  drm/amdgpu: integrate SVM into build system and VM fault path

 drivers/gpu/drm/amd/amdgpu/Kconfig            |  11 +
 drivers/gpu/drm/amd/amdgpu/Makefile           |  13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c       | 585 +++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h       | 183 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c  | 986 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h  | 174 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 386 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h |  39 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 787 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 148 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  20 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   4 +
 include/uapi/drm/amdgpu_drm.h                 | 106 ++
 14 files changed, 3443 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h

-- 
2.34.1
