From: Honglei Huang <[email protected]>

V6 of the SVM patch series for amdgpu based on the drm_gpusvm framework.
This revision makes the GPU PTE flag computation per-DMA-segment and
protocol-aware, and adds more notifier and GC helpers.

This patch series implements SVM support with the following design:
  1. Attributes separated from physical page management.
  2. GPU fault driven mapping (XNACK on).
  3. MMU notifier invalidation.
  4. Garbage collector workqueue.

Changes since V5:
  - Compute PTE flags per DMA segment; stop caching them on svm_range.
  - attr_pte_flags() now takes enum drm_interconnect_protocol.
  - Add AMDGPU_INTERCONNECT_VRAM / _P2P tags in amdgpu_svm.h.
  - Rework zap_ptes to take explicit page range.
  - Add amdgpu_svm_range_evict(): devmem-aware wrapper around
    drm_gpusvm_range_evict.
  - get_pages(): fall back to evict on -EOPNOTSUPP.
  - Simplify cover letter design content.
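
As a rough illustration of the per-segment flag computation listed above,
the userspace sketch below picks PTE flags for each DMA segment from its
interconnect protocol instead of caching one value on the range. All names
(sketch_protocol, SKETCH_PTE_*, sketch_pte_flags) and all bit values are
hypothetical stand-ins, not the definitions used in the patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the interconnect tags; not the driver's values. */
enum sketch_protocol {
	SKETCH_PROTO_SYSTEM,	/* system memory reached through a DMA mapping */
	SKETCH_PROTO_VRAM,	/* local device memory */
	SKETCH_PROTO_P2P,	/* peer device memory */
};

/* Hypothetical PTE bits chosen only for this example. */
#define SKETCH_PTE_VALID	(1ull << 0)
#define SKETCH_PTE_SYSTEM	(1ull << 1)
#define SKETCH_PTE_SNOOPED	(1ull << 2)
#define SKETCH_PTE_WRITEABLE	(1ull << 3)

struct sketch_segment {
	uint64_t dma_addr;
	uint64_t npages;
	enum sketch_protocol proto;
};

/* Derive flags from the range attributes plus the segment's protocol. */
static uint64_t sketch_pte_flags(bool read_only, enum sketch_protocol proto)
{
	uint64_t flags = SKETCH_PTE_VALID;

	if (!read_only)
		flags |= SKETCH_PTE_WRITEABLE;
	if (proto == SKETCH_PROTO_SYSTEM)
		flags |= SKETCH_PTE_SYSTEM | SKETCH_PTE_SNOOPED;
	else if (proto == SKETCH_PROTO_P2P)
		flags |= SKETCH_PTE_SYSTEM;
	return flags;
}

/* Fill out[i] for every segment; nothing is cached on the range itself. */
static void sketch_flags_for_segments(const struct sketch_segment *segs,
				      size_t n, bool read_only, uint64_t *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = sketch_pte_flags(read_only, segs[i].proto);
}
```

Computing the flags per segment lets a single range that mixes system
memory and device memory get different flags for each part.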

Changes since V4:
  - Preserve attributes on unmap; GC only removes GPU ranges.
  - struct amdgpu_vm now holds a pointer to struct amdgpu_svm.
  - UAPI: Remove AMDGPU_SVM_ATTR_GPU_ALWAYS_MAPPED.
  - UAPI: Add AMDGPU_SVM_OP_RESET_ATTR to reset attributes to defaults.
  - Add amdgpu_svm_lock()/unlock()/assert_locked() inline wrappers.
  - Add amdgpu_svm_attr_check_vm_bo() for SVM/BO overlap detection.
  - Refactor attr change model.
  - Remove ATTR_ONLY/RANGE_SPLIT triggers.

Changes since V3:
  - UAPI: Merge ACCESS/ACCESS_IN_PLACE/NO_ACCESS into a single
    AMDGPU_SVM_ATTR_ACCESS attribute with enum amdgpu_ioctl_svm_access
    (INACCESSIBLE/IN_PLACE/ALLOW_MIGRATE).
  - UAPI: Replaced SET_FLAGS/CLR_FLAGS with per-flag boolean attribute
    types: HOST_ACCESS, COHERENT, EXT_COHERENT, HIVE_LOCAL, GPU_RO,
    GPU_EXEC, GPU_READ_MOSTLY, GPU_ALWAYS_MAPPED.
  - UAPI: Replaced all #define constants with C enums.
  - Add UAPI documentation with kerneldoc comments in amdgpu_drm.h.
  - Moved flag bits from UAPI AMDGPU_SVM_FLAG_* to internal
    AMDGPU_SVM_ATTR_BIT_* bitmap. Added attr_flag_type_to_bit() helper.
  - Removed AMDGPU_SVM_VALID_FLAG_MASK and flags_or.
  - Condensed commit messages and cover letter content.
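
For readers unfamiliar with the per-flag attribute model, a minimal sketch
of mapping a boolean attribute type to an internal bitmap bit might look
like the following. The enum, the bit macros and both functions are
hypothetical names invented for this example; the real definitions live in
the patches and may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute types; only some of them are boolean flags. */
enum sketch_attr_type {
	SKETCH_ATTR_ACCESS,		/* enum-valued, not a boolean flag */
	SKETCH_ATTR_HOST_ACCESS,
	SKETCH_ATTR_COHERENT,
	SKETCH_ATTR_GPU_RO,
};

/* Hypothetical internal bitmap bits. */
#define SKETCH_ATTR_BIT_HOST_ACCESS	(1u << 0)
#define SKETCH_ATTR_BIT_COHERENT	(1u << 1)
#define SKETCH_ATTR_BIT_GPU_RO		(1u << 2)

/* Return the internal bit for a boolean attribute type, or 0 if none. */
static uint32_t sketch_flag_type_to_bit(enum sketch_attr_type type)
{
	switch (type) {
	case SKETCH_ATTR_HOST_ACCESS:
		return SKETCH_ATTR_BIT_HOST_ACCESS;
	case SKETCH_ATTR_COHERENT:
		return SKETCH_ATTR_BIT_COHERENT;
	case SKETCH_ATTR_GPU_RO:
		return SKETCH_ATTR_BIT_GPU_RO;
	default:
		return 0;
	}
}

/* Set or clear one flag in the bitmap; returns false for non-flag types. */
static bool sketch_apply_flag(uint32_t *flags, enum sketch_attr_type type,
			      bool value)
{
	uint32_t bit = sketch_flag_type_to_bit(type);

	if (!bit)
		return false;
	if (value)
		*flags |= bit;
	else
		*flags &= ~bit;
	return true;
}
```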

Changes since V2:
  - Add version title in commit message.
  - Fix some content mistakes.

Changes since V1:
  - Added GPU fault handler (amdgpu_svm_handle_fault) with PASID-based
    SVM lookup: GC -> find_or_insert -> get_pages -> GPU map.
  - Removed restore worker queue; GPU fault recreates ranges on demand.
    GC simplified to discard-only, no rebuild/restore logic.
  - Reworked MMU notifier to two-phase model: event_begin() zaps PTEs
    and flushes TLB, event_end() unmaps DMA and queues UNMAP to GC.
    Removed begin_restore/end_restore and NOTIFIER flag dispatch.
  - Added invalidate_interval() for attribute boundary realignment
    when sub-region attribute changes cross existing GPU ranges.
  - On MMU_NOTIFY_UNMAP, discard all affected ranges entirely;
    attribute layer preserves valid attrs, fault recreates on demand.
  - Added unregistered address attribute derivation from VMA properties
    for ROCm compatibility (kfd/rocr/hip tests).
  - Dropped XNACK off support; returns -EOPNOTSUPP when XNACK is disabled.
    Removed kgd2kfd_quiesce_mm()/resume_mm() dependency.
  - Added TRIGGER_RANGE_SPLIT, TRIGGER_PREFETCH change triggers.
  - Added helpers: find_locked, get_bounds_locked, set_default.
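
The two-phase notifier model introduced in V2 can be pictured with the
simplified userspace model below. It only mirrors the ordering described
above (zap and flush first, then unmap DMA and hand off to the GC). The
structures and function names are hypothetical, and the single batched
flush per event is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_range {
	bool gpu_mapped;	/* GPU PTEs currently point at the pages */
	bool dma_mapped;	/* pages are currently DMA mapped */
	bool queued_for_gc;	/* UNMAP work handed to the garbage collector */
};

struct sketch_event {
	struct sketch_range **ranges;
	size_t nranges;
	bool is_unmap;		/* models an MMU_NOTIFY_UNMAP event */
	unsigned int tlb_flushes;
};

/* Phase 1: stop GPU access first: zap PTEs, then one batched TLB flush. */
static void sketch_event_begin(struct sketch_event *ev)
{
	bool zapped = false;

	for (size_t i = 0; i < ev->nranges; i++) {
		if (ev->ranges[i]->gpu_mapped) {
			ev->ranges[i]->gpu_mapped = false;
			zapped = true;
		}
	}
	if (zapped)
		ev->tlb_flushes++;
}

/* Phase 2: release DMA mappings; on unmap, leave removal to the GC. */
static void sketch_event_end(struct sketch_event *ev)
{
	for (size_t i = 0; i < ev->nranges; i++) {
		ev->ranges[i]->dma_mapped = false;
		if (ev->is_unmap)
			ev->ranges[i]->queued_for_gc = true;
	}
}
```

Because the PTEs are zapped and the TLB flushed before any DMA mapping is
released, the GPU cannot reach pages that are about to go away.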

TODO:
  - Add multi-GPU support.
  - Add XNACK off mode.

Related work:
  - SVM migration and prefetch support is being developed in:
    https://lore.kernel.org/amd-gfx/[email protected]/
  - ROCm UMD interface adaptation for the new drm SVM API is being developed in:
    https://github.com/ROCm/rocm-systems/pull/4364


Test results:
  Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
  - KFD test: 95%+ passed.
  - ROCR test: all passed.
  - HIP catch test: gfx943 (MI300X): 96% passed.
                    gfx906 (MI60): 99% passed.

Patch overview:
  01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl with SET/GET/RESET_ATTR ops,
        access modes, location values and attribute types.
  02/12 Core header: amdgpu_svm wrapping drm_gpusvm with attr_tree,
        GC, locks and VM integration hooks.
  03/12 Attribute types: attrs, attr_range, attr_tree, internal flag
        bitmap, change triggers and inline helpers.
  04/12 Attribute tree ops: interval tree lookup/insert/remove,
        find_locked, get_bounds_locked, set_default.
  05/12 Attribute set/get/clear/reset: validate, apply with head/tail
        splitting, BO overlap check, change propagation.
  06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range with
        gpu_mapped state, pending ops and interconnect tags.
  07/12 Range GPU mapping: per-segment, protocol-aware PTE flags;
        zap_ptes on explicit page window; get_pages evict fallback;
        coalesced DMA segments programmed under notifier lock.
  08/12 Notifier and GC helpers: two-phase events, range removal,
        GC enqueue/dequeue, invalidate_interval with devmem eviction
        to preserve VRAM data on boundary crossing.
  09/12 Notifier invalidate callback: dispatch with TLB flush
        batching and checkpoint timestamp.
  10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init,
        XNACK detection, GC init, PASID lookup, TLB flush and
        init/close/fini.
  11/12 Ioctl entry and fault handler: op dispatcher, GC worker and
        fault path (attrs + read_only, BO overlap narrowing,
        unregistered attr derivation, retry logic).
  12/12 Build integration: Kconfig, Makefile, ioctl registration and
        amdgpu_vm fault dispatch.
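
To make the head/tail splitting mentioned for patch 05/12 concrete, here is
a minimal sketch of applying a new attribute to a sub-interval of one
existing attribute range. The struct and function names are hypothetical
and the attribute is reduced to a single opaque value; the real code works
on an interval tree and carries full attribute sets.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_attr_range {
	uint64_t start;	/* first page, inclusive */
	uint64_t last;	/* last page, inclusive */
	uint32_t attr;	/* opaque attribute value */
};

/*
 * Apply new_attr to [start, last] inside old, writing up to three pieces
 * (head, middle, tail) to out. Returns the number of pieces written.
 * The caller must ensure old->start <= start <= last <= old->last.
 */
static size_t sketch_split_apply(const struct sketch_attr_range *old,
				 uint64_t start, uint64_t last,
				 uint32_t new_attr,
				 struct sketch_attr_range out[3])
{
	size_t n = 0;

	if (start > old->start)	/* head keeps the old attribute */
		out[n++] = (struct sketch_attr_range){ old->start, start - 1, old->attr };

	out[n++] = (struct sketch_attr_range){ start, last, new_attr };

	if (last < old->last)	/* tail keeps the old attribute */
		out[n++] = (struct sketch_attr_range){ last + 1, old->last, old->attr };

	return n;
}
```

For example, setting a new attribute on pages 10..19 of a range covering
0..99 yields three ranges: 0..9 and 20..99 with the old attribute, and
10..19 with the new one.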

Honglei Huang (12):
  drm/amdgpu: add SVM ioctl UAPI definitions
  drm/amdgpu: add SVM core header and VM integration
  drm/amdgpu: add SVM attribute subsystem types
  drm/amdgpu: implement SVM attribute tree and helper functions
  drm/amdgpu: implement SVM attribute set/get/clear operations
  drm/amdgpu: add SVM range types and work queue interface
  drm/amdgpu: implement SVM range GPU mapping core
  drm/amdgpu: implement SVM range notifier and GC helpers
  drm/amdgpu: add SVM notifier invalidate callback and checkpoint
  drm/amdgpu: implement SVM initialization and lifecycle
  drm/amdgpu: add SVM ioctl entry and fault handler module
  drm/amdgpu: integrate SVM into build system and VM fault path

 drivers/gpu/drm/amd/amdgpu/Kconfig            |  11 +
 drivers/gpu/drm/amd/amdgpu/Makefile           |  13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c       | 702 +++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h       | 199 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c  | 986 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h  | 174 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 382 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h |  39 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 749 +++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 166 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  20 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   4 +
 include/uapi/drm/amdgpu_drm.h                 | 106 ++
 14 files changed, 3552 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h

-- 
2.34.1
