From: Honglei Huang <[email protected]>
V7 of the SVM patch series for amdgpu based on the drm_gpusvm framework.
This revision rebases to the latest amdgpu driver, fixes read-only VMA
handling in GPU mapping, simplifies attribute change trigger logic for
XNACK on mode, and adapts the VM fault path to the latest amdgpu_vm code.
This patch series implements SVM support with the following design:
1. Attributes separated from physical page management.
2. GPU fault driven mapping (XNACK on).
3. MMU notifier invalidation.
4. Garbage collector workqueue.
Changes since V6:
- Handle -EPERM from drm_gpusvm_range_find_or_insert() by retrying
with read_only=true to respect RO VMA permissions.
- Reset read_only flag per iteration in the mapping loop.
- Simplify attr change trigger dispatch: remove XNACK off paths,
only PREFETCH drives mapping in XNACK on mode.
- Change Kconfig default from y to n.
- Adapt VM fault dispatch to latest amdgpu_vm.c with exclusive
SVM/KFD path selection.
- Fix coding style: line length, block comment alignment, spacing.
- Remove inline comments from ATTR_BIT defines.
- Use max_t/min_t for mixed-type granularity alignment in fault
handler.
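
For readers new to the RO VMA handling, the -EPERM retry and the per-iteration
read_only reset listed above can be sketched in userspace C. This is only an
illustration under stated assumptions: struct mock_vma, mock_find_or_insert()
and map_all() are hypothetical stand-ins and not code from the series; the only
behaviour taken from this letter is that a writable request on a read-only VMA
fails with -EPERM and is retried with read_only=true.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: only its writability matters here. */
struct mock_vma {
	bool writable;
};

/*
 * Hypothetical stand-in for drm_gpusvm_range_find_or_insert(): a writable
 * request against a read-only VMA is refused with -EPERM.
 */
static int mock_find_or_insert(const struct mock_vma *vma, bool read_only)
{
	if (!vma->writable && !read_only)
		return -EPERM;
	return 0;
}

/*
 * Map n VMAs in a loop. mapped_ro[i] records whether VMA i ended up mapped
 * read-only. Returns 0 on success or the first error that is not recovered.
 */
static int map_all(const struct mock_vma *vmas, size_t n, bool *mapped_ro)
{
	for (size_t i = 0; i < n; i++) {
		/* Reset per iteration so one RO VMA does not taint the next. */
		bool read_only = false;
		int ret = mock_find_or_insert(&vmas[i], read_only);

		if (ret == -EPERM) {
			/* Respect the RO VMA: retry once as read-only. */
			read_only = true;
			ret = mock_find_or_insert(&vmas[i], read_only);
		}
		if (ret)
			return ret;
		mapped_ro[i] = read_only;
	}
	return 0;
}
```

Declaring read_only inside the loop body is what keeps a writable VMA that
follows a read-only one from being mapped read-only by accident.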
Changes since V5:
- Get PTE flags per DMA segment, stop caching on svm_range.
- attr_pte_flags() now takes enum drm_interconnect_protocol.
- Add AMDGPU_INTERCONNECT_VRAM / _P2P tags in amdgpu_svm.h.
- Rework zap_ptes to take explicit page range.
- Add amdgpu_svm_range_evict(): devmem-aware wrapper around
drm_gpusvm_range_evict.
- get_pages(): fall back to evict on -EOPNOTSUPP.
- Simplify cover letter design content.
Changes since V4:
- Preserve attributes on unmap; GC only removes GPU ranges.
- struct amdgpu_vm now holds a pointer to struct amdgpu_svm.
- UAPI: Remove AMDGPU_SVM_ATTR_GPU_ALWAYS_MAPPED.
- UAPI: Add AMDGPU_SVM_OP_RESET_ATTR to reset attributes to defaults.
- Add amdgpu_svm_lock()/unlock()/assert_locked() inline wrappers.
- Add amdgpu_svm_attr_check_vm_bo() for SVM/BO overlap detection.
- Refactor attr change model.
- Remove ATTR_ONLY/RANGE_SPLIT triggers.
Changes since V3:
- UAPI: Merge ACCESS/ACCESS_IN_PLACE/NO_ACCESS into a single
AMDGPU_SVM_ATTR_ACCESS attribute with enum amdgpu_ioctl_svm_access
(INACCESSIBLE/IN_PLACE/ALLOW_MIGRATE).
- UAPI: Replaced SET_FLAGS/CLR_FLAGS with per-flag boolean attribute
types: HOST_ACCESS, COHERENT, EXT_COHERENT, HIVE_LOCAL, GPU_RO,
GPU_EXEC, GPU_READ_MOSTLY, GPU_ALWAYS_MAPPED.
- UAPI: Replaced all #define constants with C enums.
- Add UAPI documentation with kerneldoc comments in amdgpu_drm.h.
- Moved flag bits from UAPI AMDGPU_SVM_FLAG_* to internal
AMDGPU_SVM_ATTR_BIT_* bitmap. Added attr_flag_type_to_bit() helper.
- Removed AMDGPU_SVM_VALID_FLAG_MASK and flags_or.
- Condensed commit messages and cover letter content.
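
The move from UAPI flag masks to per-flag boolean attribute types backed by an
internal bitmap can be sketched as follows. This is a minimal illustration, not
the code in the series: every ex_* name and enum value is hypothetical, and the
real attr_flag_type_to_bit() may use a different signature and error
convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical UAPI-style attribute types; ACCESS is not a boolean flag. */
enum ex_attr_type {
	EX_ATTR_ACCESS,
	EX_ATTR_HOST_ACCESS,
	EX_ATTR_COHERENT,
	EX_ATTR_GPU_RO,
};

/* Hypothetical internal bit positions, decoupled from the UAPI values. */
enum ex_attr_bit {
	EX_ATTR_BIT_HOST_ACCESS,
	EX_ATTR_BIT_COHERENT,
	EX_ATTR_BIT_GPU_RO,
};

/* Translate a boolean attribute type to its internal bit, or -1 if none. */
static int ex_flag_type_to_bit(enum ex_attr_type type)
{
	switch (type) {
	case EX_ATTR_HOST_ACCESS:
		return EX_ATTR_BIT_HOST_ACCESS;
	case EX_ATTR_COHERENT:
		return EX_ATTR_BIT_COHERENT;
	case EX_ATTR_GPU_RO:
		return EX_ATTR_BIT_GPU_RO;
	default:
		return -1;
	}
}

/* Set or clear one flag in the bitmap; returns false for non-flag types. */
static bool ex_apply_flag(uint32_t *bitmap, enum ex_attr_type type, bool value)
{
	int bit = ex_flag_type_to_bit(type);

	if (bit < 0)
		return false;
	if (value)
		*bitmap |= UINT32_C(1) << bit;
	else
		*bitmap &= ~(UINT32_C(1) << bit);
	return true;
}
```

With one boolean attribute per flag, userspace no longer needs separate set and
clear masks, and the internal bit layout can change without touching the UAPI.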
Changes since V2:
- Add version title in commit message.
- Fix some content mistakes.
Changes since V1:
- Added GPU fault handler (amdgpu_svm_handle_fault) with PASID-based
SVM lookup: GC -> find_or_insert -> get_pages -> GPU map.
- Removed restore worker queue; GPU fault recreates ranges on demand.
GC simplified to discard-only, no rebuild/restore logic.
- Reworked MMU notifier to two-phase model: event_begin() zaps PTEs
and flushes TLB, event_end() unmaps DMA and queues UNMAP to GC.
Removed begin_restore/end_restore and NOTIFIER flag dispatch.
- Added invalidate_interval() for attribute boundary realignment
when sub-region attribute changes cross existing GPU ranges.
- On MMU_NOTIFY_UNMAP, discard all affected ranges entirely;
attribute layer preserves valid attrs, fault recreates on demand.
- Added unregistered address attribute derivation from VMA properties
for ROCm compatibility (kfd/rocr/hip tests).
- Dropped XNACK off support; returns -EOPNOTSUPP when disabled.
Removed kgd2kfd_quiesce_mm()/resume_mm() dependency.
- Added TRIGGER_RANGE_SPLIT, TRIGGER_PREFETCH change triggers.
- Added helpers: find_locked, get_bounds_locked, set_default.
TODO:
- Add multi GPU support.
- Add XNACK off mode.
Related work:
- SVM migration and prefetch support is being developed in:
https://lore.kernel.org/amd-gfx/[email protected]/
- ROCm UMD interface adaptation for the new drm SVM API is being developed in:
https://github.com/ROCm/rocm-systems/pull/4364
Test results:
Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
- KFD test: 95%+ passed.
- ROCR test: all passed.
- HIP catch test: 96% passed on gfx943 (MI300X), 99% passed on gfx906 (MI60).
Patch overview:
01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl with SET/GET/RESET_ATTR ops,
access modes, location values and attribute types.
02/12 Core header: amdgpu_svm wrapping drm_gpusvm with attr_tree,
GC, locks and VM integration hooks.
03/12 Attribute types: attrs, attr_range, attr_tree, internal flag
bitmap, change triggers and inline helpers.
04/12 Attribute tree ops: interval tree lookup/insert/remove,
find_locked, get_bounds_locked, set_default.
05/12 Attribute set/get/clear/reset: validate, apply with head/tail
splitting, BO overlap check, change propagation.
06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range with
gpu_mapped state, pending ops and interconnect tags.
07/12 Range GPU mapping: per-segment, protocol-aware PTE flags;
zap_ptes on explicit page window; get_pages evict fallback;
coalesced DMA segments programmed under notifier lock;
-EPERM retry with read_only for RO VMA enforcement.
08/12 Notifier and GC helpers: two-phase events, range removal,
GC enqueue/dequeue, invalidate_interval with devmem eviction
to preserve VRAM data on boundary crossing.
09/12 Notifier invalidate callback: dispatch with TLB flush
batching and checkpoint timestamp.
10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init,
XNACK detection, GC init, PASID lookup, TLB flush and
init/close/fini.
11/12 Ioctl entry and fault handler: op dispatcher, GC worker and
fault path (attrs + read_only, BO overlap narrowing,
unregistered attr derivation, retry logic).
12/12 Build integration: Kconfig, Makefile, ioctl registration and
amdgpu_vm fault dispatch.
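
As a reading aid for the head/tail splitting mentioned in 05/12, the sketch
below shows the idea on a single range. It is only an illustration under
stated assumptions: struct ex_range and ex_split_apply() are hypothetical, the
bounds are assumed to be inclusive (start, last) page indices as is common for
kernel interval trees, and the attribute payload is reduced to one int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute range with inclusive bounds. */
struct ex_range {
	unsigned long start;
	unsigned long last;
	int attr;
};

/*
 * Apply new_attr to the part of 'old' covered by [start, last]. The request
 * must overlap 'old'. Writes up to three pieces to out[] in address order
 * (head with the old attr, middle with the new attr, tail with the old attr)
 * and returns how many were written.
 */
static size_t ex_split_apply(struct ex_range old, unsigned long start,
			     unsigned long last, int new_attr,
			     struct ex_range out[3])
{
	/* Clamp the request to the existing range. */
	unsigned long s = start > old.start ? start : old.start;
	unsigned long l = last < old.last ? last : old.last;
	size_t n = 0;

	assert(s <= l);

	if (old.start < s)	/* head keeps the old attribute */
		out[n++] = (struct ex_range){ old.start, s - 1, old.attr };

	out[n++] = (struct ex_range){ s, l, new_attr };

	if (l < old.last)	/* tail keeps the old attribute */
		out[n++] = (struct ex_range){ l + 1, old.last, old.attr };

	return n;
}
```

A request that covers the whole range yields a single piece, so no split is
performed when none is needed.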
Honglei Huang (12):
drm/amdgpu: add SVM ioctl UAPI definitions
drm/amdgpu: add SVM core header and VM integration
drm/amdgpu: add SVM attribute subsystem types
drm/amdgpu: implement SVM attribute tree and helper functions
drm/amdgpu: implement SVM attribute set/get/clear operations
drm/amdgpu: add SVM range types and work queue interface
drm/amdgpu: implement SVM range GPU mapping core
drm/amdgpu: implement SVM range notifier and GC helpers
drm/amdgpu: add SVM notifier invalidate callback and checkpoint
drm/amdgpu: implement SVM initialization and lifecycle
drm/amdgpu: add SVM ioctl entry and fault handler module
drm/amdgpu: integrate SVM into build system and VM fault path
 drivers/gpu/drm/amd/amdgpu/Kconfig            |  10 +
 drivers/gpu/drm/amd/amdgpu/Makefile           |  13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c       | 682 ++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h       | 199 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c  | 988 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h  | 174 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 386 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h |  39 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 774 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 166 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  23 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   4 +
 include/uapi/drm/amdgpu_drm.h                 | 107 ++
 14 files changed, 3565 insertions(+), 2 deletions(-)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h
--
2.34.1