From: Honglei Huang <[email protected]>
V5 of the SVM patch series for amdgpu based on the drm_gpusvm framework.
This revision addresses feedback from the review meeting: keep
attributes on unmap, integrate amdgpu SVM into the amdgpu VM structure,
drop GPU_ALWAYS_MAPPED, add a RESET_ATTR operation, and add lock wrappers.
This patch series implements SVM support with the following design:
1. Attributes separated from physical page management:
- Attribute layer (amdgpu_svm_attr_tree): a driver-side interval
tree storing per-range SVM attributes. Managed through SET_ATTR
ioctl and preserved across range lifecycle events.
- Physical page layer (drm_gpusvm ranges): managed by the
drm_gpusvm framework, representing HMM-backed DMA mappings
and GPU page table entries.
2. GPU fault driven mapping (XNACK on):
The core mapping path is driven by GPU page faults instead of ioctls.
amdgpu_svm_handle_fault() looks up SVM by PASID, runs GC,
resolves attributes, then maps via find_or_insert -> get_pages
-> GPU PTE update. For unregistered addresses, default
attributes are derived from VMA properties automatically.
3. MMU notifier invalidation:
Two-phase callback: event_begin() zaps GPU PTEs and flushes
TLB, event_end() unmaps DMA pages. UNMAP events queue ranges
to GC for deferred cleanup. Non-UNMAP events (eviction) rely
on GPU fault to remap.
4. Garbage collector:
GC workqueue processes unmapped ranges: removes them
from drm_gpusvm. Attributes are preserved across unmap
and persist until explicitly reset by userspace via
RESET_ATTR. There is no rebuild or restore logic; the GPU
fault handler recreates ranges on demand.
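
The interaction between points 2 to 4 can be illustrated with a small
user-space model. This is only a sketch: every sketch_* name is
hypothetical, a single range stands in for the drm_gpusvm range set, and
no real locking, TLB flush or DMA work is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the amdgpu or drm_gpusvm types. */
enum sketch_event { SKETCH_EVENT_UNMAP, SKETCH_EVENT_EVICT };

struct sketch_range {
	bool present;    /* range exists in the physical page layer */
	bool pte_valid;  /* GPU PTEs point at the pages */
	bool dma_mapped; /* pages are DMA mapped */
	bool gc_queued;  /* waiting for the garbage collector */
};

struct sketch_svm {
	bool attr_set;             /* attribute layer state */
	struct sketch_range range; /* physical page layer, one range only */
};

/* Phase 1: stop GPU access before the CPU mapping changes. */
static void sketch_event_begin(struct sketch_svm *svm)
{
	svm->range.pte_valid = false; /* zap PTEs; a TLB flush would follow */
}

/* Phase 2: drop DMA mappings; only UNMAP hands the range to the GC. */
static void sketch_event_end(struct sketch_svm *svm, enum sketch_event ev)
{
	assert(!svm->range.pte_valid); /* phase 1 must have run first */
	svm->range.dma_mapped = false;
	if (ev == SKETCH_EVENT_UNMAP)
		svm->range.gc_queued = true;
}

/* GC: discard queued ranges; attributes are deliberately untouched. */
static void sketch_gc(struct sketch_svm *svm)
{
	if (svm->range.gc_queued) {
		svm->range.present = false;
		svm->range.gc_queued = false;
	}
}

/* Fault: run GC first, then (re)create the range and map it. */
static void sketch_fault(struct sketch_svm *svm)
{
	sketch_gc(svm);
	svm->range.present = true;
	svm->range.dma_mapped = true;
	svm->range.pte_valid = true;
}

/* Only an explicit reset from userspace clears the attributes. */
static void sketch_reset_attr(struct sketch_svm *svm)
{
	svm->attr_set = false;
}
```

In this model an UNMAP followed by a fault ends with a freshly mapped
range and attr_set still true, while an eviction leaves the range in
place and only the mappings are rebuilt by the next fault.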
Changes since V4:
- Preserve attributes on unmap; GC only removes GPU ranges.
- struct amdgpu_vm now holds a pointer to struct amdgpu_svm.
- UAPI: Remove AMDGPU_SVM_ATTR_GPU_ALWAYS_MAPPED.
- UAPI: Add AMDGPU_SVM_OP_RESET_ATTR to reset attributes to defaults.
- Add amdgpu_svm_lock()/unlock()/assert_locked() inline wrappers.
- Add amdgpu_svm_attr_check_vm_bo() for SVM/BO overlap detection.
- Refactor attr change model.
- Remove ATTR_ONLY/RANGE_SPLIT triggers.
Changes since V3:
- UAPI: Merge ACCESS/ACCESS_IN_PLACE/NO_ACCESS into a single
AMDGPU_SVM_ATTR_ACCESS attribute with enum amdgpu_ioctl_svm_access
(INACCESSIBLE/IN_PLACE/ALLOW_MIGRATE).
- UAPI: Replaced SET_FLAGS/CLR_FLAGS with per-flag boolean attribute
types: HOST_ACCESS, COHERENT, EXT_COHERENT, HIVE_LOCAL, GPU_RO,
GPU_EXEC, GPU_READ_MOSTLY, GPU_ALWAYS_MAPPED.
- UAPI: Replaced all #define constants with C enums.
- Add UAPI documentation with kerneldoc comments in amdgpu_drm.h.
- Moved flag bits from UAPI AMDGPU_SVM_FLAG_* to internal
AMDGPU_SVM_ATTR_BIT_* bitmap. Added attr_flag_type_to_bit() helper.
- Removed AMDGPU_SVM_VALID_FLAG_MASK and flags_or.
- Condensed commit messages and cover letter content.
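
As an illustration of the per-flag attribute model described above, the
following user-space sketch maps boolean attribute types to bits of an
internal bitmap and handles the merged access attribute separately. All
sketch_* identifiers and the bit layout are assumptions made for this
example; the real enumerators live in amdgpu_drm.h and amdgpu_svm_attr.h
and may differ. GPU_ALWAYS_MAPPED is left out because V5 removes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types; the order defines the bit layout here. */
enum sketch_attr_type {
	SKETCH_ATTR_ACCESS, /* carries an access mode, not a flag */
	SKETCH_ATTR_HOST_ACCESS,
	SKETCH_ATTR_COHERENT,
	SKETCH_ATTR_EXT_COHERENT,
	SKETCH_ATTR_HIVE_LOCAL,
	SKETCH_ATTR_GPU_RO,
	SKETCH_ATTR_GPU_EXEC,
	SKETCH_ATTR_GPU_READ_MOSTLY,
};

enum sketch_access {
	SKETCH_INACCESSIBLE,
	SKETCH_IN_PLACE,
	SKETCH_ALLOW_MIGRATE,
};

struct sketch_attrs {
	enum sketch_access access;
	uint32_t flags; /* internal bitmap, one bit per boolean attribute */
};

/* Return the bit index for a boolean attribute type, or -1 if none. */
static int sketch_flag_type_to_bit(enum sketch_attr_type type)
{
	if (type >= SKETCH_ATTR_HOST_ACCESS &&
	    type <= SKETCH_ATTR_GPU_READ_MOSTLY)
		return (int)type - (int)SKETCH_ATTR_HOST_ACCESS;
	return -1;
}

/* Apply one (type, value) pair; 0 on success, -1 on invalid input. */
static int sketch_apply_attr(struct sketch_attrs *a,
			     enum sketch_attr_type type, uint32_t value)
{
	int bit;

	assert(a != NULL);
	if (type == SKETCH_ATTR_ACCESS) {
		if (value > SKETCH_ALLOW_MIGRATE)
			return -1;
		a->access = (enum sketch_access)value;
		return 0;
	}

	bit = sketch_flag_type_to_bit(type);
	if (bit < 0 || value > 1)
		return -1;
	if (value)
		a->flags |= UINT32_C(1) << bit;
	else
		a->flags &= ~(UINT32_C(1) << bit);
	return 0;
}
```

With one attribute type per flag, a single entry can set or clear exactly
one bit, which is why the sketch needs no separate set and clear
operations and no valid-flag mask.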
Changes since V2:
- Add version title to commit messages.
- Fix some content mistakes.
Changes since V1:
- Added GPU fault handler (amdgpu_svm_handle_fault) with PASID-based
SVM lookup: GC -> find_or_insert -> get_pages -> GPU map.
- Removed restore worker queue; GPU fault recreates ranges on demand.
GC simplified to discard-only, no rebuild/restore logic.
- Reworked MMU notifier to two-phase model: event_begin() zaps PTEs
and flushes TLB, event_end() unmaps DMA and queues UNMAP to GC.
Removed begin_restore/end_restore and NOTIFIER flag dispatch.
- Added invalidate_interval() for attribute boundary realignment
when sub-region attribute changes cross existing GPU ranges.
- On MMU_NOTIFY_UNMAP, discard all affected ranges entirely;
attribute layer preserves valid attrs, fault recreates on demand.
- Added unregistered address attribute derivation from VMA properties
for ROCm compatibility (kfd/rocr/hip tests).
- Dropped XNACK off support; returns -EOPNOTSUPP when disabled.
Removed kgd2kfd_quiesce_mm()/resume_mm() dependency.
- Added TRIGGER_RANGE_SPLIT, TRIGGER_PREFETCH change triggers.
- Added helpers: find_locked, get_bounds_locked, set_default.
TODO:
- Add multi GPU support.
- Add XNACK off mode.
Related work:
- SVM migration and prefetch support is being developed in:
https://lore.kernel.org/amd-gfx/[email protected]/
- ROCm UMD interface adaptation for the new drm SVM API is being developed in:
https://github.com/ROCm/rocm-systems/pull/4364
Test results:
Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
- KFD test: 95%+ passed.
- ROCR test: all passed.
- HIP catch test: gfx943 (MI300X): 96% passed.
gfx906 (MI60): 99% passed.
Patch overview:
01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl, enum-based SVM operations
(SET_ATTR/GET_ATTR/RESET_ATTR), access modes, location values,
attribute types with kerneldoc in amdgpu_drm.h.
02/12 Core header: amdgpu_svm wrapping drm_gpusvm with refcount,
attr_tree, GC struct, locks, VM integration hooks, and
lock/unlock/assert_locked inline helpers.
03/12 Attribute types: amdgpu_svm_attrs, attr_range (interval tree
node), attr_tree, internal ATTR_BIT flag bitmap, change
triggers, inline helpers (attr_start/end/has_access).
04/12 Attribute tree ops: interval tree lookup, insert, remove,
find_locked, get_bounds_locked, set_default, and lifecycle.
05/12 Attribute set/get/clear/reset: validate UAPI attributes,
apply to tree with head/tail splitting, BO overlap check,
change propagation via apply_attr_change, and query.
06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range
with gpu_mapped state, pending ops, work queue linkage,
and op_ctx for batch processing.
07/12 Range GPU mapping: PTE flags computation, GPU page table
update, range mapping loop, map_attrs public API.
08/12 Notifier and GC helpers: two-phase notifier events, range
removal, GC enqueue/add, dequeue helpers, invalidate_interval.
09/12 Notifier invalidate callback: drm_gpusvm_ops.invalidate
dispatch with TLB flush batching, checkpoint timestamp.
10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init
with chunk sizes (2M/64K/4K), XNACK detection, GC init,
PASID lookup, TLB flush, attr_change_trigger computation,
centralized apply_attr_change, and init/close/fini lifecycle.
11/12 Ioctl entry and fault handler: ioctl dispatcher
(op_set_attr/op_get_attr/op_reset_attr), GC worker, and
amdgpu_svm_fault.c with full fault path including BO overlap
narrowing, unregistered attribute derivation, and retry logic.
12/12 Build integration: Kconfig (CONFIG_DRM_AMDGPU_SVM), Makefile
rules, ioctl registration, and amdgpu_vm fault dispatch.
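
To make the chunk sizes in 10/12 concrete, here is a simplified
user-space sketch of how a faulting address could be turned into a range:
try the largest chunk first and fall back to smaller ones until the
aligned chunk fits inside the allowed bounds. The function name and the
single [lo, hi) bound are assumptions for this example; the actual
selection is done inside drm_gpusvm and takes more constraints into
account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk sizes from the patch overview, largest first: 2M, 64K, 4K. */
static const uint64_t sketch_chunks[] = { 2u << 20, 64u << 10, 4u << 10 };

/*
 * Pick the largest aligned chunk that contains addr and lies fully
 * inside [lo, hi). Returns 0 and fills start/end, or -1 if even a 4K
 * chunk does not fit. Address wraparound is not handled in this sketch.
 */
static int sketch_pick_chunk(uint64_t addr, uint64_t lo, uint64_t hi,
			     uint64_t *start, uint64_t *end)
{
	size_t i;

	assert(lo <= addr && addr < hi);
	for (i = 0; i < sizeof(sketch_chunks) / sizeof(sketch_chunks[0]); i++) {
		uint64_t size = sketch_chunks[i];
		uint64_t s = addr & ~(size - 1); /* sizes are powers of two */
		uint64_t e = s + size;

		if (s >= lo && e <= hi) {
			*start = s;
			*end = e;
			return 0;
		}
	}
	return -1;
}
```

For example, a fault at 0x234567 inside bounds [0x201000, 0x400000)
cannot use a 2M chunk because the aligned start 0x200000 lies below the
lower bound, so the sketch falls back to the 64K chunk
[0x230000, 0x240000).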
Honglei Huang (12):
drm/amdgpu: add SVM ioctl UAPI definitions
drm/amdgpu: add SVM core header and VM integration
drm/amdgpu: add SVM attribute subsystem types
drm/amdgpu: implement SVM attribute tree and helper functions
drm/amdgpu: implement SVM attribute set/get/clear operations
drm/amdgpu: add SVM range types and work queue interface
drm/amdgpu: implement SVM range GPU mapping core
drm/amdgpu: implement SVM range notifier and GC helpers
drm/amdgpu: add SVM notifier invalidate callback and checkpoint
drm/amdgpu: implement SVM initialization and lifecycle
drm/amdgpu: add SVM ioctl entry and fault handler module
drm/amdgpu: integrate SVM into build system and VM fault path
drivers/gpu/drm/amd/amdgpu/Kconfig | 11 +
drivers/gpu/drm/amd/amdgpu/Makefile | 13 +
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c | 585 +++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h | 183 ++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c | 986 ++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h | 174 ++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 386 +++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h | 39 +
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 787 ++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 148 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 20 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 4 +
include/uapi/drm/amdgpu_drm.h | 106 ++
14 files changed, 3443 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h
--
2.34.1