Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues: - [High] Error handling in `svm_range_set_attr` aborts the transaction mid-flight without rollback, leaving overlapping ranges in the SVM interval tree and leaking memory. - [High] `svm_range_add_notifier_locked` silently ignores failures when registering the MMU interval notifier, leading to WARN_ON_ONCE and interval tree corruption. - [High] Clearing the `KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED` attribute permanently leaks a THP block node and permanently disables THP for the virtual address range. - [High] Dynamically disabling XNACK via ioctl does not update SVM range notifier flags, leaving THP unblocked and risking GPU page faults. Pre-existing issues: - [High] If a user maps SVM memory at virtual address 0, its MMU interval notifier is never removed, causing a Use-After-Free. -- commit f4df270ac94ba46190c602a81dedff1aa60cc86a Author: Yitao Jiang <[email protected]> drm/amdkfd: block THP for non-replayable SVM ranges KFD SVM ranges on processes without XNACK, and ranges requested as GPU_ALWAYS_MAPPED, cannot rely on replayable GPU faults after a CPU THP remap of the registered VA range. Keep those ranges backed by base pages while their interval notifier is active. Opt those SVM interval notifiers into MMU_INTERVAL_NOTIFIER_BLOCK_THP and update the flag when SVM attributes change. XNACK-enabled ranges that can handle remaps through replayable faults remain eligible for THP unless GPU_ALWAYS_MAPPED is requested. > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > index 3841943da5ece..0d0feba7b7852 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c [ ... ] > @@ -81,6 +82,26 @@ static const struct mmu_interval_notifier_ops > svm_range_mn_ops = { > .invalidate = svm_range_cpu_invalidate_pagetables, > }; > > +static unsigned int > +svm_range_mn_flags(struct svm_range *prange) > +{ > + struct kfd_process *p = container_of(prange->svms, struct kfd_process, > + svms); > + > + if (!p->xnack_enabled || > + (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) > + return MMU_INTERVAL_NOTIFIER_BLOCK_THP; > + > + return 0; > +} [Severity: High] When a process dynamically disables XNACK using the KFD_IOC_SMI_EVENTS ioctl, does this leave THP unblocked on existing ranges? In svm_range_switch_xnack_reserve_mem(), p->xnack_enabled is updated dynamically: p->xnack_enabled = xnack_enabled; However, it does not seem to update the interval notifier flags for existing SVM ranges. Because the existing ranges remain with THP unblocked, could khugepaged collapse THPs and generate unrecoverable GPU page faults since XNACK is disabled? > + > +static int > +svm_range_update_mn_flags_locked(struct svm_range *prange) > +{ > + return mmu_interval_notifier_set_flags_locked(&prange->notifier, > + > svm_range_mn_flags(prange)); > +} [Severity: High] If userspace clears the KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag or XNACK is enabled, does this permanently leak a THP block node? When svm_range_update_mn_flags_locked() calls mmu_interval_notifier_set_flags_locked() with flags = 0, the interval_sub->flags are zeroed out locally. However, since there is no teardown path to call mmu_interval_notifier_unblock_thp_locked(), the THP block node remains. When the SVM range is eventually freed, mmu_interval_notifier_remove() will also skip the unblock path because the flag was cleared, resulting in a permanent memory leak and permanently disabling THP for the virtual address range. [ ... ] > @@ -112,10 +133,11 @@ svm_range_add_notifier_locked(struct mm_struct *mm, > struct svm_range *prange) > pr_debug("svms 0x%p prange 0x%p [0x%lx 0x%lx]\n", prange->svms, > prange, prange->start, prange->last); > > - mmu_interval_notifier_insert_locked(&prange->notifier, mm, > - prange->start << PAGE_SHIFT, > - prange->npages << PAGE_SHIFT, > - &svm_range_mn_ops); > + mmu_interval_notifier_insert_locked_flags(&prange->notifier, mm, > + prange->start << PAGE_SHIFT, > + prange->npages << PAGE_SHIFT, > + &svm_range_mn_ops, > + svm_range_mn_flags(prange)); > } [Severity: High] Does this silently ignore failures during interval notifier registration? If mmu_interval_notifier_insert_locked_flags() fails, it returns an error code and prange->notifier.mm remains NULL. Because the return value is discarded here, subsequent code in svm_range_set_attr() will call svm_range_update_mn_flags_locked() on the same transaction. This passes the uninitialized notifier to mmu_interval_notifier_set_flags_locked(), which will hit a WARN_ON_ONCE(!mm), return -EINVAL, and cause the transaction to abort, potentially corrupting the interval tree. [ ... ] > @@ -3763,6 +3785,12 @@ svm_range_set_attr(struct kfd_process *p, struct > mm_struct *mm, > } > list_for_each_entry(prange, &update_list, update_list) { > svm_range_apply_attrs(p, prange, nattr, attrs, &update_mapping); > + r = svm_range_update_mn_flags_locked(prange); > + if (r) { > + mutex_unlock(&svms->lock); > + mmap_write_unlock(mm); > + goto out; > + } [Severity: High] Does aborting the transaction here leak memory and corrupt the interval tree? If svm_range_update_mn_flags_locked() fails, the goto out path completely bypasses the remove_list processing located further down in the function. This leaves old cloned ranges permanently in the interval tree alongside their newly inserted replacements that were added during the insert_list loop. Additionally, because remove_list is a stack variable, skipping its teardown leaks the ranges and leaves dangling pointers in the update_list links. > /* TODO: unmap ranges from GPU that lost access */ > } > update_mapping |= !p->xnack_enabled && !list_empty(&remap_list); [Severity: High] This is a pre-existing issue, but does svm_range_remove_notifier() fail to remove the notifier if the SVM range starts at virtual address 0? In drivers/gpu/drm/amd/amdkfd/kfd_svm.c:svm_range_remove_notifier(): if (prange->notifier.interval_tree.start != 0 && prange->notifier.interval_tree.last != 0) mmu_interval_notifier_remove(&prange->notifier); If an SVM range is mapped starting at virtual address 0 (which is permitted if vm.mmap_min_addr is 0 or if the user has CAP_SYS_RAWIO), the check evaluates to false and incorrectly skips mmu_interval_notifier_remove(). The prange is subsequently freed, but its embedded notifier structure remains linked in the mm->notifier_subscriptions->itree, which will cause a use-after-free the next time the tree is modified. -- Sashiko AI review · https://sashiko.dev/#/patchset/sy1pr01mb10596eb75463208a8e1ebba0fc0...@sy1pr01mb10596.ausprd01.prod.outlook.com?part=3
