From: Kent Russell <[email protected]>

[ Upstream commit 0ed704d058cec7643a716a21888d58c7d03f2c3e ]

HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries.

If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyway, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.

Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---

LLM Generated explanations, may be completely bogus:

YES

Explanation
- Bug fixed: The change addresses an indefinite retry loop (apparent
  hang) in HMM-backed SVM mapping when a VMA lacks read permission,
  i.e. write-only mappings and PROT_NONE. The loop arises because HMM
  assumes READ by default and the existing code in
  svm_range_validate_and_map requests READ, then WRITE. That conflicts
  with mappings that lack READ and causes svm_range_restore_work to
  retry silently forever (a minimal sketch of the flag composition
  follows this list).
- Core change: In drivers/gpu/drm/amd/amdkfd/kfd_svm.c inside
  svm_range_validate_and_map, after resolving the VMA and before calling
  amdgpu_hmm_range_get_pages, the patch adds a guard:
  - Check: if (!(vma->vm_flags & VM_READ)) { … continue; }
  - Behavior on no-READ: acquire the range lock, emit a pr_debug
    message when VM_WRITE is set without VM_READ, compute the
    intersection of the current address range with prange, call
    svm_range_unmap_from_gpus(prange, s, e,
    KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU), unlock, advance addr to
    next, and continue.
  - This explicitly treats PROT_NONE and write-only VMAs as unmappable
    for HMM/SVM and avoids mapping attempts that will never succeed.
- Containment: The change is localized to a single function and code
  path used during SVM range validation/mapping. No APIs or data
  structures are changed. It only affects the slow path when
  encountering a VMA without VM_READ; normal mappings (with READ) follow
  the existing flow unchanged.
- User impact: Prevents a hang-like condition (endless retry) that users
  would experience during SVM range restoration/validation when a VMA
  has PROT_NONE or write-only protection. This is a real, user-facing
  bug that can stall workloads using KFD/HMM.
- Risk assessment:
  - Minimal risk: The unmap is a no-op if the pages were not mapped (“If
    they weren't mapped in the first place, this would be a no-op”), and
    otherwise it correctly tears down mappings that cannot be used by
    HMM anyway (x86 doesn’t support write-only, and PROT_NONE allows
    neither reads nor writes).
  - No architectural changes: No redesign or wide-reaching behavior
    changes outside of this corner case. Logging is via pr_debug, so no
    noisy kernel logs in production.
- Stable criteria:
  - Important bugfix affecting users: Yes (prevents indefinite
    retry/hang).
  - Small and contained: Yes (23 added lines, one function).
  - No new features: Correct.
  - No broad side effects: Correct; behavior is limited to VMAs lacking
    READ, which cannot be supported by HMM.
  - Explicit stable tag: Not shown in the snippet, but the fix clearly
    meets stable backport guidelines due to the hang avoidance and
    limited scope.
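
For reference, a minimal sketch of the flag composition the first two
bullets describe, modelled on amdgpu_hmm_range_get_pages (the helper
name is hypothetical and field names may differ between kernel
versions):

#include <linux/hmm.h>

/* Illustration only, not the upstream code: hmm_range_fault() is asked
 * for present (readable) pages via HMM_PFN_REQ_FAULT, plus writable
 * pages via HMM_PFN_REQ_WRITE when the VMA is not read-only. A
 * PROT_NONE or write-only VMA can never satisfy the READ request, so
 * without the guard this patch adds, the caller fails and retries
 * forever.
 */
static void sketch_set_hmm_request_flags(struct hmm_range *range,
					 bool readonly)
{
	range->default_flags = HMM_PFN_REQ_FAULT;	/* needs a readable page */
	if (!readonly)
		range->default_flags |= HMM_PFN_REQ_WRITE;	/* and writable */
}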

Notes for backporters
- Interface compatibility: The patch relies on existing primitives
  present in amdkfd SVM code paths: vma_lookup, svm_range_lock/unlock,
  svm_range_unmap_from_gpus, KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU, and
  prange fields (start/last). These are stable in recent kernels that
  have SVM/HMM in KFD.
- Unit consistency check: Ensure the parameters passed to
  svm_range_unmap_from_gpus(prange, s, e, …) are in the units expected
  by your target stable branch. In the diff shown, s = max(start,
  prange->start) and e = min(end, prange->last), where start/end appear
  to be byte addresses (start = map_start << PAGE_SHIFT) while
  prange->start/last are often page indices in KFD SVM code. Upstream
  presumably uses consistent units (all pages or all bytes); when
  backporting, verify that s/e match the function's expected units and
  adjust by PAGE_SHIFT if necessary to avoid off-by-PAGE_SHIFT mistakes
  (a hypothetical conversion sketch follows this list).
- Validation suggestion: Reproduce with a user VMA set to PROT_NONE or
  write-only protection and trigger SVM range validation (e.g., by
  causing GPU access). Before the fix, svm_range_restore_work retried
  continuously; after the fix, the range is unmapped and validation
  proceeds without looping (see the userspace sketch after this list).
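
As a concrete illustration of the unit check, a hypothetical conversion
for a branch where prange->start/last are page indices while start/end
are byte addresses (verify against your tree; this is not taken from
the patch):

/* Hypothetical: only needed if your branch mixes bytes and pages at
 * this point. end is treated as an exclusive byte address and
 * prange->last as an inclusive page index.
 */
unsigned long s = max(start >> PAGE_SHIFT, prange->start);
unsigned long e = min((end >> PAGE_SHIFT) - 1, prange->last);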
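
And a hedged userspace sketch of the reproduction step; the GPU-side
registration is runtime-specific and only indicated by a comment, and
nothing below is taken from the patch itself:

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

/* Create the two problematic VMA types: PROT_NONE and write-only.
 * mmap(PROT_WRITE) should yield a VMA with VM_WRITE but not VM_READ,
 * which is exactly what the new guard detects.
 */
int main(void)
{
	size_t len = 2UL * 1024 * 1024;
	void *none  = mmap(NULL, len, PROT_NONE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	void *wonly = mmap(NULL, len, PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (none == MAP_FAILED || wonly == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("PROT_NONE at %p, write-only at %p\n", none, wonly);
	/* Register these ranges with the GPU (e.g. via the ROCm SVM
	 * path) and trigger validation: before the fix the restore
	 * worker retries forever; after it, the ranges are unmapped and
	 * validation proceeds.
	 */
	pause();	/* keep the VMAs alive while the GPU test runs */
	return 0;
}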

Why this is safe and needed
- The patch turns an unrecoverable, retry-forever condition into a
  deterministic handling path by unmapping and moving on. It does not
  try to force unsupported permissions and does not alter behavior for
  the common case. It matches HMM’s requirement that mappings be at
  least readable and avoids futile retry cycles. This is precisely the
  kind of small, correctness-oriented fix that minimizes regression risk
  and improves robustness for users of amdkfd/HMM.

 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 3d8b20828c068..cecdbcea0bb90 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1714,6 +1714,29 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
 
                        next = min(vma->vm_end, end);
                        npages = (next - addr) >> PAGE_SHIFT;
+                       /* HMM requires at least READ permissions. If provided with PROT_NONE,
+                        * unmap the memory. If it's not already mapped, this is a no-op
+                        * If PROT_WRITE is provided without READ, warn first then unmap
+                        */
+                       if (!(vma->vm_flags & VM_READ)) {
+                               unsigned long e, s;
+
+                               svm_range_lock(prange);
+                               if (vma->vm_flags & VM_WRITE)
+                                       pr_debug("VM_WRITE without VM_READ is not supported");
+                               s = max(start, prange->start);
+                               e = min(end, prange->last);
+                               if (e >= s)
+                                       r = svm_range_unmap_from_gpus(prange, s, e,
+                                                      KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+                               svm_range_unlock(prange);
+                               /* If unmap returns non-zero, we'll bail on the next for loop
+                                * iteration, so just leave r and continue
+                                */
+                               addr = next;
+                               continue;
+                       }
+
                        WRITE_ONCE(p->svms.faulting_task, current);
                        r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
                                                       readonly, owner, NULL,
-- 
2.51.0
