On 2025-10-14 15:46, Felix Kuehling wrote:
On 2025-10-10 15:33, Philip Yang wrote:
If an active queue buffer is freed and kfd_lookup_process_by_mm returns
NULL, the process has exited and its mm is gone. In that case it is fine
to evict the queue and then free the queue buffer CPU mapping and memory
from do_exit.
In that case, kgd2kfd_quiesce_mm will also fail with -ESRCH. I'm
surprised you're getting here at all. I would have expected the queues
to be already stopped when the process is gone. But it seems that's
only done in the kfd_process_wq_release worker. So is there a time
window where the queues are still running, but the queue mappings are
destroyed and the queues can't be stopped because we can't look up the
process from mm any more?
Yes, we should only show the warning message if the process mm is still
alive when the queue buffer is freed and we evict the queues.
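
For illustration, the intended behaviour can be modelled in plain
userspace C. This is only a sketch under assumptions: model_process,
model_range, unmap_result and model_unmap_from_cpu are hypothetical
stand-ins, not KFD types or functions, and the model only records which
actions would be taken instead of performing them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real KFD types. */
struct model_process { int id; };

struct model_range {
    int queue_refcount;   /* > 0 when a user queue buffer lives in this range */
};

struct unmap_result {
    bool warned;          /* would the warning have been printed? */
    bool evict_attempted; /* was queue eviction attempted? */
};

/*
 * Models the decision at the top of the unmap path: eviction is always
 * attempted when the range backs a queue, but the warning is only
 * recorded while the owning process can still be looked up (p != NULL).
 */
static struct unmap_result
model_unmap_from_cpu(const struct model_process *p,
                     const struct model_range *range)
{
    struct unmap_result res = { false, false };

    assert(range != NULL);
    if (range->queue_refcount > 0) {
        res.warned = (p != NULL);
        res.evict_attempted = true;
    }
    return res;
}
```

With a NULL process (the do_exit case) and a non-zero queue_refcount, the
model reports evict_attempted but not warned.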
Maybe we need to stop the queues in kfd_process_notifier_release to be
safe. It should only need the DQM lock, which should be safe to take
in an MMU notifier context.
There is a race where the queue is still running when the SVM range is
unmapped on the CPU. I will add another patch in v2 to stop user queues
in the mm release notifier.
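
A rough userspace model of the window being discussed, and of how
stopping the queues in the release notifier would close it. All names
here (model_proc_state, model_notifier_release, model_in_race_window)
are hypothetical, and the model assumes the release notifier is the
point after which the process can no longer be looked up from its mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical process state; not a KFD structure. */
struct model_proc_state {
    bool lookup_works;   /* process can still be found from its mm */
    bool queues_running; /* user queues are still active */
};

/*
 * Models the mm release notifier. Assumption: after it runs, lookup by
 * mm fails. stop_queues_here selects the behaviour proposed for v2.
 */
static void model_notifier_release(struct model_proc_state *s,
                                   bool stop_queues_here)
{
    assert(s != NULL);
    if (stop_queues_here)
        s->queues_running = false;
    s->lookup_works = false;
}

/*
 * True when a queue buffer unmapped now would hit the problematic
 * window: queues still running, but the process can no longer be found
 * in order to stop them.
 */
static bool model_in_race_window(const struct model_proc_state *s)
{
    assert(s != NULL);
    return s->queues_running && !s->lookup_works;
}
```

In this model the window is open only when the notifier runs without
stopping the queues; stopping them there leaves nothing running by the
time the CPU mappings are torn down.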
Thanks.
Philip
Regards,
Felix
Only show warning message if process mm is still alive when queue
buffer is freed.
Fixes: b049504e211e ("drm/amdkfd: Validate user queue svm memory residency")
Signed-off-by: Philip Yang <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 48c9a211e415..9174f718482a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2487,17 +2487,26 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range *prange,
bool unmap_parent;
uint32_t i;
+ p = kfd_lookup_process_by_mm(mm);
+
if (atomic_read(&prange->queue_refcount)) {
int r;
- pr_warn("Freeing queue vital buffer 0x%lx, queue evicted\n",
- prange->start << PAGE_SHIFT);
+ /*
+ * Evict queue if queue buffer freed with warning message.
+ * If process is not found, this is free CPU mapping from
+ * do_exit, then it is fine to free queue buffer.
+ */
+ if (p) {
+ pr_warn("Freeing queue vital buffer 0x%lx, queue evicted\n",
+ prange->start << PAGE_SHIFT);
+ }
+
r = kgd2kfd_quiesce_mm(mm, KFD_QUEUE_EVICTION_TRIGGER_SVM);
if (r)
pr_debug("failed %d to quiesce KFD queues\n", r);
}
- p = kfd_lookup_process_by_mm(mm);
if (!p)
return;
svms = &p->svms;