Move kfd_process_dequeue_from_all_devices() from kfd_process_wq_release()
to the MMU notifier release callback. Dequeuing earlier ensures that the
GPU no longer accesses system memory, because the CPU frees the process
memory after the MMU release notifier callback returns.

Suggested-by: Felix Kuehling <[email protected]>
Signed-off-by: Philip Yang <[email protected]>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 849456ac498b..9080d23d22ae 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1162,7 +1162,6 @@ static void kfd_process_wq_release(struct work_struct *work)
 					     release_work);
 	struct dma_fence *ef;
 
-	kfd_process_dequeue_from_all_devices(p);
 	pqm_uninit(&p->pqm);
 
 	/*
@@ -1226,6 +1225,13 @@ static void kfd_process_notifier_release_internal(struct kfd_process *p)
 	cancel_delayed_work_sync(&p->eviction_work);
 	cancel_delayed_work_sync(&p->restore_work);
 
+	/*
+	 * Evict and remove user queues because exit_mmap free process memory,
+	 * it is not safe for GPU to access system memory after mmu release
+	 * notifier callback returns.
+	 */
+	kfd_process_dequeue_from_all_devices(p);
+
 	for (i = 0; i < p->n_pdds; i++) {
 		struct kfd_process_device *pdd = p->pdds[i];
 
-- 
2.49.0
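
For illustration only, the ordering argument in the commit message can be modelled in a few lines of userspace C. This is a toy sketch, not driver code: struct toy_process and every toy_* function are hypothetical names, and the model assumes only what the commit message states, namely that the CPU frees the process memory once the MMU release notifier callback has returned, while the deferred release work runs at some later point.

```c
#include <assert.h>	/* for callers that assert on the result */
#include <stdbool.h>

/* Toy stand-in for struct kfd_process; all names here are hypothetical. */
struct toy_process {
	bool queues_active;	/* user queues are still mapped on the GPU */
	bool memory_freed;	/* the CPU has freed the process memory */
	bool unsafe_access;	/* the GPU could have touched freed memory */
};

/* Models kfd_process_dequeue_from_all_devices(): no queue can run afterwards. */
static void toy_dequeue_all(struct toy_process *p)
{
	p->queues_active = false;
}

/* Models the window in which a still-active queue may access memory. */
static void toy_gpu_may_run(struct toy_process *p)
{
	if (p->queues_active && p->memory_freed)
		p->unsafe_access = true;
}

/*
 * Walks through teardown in the order described by the commit message and
 * reports whether the GPU ever had a chance to access freed memory.
 */
bool toy_teardown_is_safe(bool dequeue_in_notifier)
{
	struct toy_process p = { .queues_active = true };

	/* MMU notifier release callback */
	if (dequeue_in_notifier)
		toy_dequeue_all(&p);

	/* The callback has returned; the CPU now frees the process memory. */
	p.memory_freed = true;

	/* Window before the deferred release work gets to run. */
	toy_gpu_may_run(&p);

	/* Deferred release work: the old location of the dequeue call. */
	if (!dequeue_in_notifier)
		toy_dequeue_all(&p);

	return !p.unsafe_access;
}
```

In this model, toy_teardown_is_safe(true) corresponds to the ordering after this patch and toy_teardown_is_safe(false) to the ordering before it. Only the former closes the window between the memory being freed and the queues being removed.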
