On 04.07.25 12:33, Prike Liang wrote: > The userq VA unmap requires validating queue status before unamapping > it, if user tries to unmap a busy userq by GEM VA IOCTL then the > driver should report an error for this illegal usage.
Clear NAK to the whole approach. We should never deny unmapping a VA because it is used by an userqueue. This can cause a rat tail of problems in userspace. Instead we *must* suspend the userqueue when the VA is unmapped and deny resuming it. I think we can do that by adjusting the usage of the eviction fence for the BOs used by the user queue. Regards, Christian. > > Signed-off-by: Prike Liang <prike.li...@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 16 +++++++++++++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++++ > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > index 30838e5279bd..221292b6417a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > @@ -281,7 +281,7 @@ amdgpu_userq_map_helper(struct amdgpu_userq_mgr *uq_mgr, > return r; > } > > -static void > +static int > amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr, > struct amdgpu_usermode_queue *queue) > { > @@ -290,10 +290,14 @@ amdgpu_userq_wait_for_last_fence(struct > amdgpu_userq_mgr *uq_mgr, > > if (f && !dma_fence_is_signaled(f)) { > ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100)); > - if (ret <= 0) > + if (ret <= 0) { > drm_file_err(uq_mgr->file, "Timed out waiting for > fence=%llu:%llu\n", > f->context, f->seqno); > + return -ETIMEDOUT; > + } > } > + > + return 0; > } > > static void > @@ -509,7 +513,13 @@ amdgpu_userq_destroy(struct drm_file *filp, int queue_id) > mutex_unlock(&uq_mgr->userq_mutex); > return -EINVAL; > } > - amdgpu_userq_wait_for_last_fence(uq_mgr, queue); > + > + if (amdgpu_userq_wait_for_last_fence(uq_mgr, queue)) { > + drm_warn(adev_to_drm(uq_mgr->adev), "Don't destroy a busy > userq\n"); > + /* For the fence signal timeout case, it requires resetting the > busy queue.*/ > + r = -ETIMEDOUT; > + } > + > r = amdgpu_bo_reserve(queue->db_obj.obj, true); > if (!r) { > amdgpu_bo_unpin(queue->db_obj.obj); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index f042372d9f2e..c8cdd668a8f2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -1929,6 +1929,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev, > struct amdgpu_bo_va_mapping *mapping; > struct amdgpu_vm *vm = bo_va->base.vm; > bool valid = true; > + int r; > > saddr /= AMDGPU_GPU_PAGE_SIZE; > > @@ -1949,6 +1950,14 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev, > return -ENOENT; > } > > + /* It's unlikely to happen that the mapping userq hasn't been idled > + * during user requests GEM unmap IOCTL except for forcing the unmap > + * from user space. > + */ > + r = amdgpu_userq_gem_va_unmap_validate(vm, saddr); > + if (unlikely(r && r != -EBUSY)) > + dev_warn(adev->dev, "Here should be an improper unmap request > from user space\n"); > + > list_del(&mapping->list); > amdgpu_vm_it_remove(mapping, &vm->va); > mapping->bo_va = NULL;