On 3/9/26 03:22, Chenyuan Mi wrote:
> amdgpu_userq_wait_ioctl() accesses the wait queue object obtained
> from xa_load() without holding userq_mutex or taking a reference on
> the queue. A concurrent AMDGPU_USERQ_OP_FREE call can destroy and
> free the queue between the xa_load() and the subsequent
> xa_alloc(&waitq->fence_drv_xa, ...), resulting in a use-after-free.
>
> This is a regression introduced by commit 4b27406380b0
> ("drm/amdgpu: Add queue id support to the user queue wait IOCTL"),
> which removed the indirect fence_drv_xa_ptr model and its NULL
> check safety net from commit ed5fdc1fc282 ("drm/amdgpu: Fix the
> use-after-free issue in wait IOCTL") and replaced it with a direct
> waitq->fence_drv_xa access, but did not add any lifetime protection
> around the new waitq pointer.
>
> Fix this by holding userq_mutex across the xa_load() and the
> subsequent fence_drv_xa operations, matching the locking used by
> the destroy path.
>
> Fixes: 4b27406380b0 ("drm/amdgpu: Add queue id support to the user queue wait IOCTL")
> Cc: [email protected]
> Signed-off-by: Chenyuan Mi <[email protected]>
Well, this trivially causes a deadlock: the wait path now blocks in dma_fence_wait() with userq_mutex held, the very lock the destroy path needs to make progress.
The correct fix has already been published by Sunil quite a while ago.
Regards,
Christian.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> index 8013260e29dc..1785ea7c18fe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> @@ -912,8 +912,10 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
> */
> num_fences = dma_fence_dedup_array(fences, num_fences);
>
> + mutex_lock(&userq_mgr->userq_mutex);
> waitq = xa_load(&userq_mgr->userq_xa, wait_info->waitq_id);
> if (!waitq) {
> + mutex_unlock(&userq_mgr->userq_mutex);
> r = -EINVAL;
> goto free_fences;
> }
> @@ -932,6 +934,7 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
> r = dma_fence_wait(fences[i], true);
> if (r) {
> dma_fence_put(fences[i]);
> + mutex_unlock(&userq_mgr->userq_mutex);
> goto free_fences;
> }
>
> @@ -948,8 +951,10 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
> */
> r = xa_alloc(&waitq->fence_drv_xa, &index, fence_drv,
> xa_limit_32b, GFP_KERNEL);
> - if (r)
> + if (r) {
> + mutex_unlock(&userq_mgr->userq_mutex);
> goto free_fences;
> + }
>
> amdgpu_userq_fence_driver_get(fence_drv);
>
> @@ -961,6 +966,7 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
> /* Increment the actual userq fence count */
> cnt++;
> }
> + mutex_unlock(&userq_mgr->userq_mutex);
>
> wait_info->num_fences = cnt;
> /* Copy userq fence info to user space */
> --
> 2.53.0
>