On Wed, Sep 10, 2025 at 8:14 AM Prike Liang <prike.li...@amd.com> wrote:
>
> Keep waiting on the userq fence indefinitely until a
> hang is detected, then suspend the hung queue and
> set the fence error.
>
> Signed-off-by: Prike Liang <prike.li...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> index 7b7dae436e5e..2626a41a8418 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> @@ -199,7 +199,7 @@ amdgpu_userq_map_helper(struct amdgpu_userq_mgr *uq_mgr,
>         return r;
>  }
>
> -static void
> +static int
>  amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
>                                  struct amdgpu_usermode_queue *queue)
>  {
> @@ -207,11 +207,16 @@ amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
>         int ret;
>
>         if (f && !dma_fence_is_signaled(f)) {
> -               ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
> -               if (ret <= 0)
> +               ret = dma_fence_wait_timeout(f, true, MAX_SCHEDULE_TIMEOUT);
> +               if (ret <= 0) {
>                         drm_file_err(uq_mgr->file, "Timed out waiting for fence=%llu:%llu\n",
>                                      f->context, f->seqno);
> +                       queue->state = AMDGPU_USERQ_STATE_HUNG;

I don't think we want to set the queue state to hung here.  Just
return an error.  We'll detect the hang when we attempt to preempt the
queues and run detect_and_reset().
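
Roughly, and untested, the wait helper would then look like the sketch
below. It is a userspace mock so it can be compiled standalone: the
struct layouts, the state enum values, the last_fence field, the
wait_result field and the stubbed dma_fence_* helpers are stand-ins
assumed for illustration, not the real kernel definitions, and fprintf()
replaces drm_file_err(). Returning 0 on the success path instead of the
patch's "return ret" is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in types, assumed for illustration only. */
enum amdgpu_userq_state {
	AMDGPU_USERQ_STATE_UNMAPPED,
	AMDGPU_USERQ_STATE_MAPPED,
	AMDGPU_USERQ_STATE_HUNG,
};

struct dma_fence {
	unsigned long long context, seqno;
	bool signaled;
	long wait_result;	/* mock only: canned result of the wait */
};

struct amdgpu_userq_mgr { int unused; };

struct amdgpu_usermode_queue {
	struct dma_fence *last_fence;
	enum amdgpu_userq_state state;
};

#define MAX_SCHEDULE_TIMEOUT LONG_MAX

/* Stubs standing in for the kernel helpers of the same name. */
static bool dma_fence_is_signaled(struct dma_fence *f)
{
	return f->signaled;
}

/*
 * The kernel call returns a long: remaining jiffies on success, 0 on
 * timeout, negative errno on error. The stub returns the canned value.
 */
static long dma_fence_wait_timeout(struct dma_fence *f, bool intr, long timeout)
{
	(void)intr;
	(void)timeout;
	return f->wait_result;
}

static int
amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
				 struct amdgpu_usermode_queue *queue)
{
	struct dma_fence *f;
	long ret;

	(void)uq_mgr;	/* only needed for drm_file_err() in the driver */
	assert(queue != NULL);
	f = queue->last_fence;

	if (f && !dma_fence_is_signaled(f)) {
		ret = dma_fence_wait_timeout(f, true, MAX_SCHEDULE_TIMEOUT);
		if (ret <= 0) {
			fprintf(stderr, "Timed out waiting for fence=%llu:%llu\n",
				f->context, f->seqno);
			/* queue->state is left alone; only report the error. */
			return -ETIME;
		}
	}

	return 0;
}
```

The point is only that the helper reports the failure and leaves
queue->state untouched, so the HUNG transition stays with the
preempt/reset path.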

Alex

> +                       return -ETIME;
> +               }
>         }
> +
> +       return ret;
>  }
>
>  static void
> --
> 2.34.1
>
