On Wed, Sep 10, 2025 at 8:14 AM Prike Liang <prike.li...@amd.com> wrote:
>
> Keeping waiting the userq fence infinitely untill
> hang detection, and then suspend the hang queue and
> set the fence error.
>
> Signed-off-by: Prike Liang <prike.li...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> index 7b7dae436e5e..2626a41a8418 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> @@ -199,7 +199,7 @@ amdgpu_userq_map_helper(struct amdgpu_userq_mgr *uq_mgr,
>         return r;
>  }
>
> -static void
> +static int
>  amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
>                                   struct amdgpu_usermode_queue *queue)
>  {
> @@ -207,11 +207,16 @@ amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
>         int ret;
>
>         if (f && !dma_fence_is_signaled(f)) {
> -               ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
> -               if (ret <= 0)
> +               ret = dma_fence_wait_timeout(f, true, MAX_SCHEDULE_TIMEOUT);
> +               if (ret <= 0) {
>                         drm_file_err(uq_mgr->file, "Timed out waiting for fence=%llu:%llu\n",
>                                      f->context, f->seqno);
> +                       queue->state = AMDGPU_USERQ_STATE_HUNG;

I don't think we want to set the queue state to hung here.  Just
return an error.  We'll detect the hang when we attempt to preempt the
queues and run detect_and_reset().

Alex

> +                       return -ETIME;
> +               }
>         }
> +
> +       return ret;
>  }
>
>  static void
> --
> 2.34.1
>
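
To illustrate the suggestion above (report the failed wait, leave the queue state alone, and let a later detection step decide whether the queue is hung), here is a minimal userspace sketch. It is not kernel code and does not use the real amdgpu or dma_fence APIs: every mock_* type and function is a hypothetical stand-in, and mock_detect_and_reset() is only an assumed, simplified model of what a detect-and-reset step might do.

```c
/*
 * Userspace mock, NOT kernel code. Every mock_* name is hypothetical and
 * only stands in for the amdgpu/dma_fence objects discussed in this thread.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum mock_queue_state {
	MOCK_USERQ_STATE_MAPPED,
	MOCK_USERQ_STATE_HUNG,
};

struct mock_fence {
	bool signaled;
	long wait_result;	/* canned wait result: > 0 ok, 0 timeout, < 0 error */
};

struct mock_queue {
	enum mock_queue_state state;
	struct mock_fence *last_fence;
};

/* Stand-in for a blocking fence wait; just returns the canned result. */
static long mock_fence_wait(const struct mock_fence *f)
{
	return f->wait_result;
}

/*
 * Wait for the queue's last fence. On failure only an error is returned;
 * queue->state is deliberately left untouched.
 */
static int mock_wait_for_last_fence(struct mock_queue *queue)
{
	struct mock_fence *f;

	assert(queue != NULL);
	f = queue->last_fence;

	if (f && !f->signaled) {
		if (mock_fence_wait(f) <= 0)
			return -ETIME;
	}

	return 0;	/* nothing to wait for, or the wait succeeded */
}

/*
 * Assumed model of a later detection step: this is the only place in the
 * sketch that marks a queue as hung.
 */
static bool mock_detect_and_reset(struct mock_queue *queue)
{
	struct mock_fence *f = queue->last_fence;

	if (f && !f->signaled) {
		queue->state = MOCK_USERQ_STATE_HUNG;
		return true;
	}

	return false;
}
```

One choice in the sketch is mine rather than the patch's: it returns 0 on every non-error path, so the case where there is no fence to wait on has a defined return value. As far as the quoted hunk shows, ret is not assigned on that path before "return ret".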