AMD General

Regards,
      Prike

> -----Original Message-----
> From: Koenig, Christian <[email protected]>
> Sent: Tuesday, May 26, 2026 6:48 PM
> To: Liang, Prike <[email protected]>; [email protected]
> Cc: Deucher, Alexander <[email protected]>
> Subject: Re: [PATCH 1/3] drm/amdgpu: avoid extracting fence_drv_array for empty wait fences
>
>
>
> On 5/26/26 11:32, Prike Liang wrote:
> > Avoid xarray extraction and temporary array allocation in
> > amdgpu_userq_fence_alloc() when there are no pending wait-side fence
> > driver references. This keeps the common fence emit path cheaper and
> > efficient.
>
> That's an absolute corner case we clearly don't need to optimize for.
>
> In almost all cases we should have at least one remote fence driver here.

When only the desktop compositor is running, many fences with no wait-side
dependencies are generated while emitting userq fences. Each attempt to
extract the wait fence array still takes more than 10µs (around 30µs in the
worst case). Additionally, zero-initializing the userq fence allocation can
help reduce overhead in the userq fence put routine.

With this patch, the userq fence driver is still returned when fence_drv_xa
is empty, and skipping the extraction reduces the latency of the userq fence
driver extraction and free operations when there is no pending wait-side fence.
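
To illustrate the idea outside the driver, here is a rough userspace sketch
of the same pattern. It is not the amdgpu code: struct pending_set,
struct fence, fence_alloc() and fence_free() are hypothetical stand-ins for
fence_drv_xa, the userq fence and its alloc/put paths, and calloc() plays the
role of kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the queue's pending wait-side references. */
struct pending_set {
	void **items;
	size_t count;
};

/* Hypothetical stand-in for the fence that takes over those references. */
struct fence {
	void **drv_array;
	size_t drv_count;
};

/* Move all pending references into a new fence; NULL on allocation failure. */
static struct fence *fence_alloc(struct pending_set *set)
{
	struct fence *f;

	assert(set != NULL);

	/* Zeroed allocation: drv_array == NULL and drv_count == 0 already
	 * describe "no references", so the empty case needs no more setup. */
	f = calloc(1, sizeof(*f));
	if (!f)
		return NULL;

	/* Early exit: nothing pending, skip the temporary array entirely. */
	if (set->count == 0)
		return f;

	f->drv_array = malloc(set->count * sizeof(*f->drv_array));
	if (!f->drv_array) {
		free(f);
		return NULL;
	}

	for (size_t i = 0; i < set->count; i++)
		f->drv_array[i] = set->items[i];
	f->drv_count = set->count;
	set->count = 0;	/* ownership moved to the fence */

	return f;
}

/* Release path: for a zeroed fence there is nothing to walk and
 * free(NULL) is a no-op. */
static void fence_free(struct fence *f)
{
	if (!f)
		return;
	/* A real implementation would drop each reference here. */
	free(f->drv_array);
	free(f);
}
```

In the empty case, fence_alloc() does a single allocation and fence_free()
has no array to walk, which is the saving described above. Whether this is
worth it in the driver depends on how often the set is really empty.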

> Regards,
> Christian.
>
> >
> > Signed-off-by: Prike Liang <[email protected]>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > index 008330a0d852..2a2bf13a513d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> > @@ -226,7 +226,7 @@ static int amdgpu_userq_fence_alloc(struct amdgpu_usermode_queue *userq,
> >     struct amdgpu_userq_fence *userq_fence;
> >     void *entry;
> >
> > -   userq_fence = kmalloc(sizeof(*userq_fence), GFP_KERNEL);
> > +   userq_fence = kzalloc(sizeof(*userq_fence), GFP_KERNEL);
> >     if (!userq_fence)
> >             return -ENOMEM;
> >
> > @@ -235,6 +235,8 @@ static int amdgpu_userq_fence_alloc(struct amdgpu_usermode_queue *userq,
> >      * used as size to allocate the array.
> >      */
> >     mutex_lock(&userq->fence_drv_lock);
> > +   if (xa_empty(&userq->fence_drv_xa))
> > +           goto unlock;
> >     XA_STATE(xas, &userq->fence_drv_xa, 0);
> >
> >     rcu_read_lock();
> > @@ -256,7 +258,7 @@ static int amdgpu_userq_fence_alloc(struct amdgpu_usermode_queue *userq,
> >     xa_extract(&userq->fence_drv_xa, (void **)userq_fence->fence_drv_array,
> >                0, ULONG_MAX, xas.xa_index, XA_PRESENT);
> >     xa_destroy(&userq->fence_drv_xa);
> > -
> > +unlock:
> >     mutex_unlock(&userq->fence_drv_lock);
> >
> >     amdgpu_userq_fence_driver_get(fence_drv);
