display: fix dmub access race condition

Klara Modin Mon, 18 Aug 2025 06:03:26 -0700

Hi,

On 2025-08-05 13:51:41 +0200, Klara Modin wrote:
> Hi,
> 
> On 2025-07-30 14:58:54 -0400, roman...@amd.com wrote:
> > From: Aurabindo Pillai <aurabindo.pil...@amd.com>
> > 
> > Accessing DC from amdgpu_dm is usually preceded by acquisition of
> > dc_lock mutex. Most of the DC API that DM calls are under a DC lock.
> > However, there are a few that are not. Some DC API called from interrupt
> > context end up sending DMUB commands via a DC API, while other threads were
> > using DMUB. This was apparent from a race between calls for setting idle
> > optimization enable/disable and the DC API to set vmin/vmax.
> > 
> > Offload the call to dc_stream_adjust_vmin_vmax() to a thread instead
> > of directly calling them from the interrupt handler such that it waits
> > for dc_lock.
> > 
> > Reviewed-by: Nicholas Kazlauskas <nicholas.kazlaus...@amd.com>
> > Signed-off-by: Aurabindo Pillai <aurabindo.pil...@amd.com>
> > Signed-off-by: Roman Li <roman...@amd.com>
> 
> With this patch I get sleeping function called from invalid context
> roughly every second. This occurs on at least PREEMPT_LAZY and
> PREEMPT_VOLUNTARY.
> 
...
> ...
> 
> > +static void schedule_dc_vmin_vmax(struct amdgpu_device *adev,
> > +   struct dc_stream_state *stream,
> > +   struct dc_crtc_timing_adjust *adjust)
> > +{
> > +   struct vupdate_offload_work *offload_work = 
> > kzalloc(sizeof(*offload_work), GFP_KERNEL);
> > +   if (!offload_work) {
> > +           drm_dbg_driver(adev_to_drm(adev), "Failed to allocate 
> > vupdate_offload_work\n");
> > +           return;
> > +   }
> > +
> > +   struct dc_crtc_timing_adjust *adjust_copy = 
> > kzalloc(sizeof(*adjust_copy), GFP_KERNEL);
> > +   if (!adjust_copy) {
> > +           drm_dbg_driver(adev_to_drm(adev), "Failed to allocate 
> > adjust_copy\n");
> > +           kfree(offload_work);
> > +           return;
> > +   }
> > +
> > +   dc_stream_retain(stream);
> > +   memcpy(adjust_copy, adjust, sizeof(*adjust_copy));
> > +
> > +   INIT_WORK(&offload_work->work, dm_handle_vmin_vmax_update);
> > +   offload_work->adev = adev;
> > +   offload_work->stream = stream;
> > +   offload_work->adjust = adjust_copy;
> > +
> > +   queue_work(system_wq, &offload_work->work);
> > +}
> > +
> 
> The allocations in this function seems to be the culprit. If I change
> them to GFP_ATOMIC the issue disappears:
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 31ea57edeb45..afe0fea13bb0 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -550,13 +550,13 @@ static void schedule_dc_vmin_vmax(struct amdgpu_device 
> *adev,
>       struct dc_stream_state *stream,
>       struct dc_crtc_timing_adjust *adjust)
>  {
> -     struct vupdate_offload_work *offload_work = 
> kzalloc(sizeof(*offload_work), GFP_KERNEL);
> +     struct vupdate_offload_work *offload_work = 
> kzalloc(sizeof(*offload_work), GFP_ATOMIC);
>       if (!offload_work) {
>               drm_dbg_driver(adev_to_drm(adev), "Failed to allocate 
> vupdate_offload_work\n");
>               return;
>       }
>  
> -     struct dc_crtc_timing_adjust *adjust_copy = 
> kzalloc(sizeof(*adjust_copy), GFP_KERNEL);
> +     struct dc_crtc_timing_adjust *adjust_copy = 
> kzalloc(sizeof(*adjust_copy), GFP_ATOMIC);
>       if (!adjust_copy) {
>               drm_dbg_driver(adev_to_drm(adev), "Failed to allocate 
> adjust_copy\n");
>               kfree(offload_work);
>


Any thoughts? This is still an issue in the current next. I have also
tried using GFP_NOWAIT instead and not seen any obvious issues under
memory pressure.

Regards,
Klara Modin

Re: [PATCH 03/12] drm/amd/display: fix dmub access race condition

Reply via email to