Hi, On 2025-08-05 13:51:41 +0200, Klara Modin wrote: > Hi, > > On 2025-07-30 14:58:54 -0400, roman...@amd.com wrote: > > From: Aurabindo Pillai <aurabindo.pil...@amd.com> > > > > Accessing DC from amdgpu_dm is usually preceded by acquisition of > > dc_lock mutex. Most of the DC API that DM calls are under a DC lock. > > However, there are a few that are not. Some DC API called from interrupt > > context end up sending DMUB commands via a DC API, while other threads were > > using DMUB. This was apparent from a race between calls for setting idle > > optimization enable/disable and the DC API to set vmin/vmax. > > > > Offload the call to dc_stream_adjust_vmin_vmax() to a thread instead > > of directly calling them from the interrupt handler such that it waits > > for dc_lock. > > > > Reviewed-by: Nicholas Kazlauskas <nicholas.kazlaus...@amd.com> > > Signed-off-by: Aurabindo Pillai <aurabindo.pil...@amd.com> > > Signed-off-by: Roman Li <roman...@amd.com> > > With this patch I get sleeping function called from invalid context > roughly every second. This occurs on at least PREEMPT_LAZY and > PREEMPT_VOLUNTARY. > ... > ... > > > +static void schedule_dc_vmin_vmax(struct amdgpu_device *adev, > > + struct dc_stream_state *stream, > > + struct dc_crtc_timing_adjust *adjust) > > +{ > > + struct vupdate_offload_work *offload_work = > > kzalloc(sizeof(*offload_work), GFP_KERNEL); > > + if (!offload_work) { > > + drm_dbg_driver(adev_to_drm(adev), "Failed to allocate > > vupdate_offload_work\n"); > > + return; > > + } > > + > > + struct dc_crtc_timing_adjust *adjust_copy = > > kzalloc(sizeof(*adjust_copy), GFP_KERNEL); > > + if (!adjust_copy) { > > + drm_dbg_driver(adev_to_drm(adev), "Failed to allocate > > adjust_copy\n"); > > + kfree(offload_work); > > + return; > > + } > > + > > + dc_stream_retain(stream); > > + memcpy(adjust_copy, adjust, sizeof(*adjust_copy)); > > + > > + INIT_WORK(&offload_work->work, dm_handle_vmin_vmax_update); > > + offload_work->adev = adev; > > + offload_work->stream = stream; > > + offload_work->adjust = adjust_copy; > > + > > + queue_work(system_wq, &offload_work->work); > > +} > > + > > The allocations in this function seems to be the culprit. If I change > them to GFP_ATOMIC the issue disappears: > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 31ea57edeb45..afe0fea13bb0 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -550,13 +550,13 @@ static void schedule_dc_vmin_vmax(struct amdgpu_device > *adev, > struct dc_stream_state *stream, > struct dc_crtc_timing_adjust *adjust) > { > - struct vupdate_offload_work *offload_work = > kzalloc(sizeof(*offload_work), GFP_KERNEL); > + struct vupdate_offload_work *offload_work = > kzalloc(sizeof(*offload_work), GFP_ATOMIC); > if (!offload_work) { > drm_dbg_driver(adev_to_drm(adev), "Failed to allocate > vupdate_offload_work\n"); > return; > } > > - struct dc_crtc_timing_adjust *adjust_copy = > kzalloc(sizeof(*adjust_copy), GFP_KERNEL); > + struct dc_crtc_timing_adjust *adjust_copy = > kzalloc(sizeof(*adjust_copy), GFP_ATOMIC); > if (!adjust_copy) { > drm_dbg_driver(adev_to_drm(adev), "Failed to allocate > adjust_copy\n"); > kfree(offload_work); >
Any thoughts? This is still an issue in the current next. I have also tried using GFP_NOWAIT instead and not seen any obvious issues under memory pressure. Regards, Klara Modin