On Wed, Oct 15, 2025 at 10:31 AM Mario Limonciello <[email protected]> wrote: > > On 10/15/25 9:19 AM, Harry Wentland wrote: > > > > > > On 2025-10-14 17:38, Mario Limonciello wrote: > >> > >> > >> On 10/14/2025 4:27 PM, Alex Deucher wrote: > >>> On Tue, Oct 14, 2025 at 3:46 PM Mario Limonciello > >>> <[email protected]> wrote: > >>>> > >>>> [Why] > >>>> If userspace hasn't frozen user processes (like systemd does with > >>>> user.slice) then an aborted hibernate could give control back to > >>>> userspace before display hardware is resumed. IoW an atomic commit could > >>>> be done while the hardware is in D3, which could hang a system. > >>> > >>> Is there any way to prevent this altogether? > >> > >> The obvious way is that userspace should be freezing user processes. > >> That's what systemd does. > >> > >>> This seems like a recipe > >>> for trouble for any driver. > >> > >> If we want to uplevel this kind of check I think we would need some helper > >> to report hardware status into drm and drm would have to call that. > >> > >> Most distros use systemd, and this only happened in an aborted hibernate. > >> I guess I'd like to see how much this warning actually comes up before > >> deciding if all that plumbing is worth it. > > > > I was briefly thinking about a generic solution as well and don't > > think it's worth it at this point. This is mostly about internal > > driver/HW state. > > > > Harry > > FWIW IGT testing seems to have thrown up some problems with this which > didn't show up in my unit testing. > > [ 471.261018] ? drm_atomic_helper_resume+0x2b/0x150 [drm_kms_helper] > [ 471.261031] ? dm_resume+0x459/0x8f0 [amdgpu] > [ 471.261396] ? amdgpu_ip_block_resume+0x27/0x70 [amdgpu] > [ 471.261635] ? dpm_run_callback+0x9c/0x200 > [ 471.261640] ? device_resume+0x1b4/0x2b0 > [ 471.261645] drm_atomic_check_only+0x5d9/0x9e0 [drm] > [ 471.261672] drm_atomic_commit+0x6d/0xe0 [drm] > [ 471.261697] ? __pfx___drm_printfn_info+0x10/0x10 [drm] > [ 471.261729] drm_atomic_helper_commit_duplicated_state+0xcd/0xe0 > [drm_kms_helper] > [ 471.261739] drm_atomic_helper_resume+0x93/0x150 [drm_kms_helper] > [ 471.261751] dm_resume+0x459/0x8f0 [amdgpu] > [ 471.262119] ? preempt_count_add+0x7b/0xc0 > [ 471.262125] amdgpu_ip_block_resume+0x27/0x70 [amdgpu] > [ 471.262363] amdgpu_device_ip_resume_phase3+0x54/0x80 [amdgpu] > [ 471.262598] amdgpu_device_resume+0x188/0x3b0 [amdgpu] > [ 471.262842] amdgpu_pmops_resume+0x4c/0x90 [amdgpu] > [ 471.263078] pci_pm_resume+0x6b/0x100 > > The issue appears to me to be that ip_block->status.hw isn't set again > until the end of amdgpu_ip_block_resume(). > > I am tempted to move it before the call to > ip_block->version->funcs->resume(). > > Thoughts?
Then it won't actually reflect the hw state. Alex Alex > > > > >>> > >>> Alex > >>> > >>>> > >>>> [How] > >>>> Add a check whether the IP block hardware is ready to the atomic check > >>>> handler and return a failure. Userspace shouldn't do an atomic commit if > >>>> the atomic check fails. > >>>> > >>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4627 > >>>> Signed-off-by: Mario Limonciello <[email protected]> > >>>> --- > >>>> Cc: Harry Wentland <[email protected]> > >>>> v2: > >>>> * Return -EBUSY instead (Harry) > >>>> --- > >>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +++++ > >>>> 1 file changed, 5 insertions(+) > >>>> > >>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>> index 6446ec6c21d4..f5cd9982af99 100644 > >>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > >>>> @@ -12010,6 +12010,11 @@ static int amdgpu_dm_atomic_check(struct > >>>> drm_device *dev, > >>>> > >>>> trace_amdgpu_dm_atomic_check_begin(state); > >>>> > >>>> + if (WARN_ON(unlikely(!amdgpu_device_ip_is_hw(adev, > >>>> AMD_IP_BLOCK_TYPE_DCE)))) { > >>>> + ret = -EBUSY; > >>>> + goto fail; > >>>> + } > >>>> + > >>>> ret = drm_atomic_helper_check_modeset(dev, state); > >>>> if (ret) { > >>>> drm_dbg_atomic(dev, "drm_atomic_helper_check_modeset() > >>>> failed\n"); > >>>> -- > >>>> 2.49.0 > >>>> > >> > > >
