[Public]

> -----Original Message-----
> From: Limonciello, Mario <mario.limoncie...@amd.com>
> Sent: Tuesday, October 3, 2023 5:17 PM
> To: Deucher, Alexander <alexander.deuc...@amd.com>; amd-
> g...@lists.freedesktop.org
> Cc: Wentland, Harry <harry.wentl...@amd.com>
> Subject: Re: [PATCH v3 1/4] drm/amd: Add support for prepare() and
> complete() callbacks
>
> On 10/3/2023 16:11, Deucher, Alexander wrote:
> > [Public]
> >
> >> -----Original Message-----
> >> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
> >> Mario Limonciello
> >> Sent: Tuesday, October 3, 2023 4:55 PM
> >> To: amd-gfx@lists.freedesktop.org
> >> Cc: Wentland, Harry <harry.wentl...@amd.com>; Limonciello, Mario
> >> <mario.limoncie...@amd.com>
> >> Subject: [PATCH v3 1/4] drm/amd: Add support for prepare() and
> >> complete() callbacks
> >>
> >> Linux PM core has a prepare() callback run before suspend and
> >> complete() callback ran after resume() for devices to use.  Add
> >> plumbing to bring
> >> prepare() to amdgpu.
> >>
> >> The idea with the new vfuncs for amdgpu is that all IP blocks that
> >> memory allocations during suspend should do the allocation from this
> >> call instead of the suspend() callback.
> >>
> >> By moving the allocations to prepare() the system suspend will be
> >> failed before any IP block has done any suspend code.
> >>
> >> If the suspend fails, then do any cleanups in the complete() callback.
> >>
> >> Signed-off-by: Mario Limonciello <mario.limoncie...@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 39
> >> ++++++++++++++++++++--
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 11 +++---
> >>   3 files changed, 46 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> index 73e825d20259..5d651552822c 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> @@ -1415,6 +1415,8 @@ void amdgpu_driver_postclose_kms(struct
> >> drm_device *dev,  void amdgpu_driver_release_kms(struct drm_device
> >> *dev);
> >>
> >>   int amdgpu_device_ip_suspend(struct amdgpu_device *adev);
> >> +int amdgpu_device_prepare(struct drm_device *dev); void
> >> +amdgpu_device_complete(struct drm_device *dev);
> >>   int amdgpu_device_suspend(struct drm_device *dev, bool fbcon);  int
> >> amdgpu_device_resume(struct drm_device *dev, bool fbcon);
> >>   u32 amdgpu_get_vblank_counter_kms(struct drm_crtc *crtc); diff
> >> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> index bad2b5577e96..f53cf675c3ce 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >> @@ -4259,6 +4259,43 @@ static int
> >> amdgpu_device_evict_resources(struct
> >> amdgpu_device *adev)
> >>   /*
> >>    * Suspend & resume.
> >>    */
> >> +/**
> >> + * amdgpu_device_prepare - prepare for device suspend
> >> + *
> >> + * @dev: drm dev pointer
> >> + *
> >> + * Prepare to put the hw in the suspend state (all asics).
> >> + * Returns 0 for success or an error on failure.
> >> + * Called at driver suspend.
> >> + */
> >> +int amdgpu_device_prepare(struct drm_device *dev) {
> >> +     struct amdgpu_device *adev = drm_to_adev(dev);
> >> +     int r;
> >> +
> >> +     if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
> >> +             return 0;
> >> +
> >> +     adev->in_suspend = true;
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +/**
> >> + * amdgpu_device_complete - complete the device after resume
> >> + *
> >> + * @dev: drm dev pointer
> >> + *
> >> + * Clean up any actions that the prepare step did.
> >> + * Called after driver resume.
> >> + */
> >> +void amdgpu_device_complete(struct drm_device *dev) {
> >> +     struct amdgpu_device *adev = drm_to_adev(dev);
> >> +
> >> +     adev->in_suspend = false;
> >> +}
> >> +
> >>   /**
> >>    * amdgpu_device_suspend - initiate device suspend
> >>    *
> >> @@ -4277,8 +4314,6 @@ int amdgpu_device_suspend(struct drm_device
> >> *dev, bool fbcon)
> >>        if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
> >>                return 0;
> >>
> >> -     adev->in_suspend = true;
> >> -
> >
> > We also set this to false in amdgpu_device_resume() so that should be fixed
> up as well.  But, I'm not sure we want to move this out of
> amdgpu_device_suspend().  There are places we use
> amdgpu_device_suspend/resume() outside of pmops that also rely on these
> being set.  Those places may need to be fixed up if we do.  IIRC, the 
> switcheroo
> code uses this.
>
> The big reason that I moved it from suspend() to prepare() was so that
> amdgpu_device_evict_resources() was called with the context of it being set.
>
> My thought process:
> 0) prepare() sets all the time
> 1) If prepare() fails complete() clears it.
> 2) If prepare() succeeds it remains set for suspend()
> 3) If suspend() succeeds it gets cleared at resume()
> 4) If resume() failed for some reason, it's cleared by complete().
>
> Does it actually matter that it's set while evicting resources?

Shouldn't matter for evicting resources.  We even have debugfs nodes you can 
access to forcibly evict resources at runtime for testing memory pressure.

Alex

> If so, maybe I'll just have both prepare() and suspend() set it universally 
> and
> both resume() and complete() clear it universally.
>
>
> >
> > Alex
> >
> >>        /* Evict the majority of BOs before grabbing the full access */
> >>        r = amdgpu_device_evict_resources(adev);
> >>        if (r)
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> index e3471293846f..4c6fb852516a 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> @@ -2425,8 +2425,9 @@ static int amdgpu_pmops_prepare(struct device
> >> *dev)
> >>        /* Return a positive number here so
> >>         * DPM_FLAG_SMART_SUSPEND works properly
> >>         */
> >> -     if (amdgpu_device_supports_boco(drm_dev))
> >> -             return pm_runtime_suspended(dev);
> >> +     if (amdgpu_device_supports_boco(drm_dev) &&
> >> +         pm_runtime_suspended(dev))
> >> +             return 1;
> >>
> >>        /* if we will not support s3 or s2i for the device
> >>         *  then skip suspend
> >> @@ -2435,12 +2436,14 @@ static int amdgpu_pmops_prepare(struct
> device
> >> *dev)
> >>            !amdgpu_acpi_is_s3_active(adev))
> >>                return 1;
> >>
> >> -     return 0;
> >> +     return amdgpu_device_prepare(drm_dev);
> >>   }
> >>
> >>   static void amdgpu_pmops_complete(struct device *dev)  {
> >> -     /* nothing to do */
> >> +     struct drm_device *drm_dev = dev_get_drvdata(dev);
> >> +
> >> +     amdgpu_device_complete(drm_dev);
> >>   }
> >>
> >>   static int amdgpu_pmops_suspend(struct device *dev)
> >> --
> >> 2.34.1
> >

Reply via email to