On Mon, Sep 15, 2025 at 9:20 PM Mario Limonciello
<mario.limoncie...@amd.com> wrote:
>
> KFD suspend and resume routines have been disabled since commit
> 5d3a2d95224da ("drm/amdgpu: skip kfd suspend/resume for S0ix") which
> made sense at that time.  However there is a problem that if there is
> any compute work running there may still be active fences.  Running
> suspend without draining them can cause the system to hang.
>
> So run KFD suspend/resume routines even in s0ix.
>
> Signed-off-by: Mario Limonciello <mario.limoncie...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 0fdfde3dcb9f..59688f8ae919 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5220,10 +5220,9 @@ int amdgpu_device_suspend(struct drm_device *dev, bool notify_clients)
>
>         amdgpu_device_ip_suspend_phase1(adev);
>
> -       if (!adev->in_s0ix) {
> -               amdgpu_amdkfd_suspend(adev, !amdgpu_sriov_vf(adev) && !adev->in_runpm);
> +       amdgpu_amdkfd_suspend(adev, !amdgpu_sriov_vf(adev) && !adev->in_runpm);
> +       if (!adev->in_s0ix)
>                 amdgpu_userq_suspend(adev);

KGD user queues have the same requirements as KFD user queues, so
amdgpu_userq_suspend() should be called in S0ix as well.

> -       }
>
>         r = amdgpu_device_evict_resources(adev);
>         if (r)
> @@ -5318,11 +5317,11 @@ int amdgpu_device_resume(struct drm_device *dev, bool notify_clients)
>                 goto exit;
>         }
>
> -       if (!adev->in_s0ix) {
> -               r = amdgpu_amdkfd_resume(adev, !amdgpu_sriov_vf(adev) && !adev->in_runpm);
> -               if (r)
> -                       goto exit;
> +       r = amdgpu_amdkfd_resume(adev, !amdgpu_sriov_vf(adev) && !adev->in_runpm);
> +       if (r)
> +               goto exit;
>
> +       if (!adev->in_s0ix) {
>                 r = amdgpu_userq_resume(adev);

Same here: amdgpu_userq_resume() should be called in S0ix as well.
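
For illustration only, here is a minimal standalone sketch of the call ordering these two comments ask for: both the KFD and the user queue suspend/resume paths run regardless of in_s0ix. Everything prefixed mock_ is a hypothetical stand-in invented for this sketch (the stubs only count calls), not the real amdgpu code, and the actual patch would of course be made in amdgpu_device.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few amdgpu_device fields used below. */
struct mock_adev {
	bool in_s0ix;
	bool in_runpm;
	bool is_sriov_vf;
	/* Call counters so the ordering sketch can be checked. */
	int kfd_suspend_calls;
	int userq_suspend_calls;
	int kfd_resume_calls;
	int userq_resume_calls;
};

/* Stubs standing in for amdgpu_amdkfd_suspend()/amdgpu_userq_suspend(). */
static void mock_kfd_suspend(struct mock_adev *adev, bool suspend_proc)
{
	(void)suspend_proc;
	adev->kfd_suspend_calls++;
}

static void mock_userq_suspend(struct mock_adev *adev)
{
	adev->userq_suspend_calls++;
}

/* Stubs standing in for amdgpu_amdkfd_resume()/amdgpu_userq_resume(). */
static int mock_kfd_resume(struct mock_adev *adev, bool resume_proc)
{
	(void)resume_proc;
	adev->kfd_resume_calls++;
	return 0;
}

static int mock_userq_resume(struct mock_adev *adev)
{
	adev->userq_resume_calls++;
	return 0;
}

/* Suspend path: neither call is gated on in_s0ix any more. */
static int mock_device_suspend(struct mock_adev *adev)
{
	mock_kfd_suspend(adev, !adev->is_sriov_vf && !adev->in_runpm);
	mock_userq_suspend(adev);
	return 0;
}

/* Resume path: same idea, with the error handling kept. */
static int mock_device_resume(struct mock_adev *adev)
{
	int r;

	r = mock_kfd_resume(adev, !adev->is_sriov_vf && !adev->in_runpm);
	if (r)
		return r;

	return mock_userq_resume(adev);
}
```

With in_s0ix set, one suspend/resume cycle through this sketch calls each of the four stubs exactly once, which is the behavior being suggested for the real driver.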

Alex

>                 if (r)
>                         goto exit;
> --
> 2.50.1
>
