On Tue, Sep 16, 2025 at 4:08 AM Jesse.Zhang <[email protected]> wrote:
>
> From: Lijo Lazar <[email protected]>
>
> Add a fallback mechanism to attempt pipe reset when KCQ reset
> fails to recover the ring. After performing the KCQ reset and
> queue remapping, test the ring functionality. If the ring test
> fails, initiate a pipe reset as an additional recovery step.
>
> v2: fix the typo (Lijo)
> v3: try pipeline reset when kiq mapping fails (Lijo)
>
> Signed-off-by: Lijo Lazar <[email protected]>
> Signed-off-by: Jesse Zhang <[email protected]>

Reviewed-by: Alex Deucher <[email protected]>

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> index 8ba66d4dfe86..77f9d5b9a556 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
> @@ -3560,6 +3560,7 @@ static int gfx_v9_4_3_reset_kcq(struct amdgpu_ring *ring,
>         struct amdgpu_device *adev = ring->adev;
>         struct amdgpu_kiq *kiq = &adev->gfx.kiq[ring->xcc_id];
>         struct amdgpu_ring *kiq_ring = &kiq->ring;
> +       int reset_mode = AMDGPU_RESET_TYPE_PER_QUEUE;
>         unsigned long flags;
>         int r;
>
> @@ -3597,6 +3598,7 @@ static int gfx_v9_4_3_reset_kcq(struct amdgpu_ring *ring,
>                 if (!(adev->gfx.compute_supported_reset & AMDGPU_RESET_TYPE_PER_PIPE))
>                         return -EOPNOTSUPP;
>                 r = gfx_v9_4_3_reset_hw_pipe(ring);
> +               reset_mode = AMDGPU_RESET_TYPE_PER_PIPE;
>                 dev_info(adev->dev, "ring: %s pipe reset :%s\n", ring->name,
>                                 r ? "failed" : "successfully");
>                 if (r)
> @@ -3619,10 +3621,20 @@ static int gfx_v9_4_3_reset_kcq(struct amdgpu_ring *ring,
>         r = amdgpu_ring_test_ring(kiq_ring);
>         spin_unlock_irqrestore(&kiq->ring_lock, flags);
>         if (r) {
> +               if (reset_mode == AMDGPU_RESET_TYPE_PER_QUEUE)
> +                       goto pipe_reset;
> +
>                 dev_err(adev->dev, "fail to remap queue\n");
>                 return r;
>         }
>
> +       if (reset_mode == AMDGPU_RESET_TYPE_PER_QUEUE) {
> +               r = amdgpu_ring_test_ring(ring);
> +               if (r)
> +                       goto pipe_reset;
> +       }
> +
> +
>         return amdgpu_ring_reset_helper_end(ring, timedout_fence);
>  }
>
> --
> 2.49.0
>
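
For readers following the thread, the escalation described in the commit
message (try the per-queue reset, test the ring, and fall back to a pipe
reset only if that test fails) can be sketched as a small standalone C
program. This is an illustration under stated assumptions, not the driver
code: struct fake_ring, the fake_* helpers, reset_with_fallback() and the
ERR_* values are all hypothetical stand-ins, and the KIQ remap step is
omitted. Tracking which reset level was used is what keeps the fallback
from being attempted more than once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes for this sketch (the driver uses errno values). */
#define ERR_FAILED      (-1)
#define ERR_UNSUPPORTED (-2)

/* Hypothetical reset levels, loosely mirroring the PER_QUEUE / PER_PIPE types. */
enum reset_mode { RESET_PER_QUEUE, RESET_PER_PIPE };

/* Hypothetical stand-in for a compute ring; the flags simulate hardware. */
struct fake_ring {
	bool queue_reset_recovers;  /* a per-queue reset brings the ring back */
	bool pipe_reset_recovers;   /* a pipe reset brings the ring back */
	bool pipe_reset_supported;  /* pipe reset is available at all */
	bool healthy;               /* current simulated ring state */
	int pipe_resets;            /* number of pipe resets attempted */
};

/* Simulated per-queue reset: the command itself always "completes". */
static int fake_queue_reset(struct fake_ring *r)
{
	r->healthy = r->queue_reset_recovers;
	return 0;
}

/* Simulated pipe reset: counts attempts and reports failure directly. */
static int fake_pipe_reset(struct fake_ring *r)
{
	r->pipe_resets++;
	r->healthy = r->pipe_reset_recovers;
	return r->healthy ? 0 : ERR_FAILED;
}

/* Simulated ring test: succeeds only if the ring is healthy. */
static int fake_ring_test(const struct fake_ring *r)
{
	return r->healthy ? 0 : ERR_FAILED;
}

/*
 * Try the cheap per-queue reset first and verify it with a ring test.
 * Escalate to a pipe reset only when that verification fails, and do so
 * at most once. *used reports the last reset level that was attempted.
 */
int reset_with_fallback(struct fake_ring *r, enum reset_mode *used)
{
	assert(r != NULL && used != NULL);

	*used = RESET_PER_QUEUE;
	if (fake_queue_reset(r) == 0 && fake_ring_test(r) == 0)
		return 0;

	if (!r->pipe_reset_supported)
		return ERR_UNSUPPORTED;

	*used = RESET_PER_PIPE;
	if (fake_pipe_reset(r))
		return ERR_FAILED;

	return fake_ring_test(r);
}
```

Calling reset_with_fallback() on a ring whose queue reset does not recover
it, but whose pipe reset does, returns 0 with *used set to RESET_PER_PIPE
and exactly one pipe reset recorded.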
