[AMD Public Use]

> -----Original Message-----
> From: Tuikov, Luben <luben.tui...@amd.com>
> Sent: Wednesday, May 12, 2021 1:03 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Tuikov, Luben <luben.tui...@amd.com>; Deucher, Alexander
> <alexander.deuc...@amd.com>; sta...@vger.kernel.org
> Subject: [PATCH 1/2] drm/amdgpu: Don't query CE and UE errors
> 
> On the QUERY2 IOCTL, don't query the counts of correctable and uncorrectable
> errors: when RAS is enabled and supported on Vega20 server boards, the query
> takes an insurmountably long time, in O(n^3), which slows the system down to
> the point of being unusable when a GUI is up.
> 
> Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
> Cc: Alexander Deucher <alexander.deuc...@amd.com>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Luben Tuikov <luben.tui...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 26 ++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index 01fe60fedcbe..d481a33f4eaf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -363,19 +363,19 @@ static int amdgpu_ctx_query2(struct amdgpu_device *adev,
>               out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY;
> 
>       /*query ue count*/
> -     ras_counter = amdgpu_ras_query_error_count(adev, false);
> -     /*ras counter is monotonic increasing*/
> -     if (ras_counter != ctx->ras_counter_ue) {
> -             out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE;
> -             ctx->ras_counter_ue = ras_counter;
> -     }
> -
> -     /*query ce count*/
> -     ras_counter = amdgpu_ras_query_error_count(adev, true);
> -     if (ras_counter != ctx->ras_counter_ce) {
> -             out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE;
> -             ctx->ras_counter_ce = ras_counter;
> -     }
> +     /* ras_counter = amdgpu_ras_query_error_count(adev, false); */
> +     /* /\*ras counter is monotonic increasing*\/ */
> +     /* if (ras_counter != ctx->ras_counter_ue) { */
> +     /*      out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE; */
> +     /*      ctx->ras_counter_ue = ras_counter; */
> +     /* } */
> +
> +     /* /\*query ce count*\/ */
> +     /* ras_counter = amdgpu_ras_query_error_count(adev, true); */
> +     /* if (ras_counter != ctx->ras_counter_ce) { */
> +     /*      out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE; */
> +     /*      ctx->ras_counter_ce = ras_counter; */
> +     /* } */
> 

Rather than commenting this out, just drop it in patch 1, and then re-add this 
in patch 2.

Alex

>       mutex_unlock(&mgr->lock);
>       return 0;
> --
> 2.31.1.527.g2d677e5b15
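
To illustrate the review comment above (drop the queries in patch 1 rather than commenting them out), here is a minimal, self-contained sketch. It is not the driver code: every `demo_` name, the flag values, and the stubbed error-count query are hypothetical stand-ins, and the stub only counts calls instead of doing the expensive work described in the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu types or flag values. */
#define DEMO_FLAG_GUILTY (1u << 0)
#define DEMO_FLAG_RAS_UE (1u << 1)
#define DEMO_FLAG_RAS_CE (1u << 2)

struct demo_ctx {
	int guilty;
	unsigned long ras_counter_ue;
	unsigned long ras_counter_ce;
};

/* Counts how often the (assumed expensive) error-count query runs. */
int demo_query_calls;

/* Stub for the expensive query; always reports zero errors. */
unsigned long demo_query_error_count(int is_ce)
{
	(void)is_ce;
	demo_query_calls++;
	return 0;
}

/* Shape of the code before the patch: two queries on every call. */
uint32_t demo_query2_before(struct demo_ctx *ctx)
{
	uint32_t flags = ctx->guilty ? DEMO_FLAG_GUILTY : 0;
	unsigned long ras_counter;

	ras_counter = demo_query_error_count(0);
	if (ras_counter != ctx->ras_counter_ue) {
		flags |= DEMO_FLAG_RAS_UE;
		ctx->ras_counter_ue = ras_counter;
	}

	ras_counter = demo_query_error_count(1);
	if (ras_counter != ctx->ras_counter_ce) {
		flags |= DEMO_FLAG_RAS_CE;
		ctx->ras_counter_ce = ras_counter;
	}
	return flags;
}

/* Shape suggested in review for patch 1: the queries are dropped outright. */
uint32_t demo_query2_patch1(struct demo_ctx *ctx)
{
	return ctx->guilty ? DEMO_FLAG_GUILTY : 0;
}
```

With the queries removed instead of commented out, the function carries no dead code, and a follow-up patch can re-add the logic as ordinary `+` lines.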
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx