Reviewed-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>

Andrey

On 1/19/21 7:22 AM, Horace Chen wrote:
If 2 jobs on 2 different rings time out within a very short
period, the reset for the second job will be skipped because
a reset is already in progress.

But that doesn't mean the second job is not guilty, since it
also timed out and may be a bad job. So before bailing out
of the reset, we need to increase the karma for this job too.

Signed-off-by: Horace Chen <horace.c...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9574da3abc32..1d6ff9fe37de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4574,6 +4574,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
                        DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
                                job ? job->base.id : -1, hive->hive_id);
                        amdgpu_put_xgmi_hive(hive);
+                       if (job)
+                               drm_sched_increase_karma(&job->base);
                        return 0;
                }
                mutex_lock(&hive->hive_lock);
@@ -4617,6 +4619,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
                                        job ? job->base.id : -1);
                r = 0;
                /* even we skipped this reset, still need to set the job to guilty */
+               if (job)
+                       drm_sched_increase_karma(&job->base);
                goto skip_recovery;
        }
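
For readers following the thread, the intent of the two hunks above can be
sketched as a small userspace model. This is only an illustration, not
kernel code: struct toy_job, toy_increase_karma(), toy_recover_begin() and
toy_recover_end() are hypothetical stand-ins, and the assumption that the
caller that wins the reset also marks its own job guilty belongs to this
toy model, not to the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler job (not struct drm_sched_job). */
struct toy_job {
	int id;
	int karma;	/* number of times the job has been blamed for a hang */
};

/* Set while one caller is performing the (simulated) reset. */
static atomic_flag reset_in_progress = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for drm_sched_increase_karma(). */
void toy_increase_karma(struct toy_job *job)
{
	job->karma++;
}

/*
 * Returns true if the caller now owns the reset, false if it bailed out
 * because another reset is already running. In both cases a non-NULL job
 * is marked guilty, which is the behavior the patch adds to the bail-out
 * paths. job may be NULL, mirroring the "if (job)" checks in the diff.
 */
bool toy_recover_begin(struct toy_job *job)
{
	if (atomic_flag_test_and_set(&reset_in_progress)) {
		/* Reset skipped, but the timed-out job may still be bad. */
		if (job)
			toy_increase_karma(job);
		return false;
	}

	/* Assumption of this model: the reset owner blames its job too. */
	if (job)
		toy_increase_karma(job);
	return true;
}

/* Called by the reset owner once the (simulated) reset has finished. */
void toy_recover_end(void)
{
	atomic_flag_clear(&reset_in_progress);
}
```

If two jobs time out back to back, the first toy_recover_begin() call
returns true and the second returns false, yet both jobs end up with their
karma raised. Without the karma increase in the bail-out branch, the second
job would leave the skipped reset unmarked.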