On 12/5/2023 00:52, Nirmoy Das wrote:
gen8_engine_reset_prepare() can fail when HW fails to set the
RESET_CTL_READY_TO_RESET bit. In some cases this is not a fatal
error, as the driver will retry.

Convert the log to a trace log for debugging without triggering
unnecessary concerns in CI or for end-users during non-fatal scenarios.
I strongly disagree with this change. The hardware spec for the RESET_CTL and GDRST registers is that they will self-clear within a matter of microseconds. If something is so badly wrong with the hardware that it can't even manage to reset, then that very much warrants more than a completely silent trace event. It most certainly should be flagged as a failure in CI.

Just because the driver will retry does not mean that this is not a serious error. And if the first attempt failed, why would a subsequent attempt succeed? Escalating to FLR may have more success, but that is not something that i915 currently does.
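
To illustrate the kind of check being discussed, here is a minimal userspace sketch of a bounded poll on a "ready" bit. It is not the i915 code: `fake_reg`, `fake_read()`, `wait_for_ready()` and the `READY_TO_RESET` bit position are hypothetical names invented for this example, and the timeout is counted in polls rather than microseconds.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define READY_TO_RESET (1u << 1)

/* Simulated register: the ready bit appears after a number of reads. */
struct fake_reg {
	uint32_t value;
	int polls_until_ready;	/* negative: hardware never responds */
};

/* Each read advances the simulated hardware by one step. */
static uint32_t fake_read(struct fake_reg *r)
{
	if (r->polls_until_ready == 0)
		r->value |= READY_TO_RESET;
	else if (r->polls_until_ready > 0)
		r->polls_until_ready--;
	return r->value;
}

/* Poll up to max_polls times; 0 on success, -ETIMEDOUT otherwise. */
static int wait_for_ready(struct fake_reg *r, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if ((fake_read(r) & READY_TO_RESET) == READY_TO_RESET)
			return 0;
	}
	return -ETIMEDOUT;
}
```

In this model, a register that responds within the poll budget returns 0, while one that never sets the bit returns -ETIMEDOUT on every attempt. That is the point above: if the hardware is stuck, retrying the same wait does not change the outcome.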

John.



v2: Improve commit message (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
Cc: John Harrison <john.c.harri...@intel.com>
Cc: Andi Shyti <andi.sh...@linux.intel.com>
Cc: Andrzej Hajda <andrzej.ha...@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
Signed-off-by: Nirmoy Das <nirmoy....@intel.com>
Reviewed-by: Andi Shyti <andi.sh...@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.ha...@intel.com>
---
  drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index d5ed904f355d..e6fbc6202c80 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
        ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
                                           700, 0, NULL);
        if (ret)
-               gt_err(engine->gt,
-                      "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
-                      engine->name, request,
-                      intel_uncore_read_fw(uncore, reg));
+               GT_TRACE(engine->gt,
+                        "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
+                        engine->name, request,
+                        intel_uncore_read_fw(uncore, reg));
        return ret;
  }
