Engine reset sometimes fails because the engine resumes from an
incorrect RING_HEAD: the engine head fails to reset to zero even after
zero is written to it. This is a timing issue. We experimented with
different values and found in testing that a 20ms delay works best.

So, add a delay of up to 20ms to let the engine resume from the correct
RING_HEAD.

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13968
Signed-off-by: Nitin Gote <nitin.r.g...@intel.com>
---
Hi,

Here, wait_for_atomic() is used instead of delay functions such as
udelay/mdelay/fsleep to avoid the error "BUG: scheduling while atomic",
which was observed during testing.

-Nitin

 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 6e9977b2d180..a876a34455f1 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -365,7 +365,13 @@ static void reset_prepare(struct intel_engine_cs *engine)
 		     ENGINE_READ_FW(engine, RING_HEAD),
 		     ENGINE_READ_FW(engine, RING_TAIL),
 		     ENGINE_READ_FW(engine, RING_START));
-	if (!stop_ring(engine)) {
+	/*
+	 * Sometimes the engine head fails to reset to zero even after writing
+	 * to it. Use wait_for_atomic() to retry for up to 20ms so that the
+	 * engine resumes from the correct RING_HEAD. We experimented with
+	 * different values and found that 20ms works best in testing.
+	 */
+	if (wait_for_atomic((!stop_ring(engine) == 0), 20)) {
 		drm_err(&engine->i915->drm,
 			"failed to set %s head to zero "
 			"ctl %08x head %08x tail %08x start %08x\n",
-- 
2.25.1
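
For reviewers unfamiliar with the polling pattern, here is a minimal user-space sketch of the retry-until-timeout behaviour the patch relies on. It is not the i915 implementation: `struct fake_ring`, `fake_stop_ring()`, `poll_stop_ring()` and `now_ns()` are hypothetical names invented for illustration, and the loop is only loosely modeled on how wait_for_atomic() is used above (busy-poll a condition, return 0 on success and a negative value on timeout).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the hardware ring state. */
struct fake_ring {
	uint32_t head;
	/* Failed attempts left before the write sticks; < 0 means never. */
	int attempts_until_sticks;
};

/* Hypothetical stand-in for stop_ring(): true once head reads back as 0. */
static bool fake_stop_ring(struct fake_ring *ring)
{
	if (ring->attempts_until_sticks != 0) {
		if (ring->attempts_until_sticks > 0)
			ring->attempts_until_sticks--;
		return false;	/* the write did not stick this time */
	}
	ring->head = 0;
	return true;
}

/* Monotonic clock in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Busy-poll fake_stop_ring() without sleeping until it succeeds or
 * timeout_ms elapses. Returns 0 on success, -1 on timeout. The expiry
 * flag is sampled before the condition, so the condition is always
 * checked at least once, even with a zero timeout.
 */
static int poll_stop_ring(struct fake_ring *ring, int timeout_ms)
{
	int64_t deadline;

	assert(timeout_ms >= 0);
	deadline = now_ns() + (int64_t)timeout_ms * 1000000LL;

	for (;;) {
		bool expired = now_ns() >= deadline;

		if (fake_stop_ring(ring))
			return 0;
		if (expired)
			return -1;
	}
}
```

In this sketch, a ring whose head sticks after a few attempts makes poll_stop_ring() return 0 well before the deadline, while a ring that never sticks spins for the full timeout and returns -1, which corresponds to the path where the patch logs "failed to set %s head to zero".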