If the request->wa_tail is 0 (because it landed exactly on the end of
the ringbuffer), when we reconstruct request->tail following a reset we
fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
never able to catch up with RING_TAIL and the GPU spins endlessly. If
the ring contains a couple of breadcrumbs, even our hangcheck is unable
to catch the busy-looping as the ACTHD and seqno continually advance.
Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU
hang")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: <[email protected]> # v4.10+
---
drivers/gpu/drm/i915/intel_lrc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3fdabba0a32d..c8dd848d2ebe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs
*engine,
/* Reset WaIdleLiteRestore:bdw,skl as well */
request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
+ request->tail &= request->ring->size - 1;
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
}
--
2.11.0
_______________________________________________
Intel-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/intel-gfx