Chris Wilson <[email protected]> writes:

> If the request->wa_tail is 0 (because it landed exactly on the end of
> the ringbuffer), when we reconstruct request->tail following a reset we
> fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> never able to catch up with RING_TAIL and the GPU spins endlessly. If
> the ring contains a couple of breadcrumbs, even our hangcheck is unable
> to catch the busy-looping as the ACTHD and seqno continually advance.

Tail is past ring size (on hw) and the ring contents has seqno writes.
So we will replay the ring contents over and over and seqno advances
and wraps back to the first breadcrumbs in ring?

> Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after 
> GPU hang")
> Signed-off-by: Chris Wilson <[email protected]>
> Cc: Mika Kuoppala <[email protected]>
> Cc: <[email protected]> # v4.10+
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 3fdabba0a32d..c8dd848d2ebe 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs 
> *engine,
>  
>       /* Reset WaIdleLiteRestore:bdw,skl as well */
>       request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
> +     request->tail &= request->ring->size - 1;
>       GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
>  }
>  
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to