Re: [Intel-gfx] [PATCH] drm/i915: Clear lost context-switch interrupts across reset

2017-08-09 Thread Chris Wilson
Quoting Dong, Chuanxiao (2017-08-09 01:26:00)
> > -Original Message-
> > From: Thierry, Michel
> > Sent: Tuesday, August 8, 2017 1:40 AM
> > To: Chris Wilson ; intel-gfx@lists.freedesktop.org
> > Cc: Dong, Chuanxiao ; Ursulin, Tvrtko
> > ; Winiarski, Michal 
> > Subject: Re: [PATCH] drm/i915: Clear lost context-switch interrupts across
> > reset
> > 
> > On 8/7/2017 5:19 AM, Chris Wilson wrote:
> > > During a global reset, we disable the irq. As we disable the irq, the
> > > hardware may be raising a GT interrupt that we then ignore, leaving it
> > > pending in the GTIIR. After the reset, we then re-enable the irq,
> > > triggering the pending interrupt. However, that interrupt was for the
> > > stale state from before the reset, and the contents of the CSB buffer
> > > are now invalid.
> > >
> > > Reported-by: "Dong, Chuanxiao" 
> > > Signed-off-by: Chris Wilson 
> > > Cc: "Dong, Chuanxiao" 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Michal Winiarski 
> > > Cc: Michel Thierry 
> > > ---
> > >   drivers/gpu/drm/i915/intel_lrc.c | 17 -
> > >   1 file changed, 16 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > > b/drivers/gpu/drm/i915/intel_lrc.c
> > > index b0738d2b2a7f..bc61948e2601 100644
> > > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > > @@ -1221,6 +1221,14 @@ static int intel_init_workaround_bb(struct
> > intel_engine_cs *engine)
> > > return ret;
> > >   }
> > >
> > > +static u8 gtiir[] = {
> > > +   [RCS] = 0,
> > > +   [BCS] = 0,
> > > +   [VCS] = 1,
> > > +   [VCS2] = 1,
> > > +   [VECS] = 3,
> > > +};
> > > +
> > >   static int gen8_init_common_ring(struct intel_engine_cs *engine)
> > >   {
> > > struct drm_i915_private *dev_priv = engine->i915; @@ -1245,9
> > > +1253,16 @@ static int gen8_init_common_ring(struct intel_engine_cs
> > > *engine)
> > >
> > > DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
> > >
> > > -   /* After a GPU reset, we may have requests to replay */
> > > +   GEM_BUG_ON(engine->id >= ARRAY_SIZE(gtiir));
> > > +
> > > +   /* Clear any pending interrupt state */
> > > +   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
> > > +  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
> > > +   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
> > > +  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
> > 
> > Clear twice? Or the second was supposed to be user_interrupt?

I never know which of the IIR are double buffered, so in case they
decide to double buffer GTIIR, I cleared it twice.

We do want to service the user_interrupt even after the hang in case we
don't notify it during reset processing. Might be worth kicking the
engine just in case, but we have contingency plans.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Clear lost context-switch interrupts across reset

2017-08-08 Thread Dong, Chuanxiao
> -Original Message-
> From: Thierry, Michel
> Sent: Tuesday, August 8, 2017 1:40 AM
> To: Chris Wilson ; intel-gfx@lists.freedesktop.org
> Cc: Dong, Chuanxiao ; Ursulin, Tvrtko
> ; Winiarski, Michal 
> Subject: Re: [PATCH] drm/i915: Clear lost context-switch interrupts across
> reset
> 
> On 8/7/2017 5:19 AM, Chris Wilson wrote:
> > During a global reset, we disable the irq. As we disable the irq, the
> > hardware may be raising a GT interrupt that we then ignore, leaving it
> > pending in the GTIIR. After the reset, we then re-enable the irq,
> > triggering the pending interrupt. However, that interrupt was for the
> > stale state from before the reset, and the contents of the CSB buffer
> > are now invalid.
> >
> > Reported-by: "Dong, Chuanxiao" 
> > Signed-off-by: Chris Wilson 
> > Cc: "Dong, Chuanxiao" 
> > Cc: Tvrtko Ursulin 
> > Cc: Michal Winiarski 
> > Cc: Michel Thierry 
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 17 -
> >   1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index b0738d2b2a7f..bc61948e2601 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1221,6 +1221,14 @@ static int intel_init_workaround_bb(struct
> intel_engine_cs *engine)
> > return ret;
> >   }
> >
> > +static u8 gtiir[] = {
> > +   [RCS] = 0,
> > +   [BCS] = 0,
> > +   [VCS] = 1,
> > +   [VCS2] = 1,
> > +   [VECS] = 3,
> > +};
> > +
> >   static int gen8_init_common_ring(struct intel_engine_cs *engine)
> >   {
> > struct drm_i915_private *dev_priv = engine->i915; @@ -1245,9
> > +1253,16 @@ static int gen8_init_common_ring(struct intel_engine_cs
> > *engine)
> >
> > DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
> >
> > -   /* After a GPU reset, we may have requests to replay */
> > +   GEM_BUG_ON(engine->id >= ARRAY_SIZE(gtiir));
> > +
> > +   /* Clear any pending interrupt state */
> > +   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
> > +  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
> > +   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
> > +  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
> 
> Clear twice? Or the second was supposed to be user_interrupt?

To resolve the CSB tail/head trouble, clearing context switching pending 
interrupt only should be enough, but seems it is not necessary to keep the 
pending user interrupt as well so we can clear user interrupt as well.
Chris, what do you think? And thank you for bringing the fix for this issue. :)

Thanks
Chuanxiao

> 
> > clear_bit(ENGINE_IRQ_EXECLIST, >irq_posted);
> >
> > +   /* After a GPU reset, we may have requests to replay */
> > submit = false;
> > for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
> > if (!port_isset([n]))
> >
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Clear lost context-switch interrupts across reset

2017-08-07 Thread Michel Thierry

On 8/7/2017 5:19 AM, Chris Wilson wrote:

During a global reset, we disable the irq. As we disable the irq, the
hardware may be raising a GT interrupt that we then ignore, leaving it
pending in the GTIIR. After the reset, we then re-enable the irq,
triggering the pending interrupt. However, that interrupt was for the
stale state from before the reset, and the contents of the CSB buffer
are now invalid.

Reported-by: "Dong, Chuanxiao" 
Signed-off-by: Chris Wilson 
Cc: "Dong, Chuanxiao" 
Cc: Tvrtko Ursulin 
Cc: Michal Winiarski 
Cc: Michel Thierry 
---
  drivers/gpu/drm/i915/intel_lrc.c | 17 -
  1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b0738d2b2a7f..bc61948e2601 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1221,6 +1221,14 @@ static int intel_init_workaround_bb(struct 
intel_engine_cs *engine)
return ret;
  }
  
+static u8 gtiir[] = {

+   [RCS] = 0,
+   [BCS] = 0,
+   [VCS] = 1,
+   [VCS2] = 1,
+   [VECS] = 3,
+};
+
  static int gen8_init_common_ring(struct intel_engine_cs *engine)
  {
struct drm_i915_private *dev_priv = engine->i915;
@@ -1245,9 +1253,16 @@ static int gen8_init_common_ring(struct intel_engine_cs 
*engine)
  
  	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", engine->name);
  
-	/* After a GPU reset, we may have requests to replay */

+   GEM_BUG_ON(engine->id >= ARRAY_SIZE(gtiir));
+
+   /* Clear any pending interrupt state */
+   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
+  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);
+   I915_WRITE(GEN8_GT_IIR(gtiir[engine->id]),
+  GT_CONTEXT_SWITCH_INTERRUPT << engine->irq_shift);


Clear twice? Or the second was supposed to be user_interrupt?


clear_bit(ENGINE_IRQ_EXECLIST, >irq_posted);
  
+	/* After a GPU reset, we may have requests to replay */

submit = false;
for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
if (!port_isset([n]))


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx