Re: [Intel-gfx] [PATCH 2/2] drm/i915: Postpone fake breadcrumb interrupt if a real interrupt is pending

2017-02-16 Thread Chris Wilson
On Thu, Feb 16, 2017 at 11:17:11AM +, Tvrtko Ursulin wrote:
> 
> On 16/02/2017 10:47, Chris Wilson wrote:
> >On Thu, Feb 16, 2017 at 10:38:08AM +, Tvrtko Ursulin wrote:
> >>
> >>On 16/02/2017 09:29, Chris Wilson wrote:
> >>>If the timer expires for enabling the fake interrupt, check to see
> >>>if there is a real interrupt queued before making the decision to start
> >>>polling. This helps in situations where we have a very slow irq-seqno
> >>>barrier that may accrue more breadcrumb interrupts before it is able to
> >>>catch up. It still leaves a hole for the timer to expire as we are
> >>>processing the last irq-seqno barrier, but it appears to help reduce the
> >>>frequency of "missed-interrupts" on Ironlake, at least.
> >>>
> >>>References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
> >>>Signed-off-by: Chris Wilson 
> >>>Cc: Tvrtko Ursulin 
> >>>---
> >>>drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +
> >>>1 file changed, 5 insertions(+)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> >>>b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >>>index ef3adfd37d7d..21269421bd2a 100644
> >>>--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >>>+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >>>@@ -41,6 +41,11 @@ static void intel_breadcrumbs_hangcheck(unsigned long 
> >>>data)
> >>>   return;
> >>>   }
> >>>
> >>>+  if (test_bit(ENGINE_IRQ_BREADCRUMB, >irq_posted)) {
> >>>+  mod_timer(>hangcheck, jiffies + 1);
> >>>+  return;
> >>>+  }
> >>>+
> >>>   DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
> >>>   set_bit(engine->id, >i915->gpu_error.missed_irq_rings);
> >>>   mod_timer(>breadcrumbs.fake_irq, jiffies + 1);
> >>>
> >>
> >>Another hmm :), barriers are so much shorter than the hangcheck
> >>interval (1500ms) so I don't quite understand what is the problem.
> >
> >We may queue the wait many, many seconds before it is even being
> >processed. The point of this timer is to detect when we haven't seen an
> >interrupt for some time and that happens to be the waiter missed (backup
> >for seqno-barrier failing).
> 
> But it is the same as a slow batch which completes just as the
> hangcheck timer fires.

Yes.

> Is the problem is seqno barrier on some platforms can slow down the
> signaller thread a lot?

Yes. Waiting for the seqno to become visible appears unbounded...

> Hm, if the request duration is right it
> would sleep on every invocation. So in effect extend the request
> duration as seen form userspace by the barrier duration. Why would
> that be a problem though? Haven't fully figured it out.

It's only a problem if we get interrupts in that case and no seqno
advance. But as we are getting the interrupts, we are at least checking
the seqno and doing a timer wakeup is not going to progress the waiter.

The premise behind the timer is that we will wakeup after an interrupt
and not see the seqno advance. (Or even worse, never see an interrupt!)
Only when the interrupts stop, and the waiter stops checking, do we need
to intervene.

I've rewritten this to explicitly check for stopped interrupts,
hopefully that makes this backup strategy clear.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Postpone fake breadcrumb interrupt if a real interrupt is pending

2017-02-16 Thread Tvrtko Ursulin


On 16/02/2017 10:47, Chris Wilson wrote:

On Thu, Feb 16, 2017 at 10:38:08AM +, Tvrtko Ursulin wrote:


On 16/02/2017 09:29, Chris Wilson wrote:

If the timer expires for enabling the fake interrupt, check to see
if there is a real interrupt queued before making the decision to start
polling. This helps in situations where we have a very slow irq-seqno
barrier that may accrue more breadcrumb interrupts before it is able to
catch up. It still leaves a hole for the timer to expire as we are
processing the last irq-seqno barrier, but it appears to help reduce the
frequency of "missed-interrupts" on Ironlake, at least.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ef3adfd37d7d..21269421bd2a 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -41,6 +41,11 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
return;
}

+   if (test_bit(ENGINE_IRQ_BREADCRUMB, >irq_posted)) {
+   mod_timer(>hangcheck, jiffies + 1);
+   return;
+   }
+
DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
set_bit(engine->id, >i915->gpu_error.missed_irq_rings);
mod_timer(>breadcrumbs.fake_irq, jiffies + 1);



Another hmm :), barriers are so much shorter than the hangcheck
interval (1500ms) so I don't quite understand what is the problem.


We may queue the wait many, many seconds before it is even being
processed. The point of this timer is to detect when we haven't seen an
interrupt for some time and that happens to be the waiter missed (backup
for seqno-barrier failing).


But it is the same as a slow batch which completes just as the hangcheck 
timer fires.


Is the problem is seqno barrier on some platforms can slow down the 
signaller thread a lot? Hm, if the request duration is right it would 
sleep on every invocation. So in effect extend the request duration as 
seen form userspace by the barrier duration. Why would that be a problem 
though? Haven't fully figured it out.



Should we instead put a mod_timer when raising the user irq?


Frequency, mod_timer may require reprogramming of the apic and often
does when profiling ;)


Agreed on this one so no complaints, just trying to understand the 
precise problem.


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Postpone fake breadcrumb interrupt if a real interrupt is pending

2017-02-16 Thread Chris Wilson
On Thu, Feb 16, 2017 at 10:38:08AM +, Tvrtko Ursulin wrote:
> 
> On 16/02/2017 09:29, Chris Wilson wrote:
> >If the timer expires for enabling the fake interrupt, check to see
> >if there is a real interrupt queued before making the decision to start
> >polling. This helps in situations where we have a very slow irq-seqno
> >barrier that may accrue more breadcrumb interrupts before it is able to
> >catch up. It still leaves a hole for the timer to expire as we are
> >processing the last irq-seqno barrier, but it appears to help reduce the
> >frequency of "missed-interrupts" on Ironlake, at least.
> >
> >References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
> >Signed-off-by: Chris Wilson 
> >Cc: Tvrtko Ursulin 
> >---
> > drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +
> > 1 file changed, 5 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> >b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >index ef3adfd37d7d..21269421bd2a 100644
> >--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >@@ -41,6 +41,11 @@ static void intel_breadcrumbs_hangcheck(unsigned long 
> >data)
> > return;
> > }
> >
> >+if (test_bit(ENGINE_IRQ_BREADCRUMB, >irq_posted)) {
> >+mod_timer(>hangcheck, jiffies + 1);
> >+return;
> >+}
> >+
> > DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
> > set_bit(engine->id, >i915->gpu_error.missed_irq_rings);
> > mod_timer(>breadcrumbs.fake_irq, jiffies + 1);
> >
> 
> Another hmm :), barriers are so much shorter than the hangcheck
> interval (1500ms) so I don't quite understand what is the problem.

We may queue the wait many, many seconds before it is even being
processed. The point of this timer is to detect when we haven't seen an
interrupt for some time and that happens to be the waiter missed (backup
for seqno-barrier failing).
 
> Should we instead put a mod_timer when raising the user irq?

Frequency, mod_timer may require reprogramming of the apic and often
does when profiling ;)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Postpone fake breadcrumb interrupt if a real interrupt is pending

2017-02-16 Thread Tvrtko Ursulin


On 16/02/2017 09:29, Chris Wilson wrote:

If the timer expires for enabling the fake interrupt, check to see
if there is a real interrupt queued before making the decision to start
polling. This helps in situations where we have a very slow irq-seqno
barrier that may accrue more breadcrumb interrupts before it is able to
catch up. It still leaves a hole for the timer to expire as we are
processing the last irq-seqno barrier, but it appears to help reduce the
frequency of "missed-interrupts" on Ironlake, at least.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ef3adfd37d7d..21269421bd2a 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -41,6 +41,11 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
return;
}

+   if (test_bit(ENGINE_IRQ_BREADCRUMB, >irq_posted)) {
+   mod_timer(>hangcheck, jiffies + 1);
+   return;
+   }
+
DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
set_bit(engine->id, >i915->gpu_error.missed_irq_rings);
mod_timer(>breadcrumbs.fake_irq, jiffies + 1);



Another hmm :), barriers are so much shorter than the hangcheck interval 
(1500ms) so I don't quite understand what is the problem.


Should we instead put a mod_timer when raising the user irq?

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx