On 16/02/2017 10:47, Chris Wilson wrote:
On Thu, Feb 16, 2017 at 10:38:08AM +0000, Tvrtko Ursulin wrote:

On 16/02/2017 09:29, Chris Wilson wrote:
If the timer expires for enabling the fake interrupt, check to see
if there is a real interrupt queued before making the decision to start
polling. This helps in situations where we have a very slow irq-seqno
barrier that may accrue more breadcrumb interrupts before it is able to
catch up. It still leaves a hole for the timer to expire as we are
processing the last irq-seqno barrier, but it appears to help reduce the
frequency of "missed-interrupts" on Ironlake, at least.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
drivers/gpu/drm/i915/intel_breadcrumbs.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
index ef3adfd37d7d..21269421bd2a 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -41,6 +41,11 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)

+       if (test_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
+               mod_timer(&b->hangcheck, jiffies + 1);
+               return;
+       }
        DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
        set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
        mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);

Another hmm :), barriers are so much shorter than the hangcheck
interval (1500ms) that I don't quite understand what the problem is.

We may queue the wait many, many seconds before it is even being
processed. The point of this timer is to detect when we haven't seen an
interrupt for some time and that happens to be the waiter missed (backup
for seqno-barrier failing).
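
For readers following the thread, the decision the patch adds can be modelled outside the kernel. This is only a rough sketch under stated assumptions: `struct toy_engine`, `toy_hangcheck()` and the `TOY_*` values are hypothetical stand-ins, not i915 code, and any checks in the real intel_breadcrumbs_hangcheck() that fall outside the quoted hunk are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the engine state; these are not the i915 structures. */
struct toy_engine {
	bool irq_posted;                 /* models ENGINE_IRQ_BREADCRUMB set in irq_posted */
	bool missed_irq;                 /* models the engine's bit in missed_irq_rings */
	bool fake_irq_armed;             /* models mod_timer() on the fake_irq timer */
	unsigned long hangcheck_expires; /* models the hangcheck timer expiry, in jiffies */
};

enum toy_action { TOY_RECHECK, TOY_START_POLLING };

/* Models only the branch added by the patch plus the fallback that follows it. */
static enum toy_action toy_hangcheck(struct toy_engine *e, unsigned long now)
{
	assert(e != NULL);

	if (e->irq_posted) {
		/* A real interrupt is queued but not yet processed: look again one jiffy later. */
		e->hangcheck_expires = now + 1;
		return TOY_RECHECK;
	}

	/* Nothing pending: record the missed interrupt and fall back to polling. */
	e->missed_irq = true;
	e->fake_irq_armed = true;
	return TOY_START_POLLING;
}
```

In this model an engine with `irq_posted` set is only rechecked on the next jiffy, while an engine with nothing pending is flagged as missed and switched to the fake interrupt.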

But it is the same as a slow batch which completes just as the hangcheck timer fires.

Is the problem that the seqno barrier on some platforms can slow down the signaller thread a lot? Hm, if the request duration is right it would sleep on every invocation, so in effect it would extend the request duration as seen from userspace by the barrier duration. Why would that be a problem though? I haven't fully figured it out.

Should we instead put a mod_timer when raising the user irq?

Frequency, mod_timer may require reprogramming of the apic and often
does when profiling ;)

Agreed on this one so no complaints, just trying to understand the precise problem.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com>

