On Fri, Feb 17, 2017 at 10:48:50AM +0000, Tvrtko Ursulin wrote:
> 
> On 17/02/2017 10:18, Chris Wilson wrote:
> >If the waiter was currently running, assume it hasn't had a chance
> >to process the pending interrupt (e.g., low priority task on a loaded
> >system) and wait until it sleeps before declaring a missed interrupt.
> >
> >References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
> >Signed-off-by: Chris Wilson <[email protected]>
> >Cc: Tvrtko Ursulin <[email protected]>
> >Cc: Mika Kuoppala <[email protected]>
> >---
> > drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >index 4395b177493e..2ad29fb77b2d 100644
> >--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >@@ -45,6 +45,15 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
> >             return;
> >     }
> >
> >+    /* If the waiter was currently running, assume it hasn't had a chance
> >+     * to process the pending interrupt (e.g., low priority task on a loaded
> >+     * system) and wait until it sleeps before declaring a missed interrupt.
> >+     */
> >+    if (!intel_engine_wakeup(engine)) {
> >+            mod_timer(&b->hangcheck, wait_timeout());
> >+            return;
> >+    }
> >+
> >     DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
> >     set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
> >     mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
> >
> 
> The change here is that we would never declare a GPU hang if userspace
> just waited indefinitely, or in other words, with this patch we
> would rely on userspace timing out on its waits in order to
> declare a hang.

Surely you mean the other way around? The only way we now get to declare a
missed interrupt and then queue a hangcheck here is if userspace sleeps.
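
To illustrate, here is a rough userspace model of the branch the patch adds.
It is only a sketch, not driver code: model_wakeup(), model_hangcheck(),
first_missed_irq() and the enum are hypothetical names, and it assumes that
intel_engine_wakeup() returns true only when it actually woke a sleeping
waiter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one timer expiry; not a type from the i915 driver. */
enum hangcheck_action {
	HANGCHECK_REARM,      /* waiter was running: re-arm and check again later */
	HANGCHECK_MISSED_IRQ, /* waiter was asleep: declare a missed interrupt */
};

/*
 * Stand-in for intel_engine_wakeup(). Assumption: it returns true only when
 * it actually woke a sleeping waiter, and false when the waiter was running.
 */
static bool model_wakeup(bool waiter_sleeping)
{
	return waiter_sleeping;
}

/* Models only the branch added by the patch, for a single timer expiry. */
static enum hangcheck_action model_hangcheck(bool waiter_sleeping)
{
	if (!model_wakeup(waiter_sleeping))
		return HANGCHECK_REARM;
	return HANGCHECK_MISSED_IRQ;
}

/*
 * Replays n successive expiries, where sleeping[i] says whether the waiter
 * was asleep at expiry i. Returns the index of the first expiry that declares
 * a missed interrupt, or -1 if the timer is re-armed every time.
 */
static int first_missed_irq(const bool *sleeping, size_t n)
{
	assert(sleeping != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (model_hangcheck(sleeping[i]) == HANGCHECK_MISSED_IRQ)
			return (int)i;
	}
	return -1;
}
```

In this model a waiter that is running at every expiry never gets past the
re-arm, and the missed interrupt is declared at the first expiry that finds
the waiter asleep.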

> Hm, in fact even with the current code, if the userspace keeps
> exiting and re-entering the wait we would be re-arming the hangcheck
> timer and so also never notice a GPU hang.

Correct. It is not the only way we arm the GPU hangcheck.
gem_busy/hang, gem_wait/busy-hang check that we do detect hangs even if
userspace never sleeps.
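
For reference, the re-arming effect on this one timer can be modelled with a
toy tick loop. This is a sketch under stated assumptions, not the driver's
logic: timer_ever_fires() and its parameters are hypothetical, and each
re-entry into the wait is assumed to push the deadline to now + timeout, as
mod_timer() does for a pending timer.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: a one-shot timer armed at tick 0 to expire after
 * `timeout` ticks, and pushed back to now + timeout every `rearm_interval`
 * ticks. Returns true if the timer expires within `duration` ticks.
 */
static bool timer_ever_fires(unsigned int rearm_interval, unsigned int timeout,
			     unsigned int duration)
{
	unsigned int expires = timeout;

	assert(rearm_interval > 0);
	for (unsigned int now = 1; now <= duration; now++) {
		if (now >= expires)
			return true; /* expired before being pushed back */
		if (now % rearm_interval == 0)
			expires = now + timeout; /* re-arm: deadline moves forward */
	}
	return false;
}
```

With a re-arm every 5 ticks and a timeout of 10 the deadline always stays
ahead of the clock and the timer never fires; with a re-arm every 20 ticks it
expires at tick 10.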
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre