Reduce the window of opportunity for set-wedged being called
concurrently with reset (after i915_reset() has performed the
i915_gem_unset_wedged()) by moving the set_bit(I915_WEDGED) to before we
complete the inflight requests. When i915_reset() is blocked on a
request, such completion may allow it to start and begin resetting
the GPU before i915_gem_set_wedged() has finished (and so before
set-wedged has marked the device as wedged). As such,
i915_gem_init_hw() may see a wedged device even from inside
i915_reset().

References: 36703e79a982 ("drm/i915: Break modeset deadlocks on reset")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
---
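Note (illustration only, not part of the change): the ordering argument
above can be sketched as a small userspace model. Everything named
model_* below is hypothetical and does not exist in the driver. A C11
seq_cst store is assumed as a rough stand-in for set_bit() followed by
smp_mb__after_atomic(), and the "waiter" is called inline at the point
where a reset blocked on a request could first run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant driver state; not the i915 types. */
struct model_dev {
	atomic_bool wedged;            /* models the I915_WEDGED bit */
	atomic_bool requests_complete; /* models inflight requests being completed */
	bool wedged_seen_by_waiter;    /* what a waiter woken by completion observed */
};

static void model_init(struct model_dev *dev)
{
	assert(dev);
	atomic_init(&dev->wedged, false);
	atomic_init(&dev->requests_complete, false);
	dev->wedged_seen_by_waiter = false;
}

/*
 * Models a reset blocked on a request: it may proceed as soon as the
 * request completes, so sample the wedged flag at that moment.
 */
static void model_waiter_wakes(struct model_dev *dev)
{
	if (atomic_load(&dev->requests_complete))
		dev->wedged_seen_by_waiter = atomic_load(&dev->wedged);
}

/* Ordering before the patch: complete requests first, mark wedged last. */
static void model_set_wedged_old(struct model_dev *dev)
{
	atomic_store(&dev->requests_complete, true);
	model_waiter_wakes(dev); /* waiter can run before the flag is set */
	atomic_store(&dev->wedged, true);
}

/* Ordering after the patch: mark wedged first, then complete requests. */
static void model_set_wedged_new(struct model_dev *dev)
{
	/* Assumed analogue of set_bit() + smp_mb__after_atomic(). */
	atomic_store(&dev->wedged, true);
	atomic_store(&dev->requests_complete, true);
	model_waiter_wakes(dev); /* waiter now observes the wedged flag */
}
```

In this model a waiter released by request completion observes the
wedged flag only with the new ordering. It says nothing about the
remaining window the commit message refers to, which is merely reduced.
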
 drivers/gpu/drm/i915/i915_gem.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c1b80cd52f9e..06f0456699af 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3205,6 +3205,9 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
                        intel_engine_dump(engine, &p, "%s\n", engine->name);
        }
 
+       set_bit(I915_WEDGED, &i915->gpu_error.flags);
+       smp_mb__after_atomic();
+
        /*
         * First, stop submission to hw, but do not yet complete requests by
         * rolling the global seqno forward (since this would complete requests
@@ -3241,7 +3244,8 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
        for_each_engine(engine, i915, id) {
                unsigned long flags;
 
-               /* Mark all pending requests as complete so that any concurrent
+               /*
+                * Mark all pending requests as complete so that any concurrent
                 * (lockless) lookup doesn't try and wait upon the request as we
                 * reset it.
                 */
@@ -3251,7 +3255,6 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
                spin_unlock_irqrestore(&engine->timeline->lock, flags);
        }
 
-       set_bit(I915_WEDGED, &i915->gpu_error.flags);
        wake_up_all(&i915->gpu_error.reset_queue);
 }
 
-- 
2.16.1
