To make wedging even more likely, we use a new "no recovery" context
parameter that tells the kernel to not even attempt to replay any
batches in flight against the default context image, as experience shows
the HW is not always robust enough to cope with the conflicting state.
This allows us to always take over responsibility of rebuilding the lost
context following a GPU hang.

Cc: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 1248f8b9fa4..a0cbc315b60 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -1589,6 +1589,28 @@ init_cache_buckets(struct brw_bufmgr *bufmgr)
    }
 }
 
+static void init_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id)
+{
+   /*
+    * Upon declaring a GPU hang, the kernel will zap the guilty context
+    * back to the default logical HW state and attempt to continue on to
+    * our next submitted batchbuffer. However, we only send incremental
+    * logical state (e.g. we only ever setup invariant register state
+    * once in brw_initial_gpu_upload()) and so attempting to reply the
+    * next batchbuffer without the correct logical state can be fatal.
+    * Here we tell the kernel not to attempt to recover our context but
+    * immediately (on the next batchbuffer submission) report that the
+    * context is lost, and we will do the recovery ourselves -- 2 lost
+    * batches instead of a continual stream until we are banned, or the
+    * machine is dead.
+    */
+   struct drm_i915_gem_context_param p = {
+      .ctx_id = ctx_id,
+      .param = I915_CONTEXT_PARAM_RECOVERABLE,
+   };
+   drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
+}
+
 uint32_t
 brw_create_hw_context(struct brw_bufmgr *bufmgr)
 {
@@ -1599,6 +1621,8 @@ brw_create_hw_context(struct brw_bufmgr *bufmgr)
       return 0;
    }
 
+   init_context(bufmgr, create.ctx_id);
+
    return create.ctx_id;
 }
 
-- 
2.20.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to