Quoting Chris Wilson (2018-03-31 12:00:16)
> Quoting Kenneth Graunke (2018-03-30 19:20:57)
> > On Friday, March 30, 2018 7:40:13 AM PDT Chris Wilson wrote:
> > > For i915, we are proposing to use a quality-of-service parameter in
> > > addition to that of just a priority that usurps everyone. Due to our HW,
> > > preemption may not be immediate and will be forced to wait until an
> > > uncooperative process hits an arbitration point. To prevent that unduly
> > > impacting the privileged RealTime context, we back up the preemption
> > > request with a timeout to reset the GPU and forcibly evict the GPU hog
> > > in order to execute the new context.
> > 
> > I am strongly against exposing this in general.  Performing a GPU reset
> > in the middle of a batch can completely screw up whatever application
> > was running.  If the application is using robustness extensions, we may
> > be forced to return GL_DEVICE_LOST, causing the application to have to
> > recreate their entire GL context and start over.  If not, we may try to
> > let them limp on(*) - and hope they didn't get too badly damaged by some
> > of their commands not executing, or executing twice (if the kernel tries
> > to resubmit it).  But it may very well cause the app to misrender, or
> > even crash.
> 
> Yes, I think the revulsion has been universal. However, as a
> quality-of-service guarantee, I can understand the appeal. The
> difference is that instead of allowing a DoS for 6s or so as we
> currently allow, we allow that to be specified by the context. As it
> does allow one context to impact another, I want it locked down to
> privileged processes. I have been using CAP_SYS_ADMIN as the potential
> to do harm is even greater than exploiting the weak scheduler by
> changing priority.

Also, to add further insult to injury, we might want to force GPU clocks
to max for the RT context (so that the context starts executing at max
rather than waiting for the system to upclock on load). Something like:

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index b080c4c58f1..461b76b64c9 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -1370,6 +1370,36 @@ brw_hw_context_set_preempt_timeout(struct brw_bufmgr *bufmgr,
    return err;
 }
 
+int
+brw_hw_context_force_maximum_frequency(struct brw_bufmgr *bufmgr,
+                                      uint32_t ctx_id)
+{
+#define I915_CONTEXT_PARAM_FREQUENCY    0x8
+#define   I915_CONTEXT_MIN_FREQUENCY(x) ((x) & 0xffffffff)
+#define   I915_CONTEXT_MAX_FREQUENCY(x) ((x) >> 32)
+#define   I915_CONTEXT_SET_FREQUENCY(min, max) ((uint64_t)(max) << 32 | (min))
+
+   struct drm_i915_gem_context_param p = {
+      .ctx_id = ctx_id,
+      .param = I915_CONTEXT_PARAM_FREQUENCY,
+   };
+
+   /* First find the HW limits */
+   if (drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p))
+      return -errno;
+
+   /* Then specify that the context's minimum frequency is the HW max,
+    * forcing the context to only run at the maximum frequency, as
+    * restricted by the global user limits.
+    */
+   p.value = I915_CONTEXT_SET_FREQUENCY(I915_CONTEXT_MAX_FREQUENCY(p.value),
+                                       I915_CONTEXT_MAX_FREQUENCY(p.value));
+   if (drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p))
+      return -errno;
+
+   return 0;
+}
+
 void
 brw_destroy_hw_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index a493b7018af..07dc9ced57a 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -320,6 +320,9 @@ int brw_hw_context_set_preempt_timeout(struct brw_bufmgr *bufmgr,
                                       uint32_t ctx_id,
                                       uint64_t timeout_ns);
 
+int brw_hw_context_force_maximum_frequency(struct brw_bufmgr *bufmgr,
+                                           uint32_t ctx_id);
+
 void brw_destroy_hw_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id);
 
 int brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd);
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 9b84a29d4a2..0bd965043c5 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1026,13 +1026,17 @@ brwCreateContext(gl_api api,
          intelDestroyContext(driContextPriv);
          return false;
       }
-      if (hw_priority >= GEN_CONTEXT_REALTIME_PRIORITY &&
-          brw_hw_context_set_preempt_timeout(brw->bufmgr, brw->hw_ctx,
-                                            8 * 1000 * 1000 /* 8ms */)) {
-         fprintf(stderr,
-                "Failed to set preempt timeout for RT hardware context.\n");
-         intelDestroyContext(driContextPriv);
-         return false;
+
+      if (hw_priority >= GEN_CONTEXT_REALTIME_PRIORITY) {
+         if (brw_hw_context_set_preempt_timeout(brw->bufmgr, brw->hw_ctx,
+                                                8 * 1000 * 1000 /* 8ms */)) {
+            fprintf(stderr,
+                    "Failed to set preempt timeout for RT hardware context.\n");
+            intelDestroyContext(driContextPriv);
+            return false;
+         }
+
+         brw_hw_context_force_maximum_frequency(brw->bufmgr, brw->hw_ctx);
       }
    }
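
To make the bit-packing in the frequency parameter easier to check, here is a
minimal standalone sketch of the same arithmetic as the macros in the patch:
minimum frequency in the low 32 bits, maximum in the high 32 bits. The helper
names (freq_min, freq_max, freq_pack, freq_pin_to_max) are hypothetical and
exist only for illustration; they are not part of the patch or of any kernel
uAPI, and the min <= max assertion is my own sanity check, not a documented
requirement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the macros in the patch: the 64-bit
 * context-param value carries the minimum frequency in the low 32 bits
 * and the maximum frequency in the high 32 bits.
 */
static inline uint32_t freq_min(uint64_t value)
{
   return (uint32_t)(value & 0xffffffffu);
}

static inline uint32_t freq_max(uint64_t value)
{
   return (uint32_t)(value >> 32);
}

static inline uint64_t freq_pack(uint32_t min, uint32_t max)
{
   /* Assumption for this sketch only: a range with min > max is a bug. */
   assert(min <= max);
   return ((uint64_t)max << 32) | min;
}

/* Given the limits as read back by GETPARAM, build the value to hand to
 * SETPARAM so that min == max == the reported maximum.
 */
static inline uint64_t freq_pin_to_max(uint64_t limits)
{
   uint32_t max = freq_max(limits);

   return freq_pack(max, max);
}
```

With arbitrary sample limits of 300 and 1150, freq_pin_to_max() yields a value
whose low and high halves are both 1150, which is what the SETPARAM call in
the patch is meant to pass down.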

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
