Re: [Intel-gfx] [RFC 17/44] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2014-07-23 Thread Daniel Vetter
On Wed, Jul 23, 2014 at 05:33:42PM +0100, John Harrison wrote:
 
 On 07/07/2014 20:21, Daniel Vetter wrote:
 On Wed, Jul 02, 2014 at 11:34:23AM -0700, Jesse Barnes wrote:
 On Thu, 26 Jun 2014 18:24:08 +0100
 john.c.harri...@intel.com wrote:
 
 From: John Harrison john.c.harri...@intel.com
 
 The scheduler decouples the submission of batch buffers to the driver with 
 their
 submission to the hardware. This basically means splitting the execbuffer()
 function in half. This change rearranges some code ready for the split to 
 occur.
 ---
   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   23 ---
   1 file changed, 16 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
 b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 index ec274ef..fda9187 100644
 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 @@ -32,6 +32,7 @@
   #include i915_trace.h
   #include intel_drv.h
   #include linux/dma_remapping.h
 +#include i915_scheduler.h
   #define  __EXEC_OBJECT_HAS_PIN (131)
   #define  __EXEC_OBJECT_HAS_FENCE (130)
 @@ -874,10 +875,7 @@ i915_gem_execbuffer_move_to_gpu(struct 
 intel_engine_cs *ring,
if (flush_domains  I915_GEM_DOMAIN_GTT)
wmb();
 -  /* Unconditionally invalidate gpu caches and ensure that we do flush
 -   * any residual writes from the previous batch.
 -   */
 -  return intel_ring_invalidate_all_caches(ring);
 +  return 0;
   }
   static bool
 @@ -1219,8 +1217,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
}
}
 -  intel_runtime_pm_get(dev_priv);
 -
ret = i915_mutex_lock_interruptible(dev);
if (ret)
goto pre_mutex_err;
 @@ -1331,6 +1327,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
if (ret)
goto err;
 +  i915_gem_execbuffer_move_to_active(eb-vmas, ring);
 +
 +  /* To be split into two functions here... */
 +
 +  intel_runtime_pm_get(dev_priv);
 +
 +  /* Unconditionally invalidate gpu caches and ensure that we do flush
 +   * any residual writes from the previous batch.
 +   */
 +  ret = intel_ring_invalidate_all_caches(ring);
 +  if (ret)
 +  goto err;
 +
 +  /* Switch to the correct context for the batch */
ret = i915_switch_context(ring, ctx);
if (ret)
goto err;
 @@ -1381,7 +1391,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 -  i915_gem_execbuffer_move_to_active(eb-vmas, ring);
i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
   err:
 I'd like Chris to take a look too, but it looks safe afaict.
 
 Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org
 switch_context can fail with EINTR so we really can't move stuff to the
 active list before that point. Or we need to make sure that all the stuff
 between the old and new move_to_active callsite can't fail.
 
 Or we need to track this and tell userspace with an EIO and adjusted reset
 stats that something between our point of no return where the kernel
 committed to executing the batch failed.
 
 Or we need to unrol move_to_active (which is currently not really
 possible).
 -Daniel
 
 switch_context can fail with quite a lot of different error codes. Is there
 anything particularly special about EINTR? I can't spot that particular code
 path at the moment.
 
 The context switch is done at the point of submission to the hardware. As
 batch buffers can be re-ordered between submission to driver and submission
 to hardware, there is no point choosing a context any earlier. Whereas the
 move to active needs to be done at the point of submission to the driver.
 The object needs to be marked as in use even though the batch buffer that
 actually uses it might not be executed for some time. From the software
 viewpoint, the object is in use and all the syncrhonisation code needs to
 know that.
 
 The scheduler makes the batch buffer execution asynchronous to its
 submission to the driver. There is no way to communicate back a return code
 to user land. Instead, it is up to the scheduler to check the return codes
 from all the execution paths and to retry later if something fails for a
 temporary reason. Or to discard the buffer if it is truly toast.

EINTR is simply really easy to testhit since you can provoke it with
signals. And X uses signals excessively. One point where EINTR might
happen is in intel_ring_begin, the other when we try to pin the context
into ggtt. The other error codes are true exceptions and will much harder
to hit.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 17/44] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2014-07-07 Thread Daniel Vetter
On Wed, Jul 02, 2014 at 11:34:23AM -0700, Jesse Barnes wrote:
 On Thu, 26 Jun 2014 18:24:08 +0100
 john.c.harri...@intel.com wrote:
 
  From: John Harrison john.c.harri...@intel.com
  
  The scheduler decouples the submission of batch buffers to the driver with 
  their
  submission to the hardware. This basically means splitting the execbuffer()
  function in half. This change rearranges some code ready for the split to 
  occur.
  ---
   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   23 ---
   1 file changed, 16 insertions(+), 7 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
  b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
  index ec274ef..fda9187 100644
  --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
  +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
  @@ -32,6 +32,7 @@
   #include i915_trace.h
   #include intel_drv.h
   #include linux/dma_remapping.h
  +#include i915_scheduler.h
   
   #define  __EXEC_OBJECT_HAS_PIN (131)
   #define  __EXEC_OBJECT_HAS_FENCE (130)
  @@ -874,10 +875,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs 
  *ring,
  if (flush_domains  I915_GEM_DOMAIN_GTT)
  wmb();
   
  -   /* Unconditionally invalidate gpu caches and ensure that we do flush
  -* any residual writes from the previous batch.
  -*/
  -   return intel_ring_invalidate_all_caches(ring);
  +   return 0;
   }
   
   static bool
  @@ -1219,8 +1217,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
  *data,
  }
  }
   
  -   intel_runtime_pm_get(dev_priv);
  -
  ret = i915_mutex_lock_interruptible(dev);
  if (ret)
  goto pre_mutex_err;
  @@ -1331,6 +1327,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
  *data,
  if (ret)
  goto err;
   
  +   i915_gem_execbuffer_move_to_active(eb-vmas, ring);
  +
  +   /* To be split into two functions here... */
  +
  +   intel_runtime_pm_get(dev_priv);
  +
  +   /* Unconditionally invalidate gpu caches and ensure that we do flush
  +* any residual writes from the previous batch.
  +*/
  +   ret = intel_ring_invalidate_all_caches(ring);
  +   if (ret)
  +   goto err;
  +
  +   /* Switch to the correct context for the batch */
  ret = i915_switch_context(ring, ctx);
  if (ret)
  goto err;
  @@ -1381,7 +1391,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
  *data,
   
  trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
   
  -   i915_gem_execbuffer_move_to_active(eb-vmas, ring);
  i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
   
   err:
 
 I'd like Chris to take a look too, but it looks safe afaict.
 
 Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

switch_context can fail with EINTR so we really can't move stuff to the
active list before that point. Or we need to make sure that all the stuff
between the old and new move_to_active callsite can't fail.

Or we need to track this and tell userspace with an EIO and adjusted reset
stats that something between our point of no return where the kernel
committed to executing the batch failed.

Or we need to unrol move_to_active (which is currently not really
possible).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 17/44] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2014-07-02 Thread Jesse Barnes
On Thu, 26 Jun 2014 18:24:08 +0100
john.c.harri...@intel.com wrote:

 From: John Harrison john.c.harri...@intel.com
 
 The scheduler decouples the submission of batch buffers to the driver with 
 their
 submission to the hardware. This basically means splitting the execbuffer()
 function in half. This change rearranges some code ready for the split to 
 occur.
 ---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |   23 ---
  1 file changed, 16 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
 b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 index ec274ef..fda9187 100644
 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 @@ -32,6 +32,7 @@
  #include i915_trace.h
  #include intel_drv.h
  #include linux/dma_remapping.h
 +#include i915_scheduler.h
  
  #define  __EXEC_OBJECT_HAS_PIN (131)
  #define  __EXEC_OBJECT_HAS_FENCE (130)
 @@ -874,10 +875,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs 
 *ring,
   if (flush_domains  I915_GEM_DOMAIN_GTT)
   wmb();
  
 - /* Unconditionally invalidate gpu caches and ensure that we do flush
 -  * any residual writes from the previous batch.
 -  */
 - return intel_ring_invalidate_all_caches(ring);
 + return 0;
  }
  
  static bool
 @@ -1219,8 +1217,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
   }
   }
  
 - intel_runtime_pm_get(dev_priv);
 -
   ret = i915_mutex_lock_interruptible(dev);
   if (ret)
   goto pre_mutex_err;
 @@ -1331,6 +1327,20 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
   if (ret)
   goto err;
  
 + i915_gem_execbuffer_move_to_active(eb-vmas, ring);
 +
 + /* To be split into two functions here... */
 +
 + intel_runtime_pm_get(dev_priv);
 +
 + /* Unconditionally invalidate gpu caches and ensure that we do flush
 +  * any residual writes from the previous batch.
 +  */
 + ret = intel_ring_invalidate_all_caches(ring);
 + if (ret)
 + goto err;
 +
 + /* Switch to the correct context for the batch */
   ret = i915_switch_context(ring, ctx);
   if (ret)
   goto err;
 @@ -1381,7 +1391,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
 *data,
  
   trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
  
 - i915_gem_execbuffer_move_to_active(eb-vmas, ring);
   i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
  
  err:

I'd like Chris to take a look too, but it looks safe afaict.

Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx