Re: [Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful

2016-04-11 Thread Chris Wilson
On Mon, Apr 11, 2016 at 02:50:17PM +0100, Tvrtko Ursulin wrote:
> 
> On 09/04/16 10:27, Chris Wilson wrote:
> >Conceptually, each request is a record of a hardware transaction - we
> >build up a list of pending commands and then either commit them to
> >hardware, or cancel them. However, whilst building up the list of
> >pending commands, we may modify state outside of the request and make
> >references to the pending request. If we do so and then cancel that
> >request, external objects then point to the deleted request leading to
> >both graphical and memory corruption.
> >
> >The easiest example is to consider object/VMA tracking. When we mark an
> >object as active in a request, we store a pointer to this, the most
> >recent request, in the object. Then we want to free that object, we wait
> >for the most recent request to be idle before proceeding (otherwise the
> >hardware will write to pages now owned by the system, or we will attempt
> >to read from those pages before the hardware is finished writing). If
> >the request was cancelled instead, that wait completes immediately. As a
> >result, all requests must be committed and not cancelled if the external
> >state is unknown.
> 
> This was a bit hard to figure out.
> 
> So we cannot unwind because once we set last_read_req we lose the
> data on what was the previous one, before this transaction started?
> 
> Intuitively I don't like the idea of sending unfinished stuff to the
> GPU, when it failed at some random point in ring buffer preparation.

Yes, but it is not unfinished though. The mm-switch is made and
flagged as completed, etc. The request does contain instructions for the
state changes it has made so far.
 
> So I am struggling with reviewing this as I have in the previous round.
> 
> >All that remains of i915_gem_request_cancel() users are just a couple of
> >extremely unlikely allocation failures, so remove the API entirely.
> 
> This parts feels extra weird because in the non-execbuf cases we
> actually can cancel the transaction without any issues, correct?

Nope. Same problem arises in that they do not know what happens
underneath calls to e.g. object_sync. Before cancellation you have to
inspect every path now and in the future to be sure that no state was
modified inside the request.
 
> Would middle-ground be to keep the cancellations for in-kernel
> submits, and for execbuf rewind the ringbuf so only request
> post-amble is sent to the GPU?

The only place that knows whether an external observer is the request
code. I am thinking about doing the cancellation there, I just need to
check that the lockless idling will be ok with that.

> >@@ -3410,12 +3404,9 @@ int i915_gpu_idle(struct drm_device *dev)
> > return PTR_ERR(req);
> >
> > ret = i915_switch_context(req);
> >-if (ret) {
> >-i915_gem_request_cancel(req);
> >-return ret;
> >-}
> >-
> > i915_add_request_no_flush(req);
> >+if (ret)
> >+return ret;
> 
> Looks like with this it could execute the context switch on the GPU
> but not update the engine->last_context in do_switch().

Hmm, that problem is present in the current code - requests are not
unwound, just deleted. We need to rearrange switch_context after the
mi_set_context() to ensure that on the subsequent call, the context is
reloaded.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful

2016-04-11 Thread Tvrtko Ursulin


On 09/04/16 10:27, Chris Wilson wrote:

Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.


This was a bit hard to figure out.

So we cannot unwind because once we set last_read_req we lose the data 
on what was the previous one, before this transaction started?


Intuitively I don't like the idea of sending unfinished stuff to the
GPU, when it failed at some random point in ring buffer preparation.

So I am struggling with reviewing this as I have in the previous round.


All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.


This parts feels extra weird because in the non-execbuf cases we 
actually can cancel the transaction without any issues, correct?


Would middle-ground be to keep the cancellations for in-kernel submits, 
and for execbuf rewind the ringbuf so only request post-amble is sent to 
the GPU?



A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/i915/i915_drv.h|  2 --
  drivers/gpu/drm/i915/i915_gem.c| 50 --
  drivers/gpu/drm/i915/i915_gem_context.c| 21 ++---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++--
  drivers/gpu/drm/i915/intel_display.c   |  2 +-
  drivers/gpu/drm/i915/intel_lrc.c   |  4 +--
  drivers/gpu/drm/i915/intel_overlay.c   |  8 ++---
  7 files changed, 39 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a93e5dd4fa9a..f374db8de673 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2320,7 +2320,6 @@ struct drm_i915_gem_request {
  struct drm_i915_gem_request * __must_check
  i915_gem_request_alloc(struct intel_engine_cs *engine,
   struct intel_context *ctx);
-void i915_gem_request_cancel(struct drm_i915_gem_request *req);
  void i915_gem_request_free(struct kref *req_ref);
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);
@@ -2872,7 +2871,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void 
*data,
 struct drm_file *file_priv);
  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params 
*params);
  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
   struct drm_i915_gem_execbuffer2 *args,
   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c3ff56594d6..42227495803f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2753,7 +2753,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 * fully prepared. Thus it can be cleaned up using the proper
 * free code.
 */
-   i915_gem_request_cancel(req);
+   intel_ring_reserved_space_cancel(req->ringbuf);
+   i915_gem_request_unreference(req);
return ret;
}

@@ -2790,13 +2791,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
return err ? ERR_PTR(err) : req;
  }

-void i915_gem_request_cancel(struct drm_i915_gem_request *req)
-{
-   intel_ring_reserved_space_cancel(req->ringbuf);
-
-   i915_gem_request_unreference(req);
-}
-
  struct drm_i915_gem_request *
  

[Intel-gfx] [PATCH 4/4] drm/i915: Late request cancellations are harmful

2016-04-09 Thread Chris Wilson
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.

All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
Cc: Tvrtko Ursulin 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h|  2 --
 drivers/gpu/drm/i915/i915_gem.c| 50 --
 drivers/gpu/drm/i915/i915_gem_context.c| 21 ++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++--
 drivers/gpu/drm/i915/intel_display.c   |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  4 +--
 drivers/gpu/drm/i915/intel_overlay.c   |  8 ++---
 7 files changed, 39 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a93e5dd4fa9a..f374db8de673 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2320,7 +2320,6 @@ struct drm_i915_gem_request {
 struct drm_i915_gem_request * __must_check
 i915_gem_request_alloc(struct intel_engine_cs *engine,
   struct intel_context *ctx);
-void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
   struct drm_file *file);
@@ -2872,7 +2871,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void 
*data,
 struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params 
*params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
   struct drm_i915_gem_execbuffer2 *args,
   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c3ff56594d6..42227495803f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2753,7 +2753,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 * fully prepared. Thus it can be cleaned up using the proper
 * free code.
 */
-   i915_gem_request_cancel(req);
+   intel_ring_reserved_space_cancel(req->ringbuf);
+   i915_gem_request_unreference(req);
return ret;
}
 
@@ -2790,13 +2791,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
return err ? ERR_PTR(err) : req;
 }
 
-void i915_gem_request_cancel(struct drm_i915_gem_request *req)
-{
-   intel_ring_reserved_space_cancel(req->ringbuf);
-
-   i915_gem_request_unreference(req);
-}
-
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *engine)
 {
@@ -3410,12 +3404,9 @@ int i915_gpu_idle(struct drm_device *dev)
return PTR_ERR(req);
 
ret = i915_switch_context(req);
-   if (ret) {
-   i915_gem_request_cancel(req);
-   return ret;
-   }
-
i915_add_request_no_flush(req);
+   if (ret)
+   return ret;
}
 
ret = intel_engine_idle(engine);
@@ -4917,34 +4908,33 @@ i915_gem_init_hw(struct drm_device *dev)
req = i915_gem_request_alloc(engine, NULL);