Re: [Intel-gfx] [PATCH 25/41] drm/i915: Move GEM activity tracking into a common struct reservation_object

2016-10-17 Thread Joonas Lahtinen
On Fri, 2016-10-14 at 13:18 +0100, Chris Wilson wrote:
> In preparation for supporting many distinct timelines, we need to expand the
> activity tracking on the GEM object to handle more than just a request
> per engine. We already use the struct reservation_object on the dma-buf
> to handle many fence contexts, so integrating that into the GEM object
> itself is the preferred solution. (For example, we can now share the same
> reservation_object between every consumer/producer using this buffer and
> skip the manual import/export via dma-buf.)
> 
> v2: Reimplement busy-ioctl (by walking the reservation object), postpone
> the ABI change for another day. Similarly use the reservation object to
> find the last_write request (if active and from i915) for choosing
> display CS flips.
> 
> Caveats:
> 
>  * busy-ioctl: busy-ioctl only reports on the native fences; it will not
> warn of stalls (in set-domain-ioctl, pread/pwrite, etc.) if the object is
> being rendered to by external fences. It also will not report the same
> busy state as wait-ioctl (or polling on the dma-buf) in the same
> circumstances. On the plus side, it does retain reporting of which
> *i915* engines are engaged with this object.
> 
>  * non-blocking atomic modesets take a step backwards as the wait for
> render completion blocks the ioctl. This is fixed in a subsequent
> patch that uses a fence to await the rendering instead; see
> "drm/i915: Restore nonblocking awaits for modesetting".
> 
>  * dynamic array manipulation for shared-fences in reservation is slower
> than the previous lockless static assignment (e.g. gem_exec_lut_handle
> runtime on ivb goes from 42s to 66s), mainly due to atomic operations.
> 
>  * loss of object-level retirement callbacks, emulated by VMA retirement
> tracking.
> 
>  * minor loss of object-level last activity information from debugfs;
> this could be replaced with per-vma information if desired.
> 
> Signed-off-by: Chris Wilson 

This was already:

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[Intel-gfx] [PATCH 25/41] drm/i915: Move GEM activity tracking into a common struct reservation_object

2016-10-14 Thread Chris Wilson
In preparation for supporting many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.
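
To make the busy-ioctl reimplementation above concrete, here is a minimal
userspace sketch of "walking the reservation object": the exclusive (write)
fence and every shared (read) fence are visited, and only unsignaled native
fences contribute to the result. All of the model_* names, the struct layouts
and the bit encoding are assumptions made for illustration; they are not the
kernel structures or the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct model_fence {
	bool native;         /* true if emitted by this driver (i915 in the text) */
	bool signaled;       /* true once the work behind the fence has completed */
	unsigned int engine; /* engine index, only meaningful for native fences */
};

struct model_resv {
	const struct model_fence *excl;          /* last writer, may be NULL */
	const struct model_fence *const *shared; /* current readers */
	size_t shared_count;
};

/* Assumed encoding: one "read" bit per engine in the upper 16 bits. */
static uint32_t model_read_flag(unsigned int engine)
{
	assert(engine < 16);
	return UINT32_C(0x10000) << engine;
}

/* Busy bits contributed by a single fence; foreign or idle fences add none. */
static uint32_t model_fence_busy(const struct model_fence *f, bool is_writer)
{
	uint32_t flags;

	if (f == NULL || !f->native || f->signaled)
		return 0;

	flags = model_read_flag(f->engine);
	if (is_writer)
		flags |= f->engine + 1; /* assumed: writer id (engine + 1) in the low bits */
	return flags;
}

/* Walk the exclusive fence, then every shared fence, OR-ing their busy bits. */
uint32_t model_busy(const struct model_resv *resv)
{
	uint32_t busy = model_fence_busy(resv->excl, true);
	size_t i;

	for (i = 0; i < resv->shared_count; i++)
		busy |= model_fence_busy(resv->shared[i], false);
	return busy;
}
```

In this model an object whose only outstanding fence is external reports 0,
which is the behaviour the first caveat describes.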

Caveats:

 * busy-ioctl: busy-ioctl only reports on the native fences; it will not
warn of stalls (in set-domain-ioctl, pread/pwrite, etc.) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

 * non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch that uses a fence to await the rendering instead; see
"drm/i915: Restore nonblocking awaits for modesetting".

 * dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations.

 * loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

 * minor loss of object-level last activity information from debugfs;
this could be replaced with per-vma information if desired.
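
As a rough illustration of the shared-fence caveat above, the following
userspace sketch contrasts a fixed per-engine slot with a dynamically sized
list keyed by fence context. The model_* names, the engine count and the
growth policy are all hypothetical, and the sketch deliberately leaves out
the locking and atomic reference counting that the caveat names as the main
cost; it only shows the extra search-and-grow bookkeeping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NUM_ENGINES 5 /* assumed engine count, for illustration only */

/* Static assignment: one slot per engine, updated with a single store. */
struct model_static_tracking {
	uint32_t last_read[MODEL_NUM_ENGINES];
};

void model_track_static(struct model_static_tracking *t,
			unsigned int engine, uint32_t seqno)
{
	assert(engine < MODEL_NUM_ENGINES);
	t->last_read[engine] = seqno;
}

/* Dynamic list: one entry per fence context, grown on demand. */
struct model_fence_ref {
	uint64_t context;
	uint32_t seqno;
};

struct model_shared_list {
	struct model_fence_ref *slots;
	size_t count;
	size_t capacity;
};

/* Returns 0 on success, -1 if the list could not be grown. */
int model_add_shared(struct model_shared_list *l,
		     uint64_t context, uint32_t seqno)
{
	size_t i;

	assert(l != NULL);

	/* A newer fence from the same context replaces the older one. */
	for (i = 0; i < l->count; i++) {
		if (l->slots[i].context == context) {
			l->slots[i].seqno = seqno;
			return 0;
		}
	}

	/* Otherwise append, doubling the array when it is full. */
	if (l->count == l->capacity) {
		size_t cap = l->capacity ? 2 * l->capacity : 4;
		struct model_fence_ref *s = realloc(l->slots, cap * sizeof(*s));

		if (s == NULL)
			return -1;
		l->slots = s;
		l->capacity = cap;
	}

	l->slots[l->count].context = context;
	l->slots[l->count].seqno = seqno;
	l->count++;
	return 0;
}
```

The dynamic form scales to an arbitrary number of timelines, at the price of
a linear scan and an occasional reallocation on every update.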

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  15 +-
 drivers/gpu/drm/i915/i915_drv.h            |  62 +++
 drivers/gpu/drm/i915/i915_gem.c            | 266 -
 drivers/gpu/drm/i915/i915_gem_batch_pool.c |  11 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |  53 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.h     |  45 -
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  82 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  32
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   1 +
 drivers/gpu/drm/i915/i915_gem_request.c    |  48 +++---
 drivers/gpu/drm/i915/i915_gem_request.h    |  37 +---
 drivers/gpu/drm/i915/i915_gpu_error.c      |   6 +-
 drivers/gpu/drm/i915/intel_atomic_plane.c  |   2 -
 drivers/gpu/drm/i915/intel_display.c       | 114 +++--
 drivers/gpu/drm/i915/intel_drv.h           |   3 -
 15 files changed, 223 insertions(+), 554 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_gem_dmabuf.h

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 27fd5370f0cc..3fcdfff273e4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -136,11 +136,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
struct i915_vma *vma;
unsigned int frontbuffer_bits;
int pin_count = 0;
-   enum intel_engine_id id;
 
lockdep_assert_held(&obj->base.dev->struct_mutex);
 
-   seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x [ ",
+   seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x %s%s%s",
   &obj->base,
   get_active_flag(obj),
   get_pin_flag(obj),
@@ -149,14 +148,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
   get_pin_mapped_flag(obj),
   obj->base.size / 1024,
   obj->base.read_domains,
-  obj->base.write_domain);
-   for_each_engine(engine, dev_priv, id)
-   seq_printf(m, "%x ",
-  i915_gem_active_get_seqno(&obj->last_read[id],
-  &obj->base.dev->struct_mutex));
-   seq_printf(m, "] %x %s%s%s",
-  i915_gem_active_get_seqno(&obj->last_write,
-  &obj->base.dev->struct_mutex),
+  obj->base.write_domain,
   i915_cache_level_str(dev_priv, obj->cache_level),
   obj->mm.dirty ? " dirty" : "",
   obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
@@ -187,8 +179,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
if (obj->stolen)
seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
 
-   engine = i915_gem_active_get_engine(&obj->last_write,
-   &dev_priv->drm.struct_mutex);
+   engine = i915_gem_object_last_write_engine(obj);
if (engine)
seq_printf(m, " (%s)", engine->name);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index