Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-12-05 Thread Volkin, Bradley D
On Tue, Nov 26, 2013 at 09:56:09AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote:
  On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
   On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com 
   wrote:
+static const struct drm_i915_cmd_descriptor*
+find_cmd_in_table(const struct drm_i915_cmd_table *table,
+ u32 cmd_header)
+{
+   int i;
+
+   for (i = 0; i  table-count; i++) {
+   const struct drm_i915_cmd_descriptor *desc = 
table-table[i];
+   u32 masked_cmd = desc-cmd.mask  cmd_header;
+   u32 masked_value = desc-cmd.value  desc-cmd.mask;
+
+   if (masked_cmd == masked_value)
+   return desc;
   
   Maybe pre-sort the cmd table and use bsearch? And a runtime test on
   module load to check that the table is sorted correctly.
  
  So far it doesn't look like the search is a bottleneck, but I've tried to 
  keep
  the tables sorted so that we could easily switch to bsearch if needed. Would
  you prefer to just use bsearch from the start?
 
 Yes. I think it will be easier (especially if the codes does the runtime
 check) to keep the table sorted as commands are incremently added in the
 future, rather than having to retrofit the bsearch when it becomes a
 significant problem.
 -Chris

A quick test is showing that bsearch() with the current table definitions is 
worse
than the linear search by ~7%. If I flatten the tables so there's one table with
all of the commands for a given ring, bsearch() is ~2% better than linear 
search.

So I'm inclined to add the code to check that the list is sorted but stick with 
linear
search for now. I'll revisit when we have more complete performance data.

Brad

 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread bradley . d . volkin
From: Brad Volkin bradley.d.vol...@intel.com

At this point the parser just implements the mechanics of finding
commands in the batch buffer and looking them up in the parser tables.

It rejects poorly formed batches (e.g. no MI_BATCH_BUFFER_END command,
containing unrecognized commands, etc) but does no other checks.

Optionally enabled by a module parameter, currently defaulting to off.
We'll enable by default at the end of the series.

The code to look up commands in the per-ring tables implements a linear
search. The tables are small so in practice this does not appear to cause
a performance issue. However, the tables are sorted by command opcode such
that we can move to something like binary search if this code becomes a
bottleneck in the future.

OTC-Tracker: AXIA-4631
Change-Id: If71f50d28578d27325c3438abf4b229c7cf695cd
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 148 +
 drivers/gpu/drm/i915/i915_drv.c|   5 +
 drivers/gpu/drm/i915/i915_drv.h|   4 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  15 +++
 4 files changed, 172 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 247d530..b01628e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -232,3 +232,151 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
  ring-id);
}
 }
+
+static const struct drm_i915_cmd_descriptor*
+find_cmd_in_table(const struct drm_i915_cmd_table *table,
+ u32 cmd_header)
+{
+   int i;
+
+   for (i = 0; i  table-count; i++) {
+   const struct drm_i915_cmd_descriptor *desc = table-table[i];
+   u32 masked_cmd = desc-cmd.mask  cmd_header;
+   u32 masked_value = desc-cmd.value  desc-cmd.mask;
+
+   if (masked_cmd == masked_value)
+   return desc;
+   }
+
+   return NULL;
+}
+
+/* Returns a pointer to a descriptor for the command specified by cmd_header.
+ *
+ * The caller must supply space for a default descriptor via the default_desc
+ * parameter. If no descriptor for the specified command exists in the ring's
+ * command parser tables, this function fills in default_desc based on the
+ * ring's default length encoding and returns default_desc.
+ */
+static const struct drm_i915_cmd_descriptor*
+find_cmd(struct intel_ring_buffer *ring,
+u32 cmd_header,
+struct drm_i915_cmd_descriptor *default_desc)
+{
+   u32 mask;
+   int i;
+
+   for (i = 0; i  ring-cmd_table_count; i++) {
+   const struct drm_i915_cmd_descriptor *desc;
+
+   desc = find_cmd_in_table(ring-cmd_tables[i], cmd_header);
+   if (desc)
+   return desc;
+   }
+
+   mask = ring-get_cmd_length_mask(cmd_header);
+   if (!mask)
+   return NULL;
+
+   BUG_ON(!default_desc);
+   default_desc-flags = CMD_DESC_SKIP;
+   default_desc-length.mask = mask;
+
+   return default_desc;
+}
+
+static u32 *vmap_batch(struct drm_i915_gem_object *obj)
+{
+   int i;
+   void *addr = NULL;
+   struct sg_page_iter sg_iter;
+   struct page **pages;
+
+   pages = drm_malloc_ab(obj-base.size  PAGE_SHIFT, sizeof(*pages));
+   if (pages == NULL) {
+   DRM_DEBUG(Failed to get space for pages\n);
+   goto finish;
+   }
+
+   i = 0;
+   for_each_sg_page(obj-pages-sgl, sg_iter, obj-pages-nents, 0) {
+   pages[i] = sg_page_iter_page(sg_iter);
+   i++;
+   }
+
+   addr = vmap(pages, i, 0, PAGE_KERNEL);
+   if (addr == NULL) {
+   DRM_DEBUG(Failed to vmap pages\n);
+   goto finish;
+   }
+
+finish:
+   if (pages)
+   drm_free_large(pages);
+   return (u32*)addr;
+}
+
+#define LENGTH_BIAS 2
+
+int i915_parse_cmds(struct intel_ring_buffer *ring,
+   struct drm_i915_gem_object *batch_obj,
+   u32 batch_start_offset)
+{
+   int ret = 0;
+   u32 *cmd, *batch_base, *batch_end;
+   struct drm_i915_cmd_descriptor default_desc = { 0 };
+
+   /* No command tables currently indicates a platform without parsing */
+   if (!ring-cmd_tables)
+   return 0;
+
+   batch_base = vmap_batch(batch_obj);
+   if (!batch_base) {
+   DRM_DEBUG_DRIVER(CMD: Failed to vmap batch\n);
+   return -ENOMEM;
+   }
+
+   cmd = batch_base + (batch_start_offset / sizeof(*cmd));
+   batch_end = cmd + (batch_obj-base.size / sizeof(*batch_end));
+
+   while (cmd  batch_end) {
+   const struct drm_i915_cmd_descriptor *desc;
+   u32 length;
+
+   if (*cmd == MI_BATCH_BUFFER_END)
+   break;
+
+   desc = find_cmd(ring, *cmd, 

Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Chris Wilson
On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
 +static const struct drm_i915_cmd_descriptor*
 +find_cmd_in_table(const struct drm_i915_cmd_table *table,
 +   u32 cmd_header)
 +{
 + int i;
 +
 + for (i = 0; i  table-count; i++) {
 + const struct drm_i915_cmd_descriptor *desc = table-table[i];
 + u32 masked_cmd = desc-cmd.mask  cmd_header;
 + u32 masked_value = desc-cmd.value  desc-cmd.mask;
 +
 + if (masked_cmd == masked_value)
 + return desc;

Maybe pre-sort the cmd table and use bsearch? And a runtime test on
module load to check that the table is sorted correctly.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Volkin, Bradley D
On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
  +static const struct drm_i915_cmd_descriptor*
  +find_cmd_in_table(const struct drm_i915_cmd_table *table,
  + u32 cmd_header)
  +{
  +   int i;
  +
  +   for (i = 0; i  table-count; i++) {
  +   const struct drm_i915_cmd_descriptor *desc = table-table[i];
  +   u32 masked_cmd = desc-cmd.mask  cmd_header;
  +   u32 masked_value = desc-cmd.value  desc-cmd.mask;
  +
  +   if (masked_cmd == masked_value)
  +   return desc;
 
 Maybe pre-sort the cmd table and use bsearch? And a runtime test on
 module load to check that the table is sorted correctly.

So far it doesn't look like the search is a bottleneck, but I've tried to keep
the tables sorted so that we could easily switch to bsearch if needed. Would
you prefer to just use bsearch from the start?

I agree that the module load check makes sense if we do switch.

 -Chris
 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Chris Wilson
On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote:
 On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
  On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
   +static const struct drm_i915_cmd_descriptor*
   +find_cmd_in_table(const struct drm_i915_cmd_table *table,
   +   u32 cmd_header)
   +{
   + int i;
   +
   + for (i = 0; i  table-count; i++) {
   + const struct drm_i915_cmd_descriptor *desc = table-table[i];
   + u32 masked_cmd = desc-cmd.mask  cmd_header;
   + u32 masked_value = desc-cmd.value  desc-cmd.mask;
   +
   + if (masked_cmd == masked_value)
   + return desc;
  
  Maybe pre-sort the cmd table and use bsearch? And a runtime test on
  module load to check that the table is sorted correctly.
 
 So far it doesn't look like the search is a bottleneck, but I've tried to keep
 the tables sorted so that we could easily switch to bsearch if needed. Would
 you prefer to just use bsearch from the start?

Yes. I think it will be easier (especially if the codes does the runtime
check) to keep the table sorted as commands are incremently added in the
future, rather than having to retrofit the bsearch when it becomes a
significant problem.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Volkin, Bradley D
On Tue, Nov 26, 2013 at 09:56:09AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote:
  On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
   On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com 
   wrote:
+static const struct drm_i915_cmd_descriptor*
+find_cmd_in_table(const struct drm_i915_cmd_table *table,
+ u32 cmd_header)
+{
+   int i;
+
+   for (i = 0; i  table-count; i++) {
+   const struct drm_i915_cmd_descriptor *desc = 
table-table[i];
+   u32 masked_cmd = desc-cmd.mask  cmd_header;
+   u32 masked_value = desc-cmd.value  desc-cmd.mask;
+
+   if (masked_cmd == masked_value)
+   return desc;
   
   Maybe pre-sort the cmd table and use bsearch? And a runtime test on
   module load to check that the table is sorted correctly.
  
  So far it doesn't look like the search is a bottleneck, but I've tried to 
  keep
  the tables sorted so that we could easily switch to bsearch if needed. Would
  you prefer to just use bsearch from the start?
 
 Yes. I think it will be easier (especially if the codes does the runtime
 check) to keep the table sorted as commands are incremently added in the
 future, rather than having to retrofit the bsearch when it becomes a
 significant problem.

Ok, makes sense. I'll assume the same comment applies to the register 
whitelists and
make similar changes there.

Brad

 -Chris
 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx