Re: [Intel-gfx] [PATCH v15 10/13] drm/i915/perf: execute OA configuration from command stream

2019-09-09 Thread kbuild test robot
Hi Lionel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190907-052009
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-f004-201936 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_perf.c: In function 'i915_oa_stream_init':
>> drivers/gpu/drm/i915/i915_perf.c:2697:3: error: ignoring return value of 'i915_active_request_retire', declared with attribute warn_unused_result [-Werror=unused-result]
  i915_active_request_retire(&stream->active_config_rq, 0,
  ^~~~
&stream->config_mutex);
~~
   cc1: all warnings being treated as errors
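
For readers unfamiliar with this diagnostic, here is a minimal standalone sketch of what triggers it and how a caller usually satisfies it. The names `must_check`, `retire_request()` and `cleanup_after_failed_init()` are hypothetical stand-ins, not the i915 functions; the only assumption carried over from the kernel is that `__must_check` maps to GCC's `warn_unused_result` attribute.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Same GCC attribute the kernel's __must_check expands to. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical helper: returns 0 on success or a negative errno. */
static must_check int retire_request(int pending)
{
	return pending ? -EBUSY : 0;
}

/*
 * Writing "retire_request(pending);" as a bare statement is what
 * -Wunused-result complains about. Storing and checking the result
 * satisfies the attribute.
 */
static int cleanup_after_failed_init(int pending)
{
	int err = retire_request(pending);

	assert(err <= 0); /* convention: 0 or a negative errno */
	if (err)
		fprintf(stderr, "retire failed: %d\n", err);

	return err;
}
```

Whether the error path in i915_oa_stream_init() should log, propagate, or deliberately ignore the value is a decision for the patch author; the sketch only shows the mechanical shape of a fix.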

vim +/i915_active_request_retire +2697 drivers/gpu/drm/i915/i915_perf.c

  2556  
  2557  /**
  2558   * i915_oa_stream_init - validate combined props for OA stream and init
  2559   * @stream: An i915 perf stream
  2560   * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
  2561   * @props: The property state that configures stream (individually validated)
  2562   *
  2563   * While read_properties_unlocked() validates properties in isolation it
  2564   * doesn't ensure that the combination necessarily makes sense.
  2565   *
  2566   * At this point it has been determined that userspace wants a stream of
  2567   * OA metrics, but still we need to further validate the combined
  2568   * properties are OK.
  2569   *
  2570   * If the configuration makes sense then we can allocate memory for
  2571   * a circular OA buffer and apply the requested metric set configuration.
  2572   *
  2573   * Returns: zero on success or a negative error code.
  2574   */
  2575  static int i915_oa_stream_init(struct i915_perf_stream *stream,
  2576 struct drm_i915_perf_open_param *param,
  2577 struct perf_open_properties *props)
  2578  {
  2579  struct drm_i915_private *dev_priv = stream->dev_priv;
  2580  int format_size;
  2581  int ret;
  2582  
  2583  /* If the sysfs metrics/ directory wasn't registered for some
  2584   * reason then don't let userspace try their luck with config
  2585   * IDs
  2586   */
  2587  if (!dev_priv->perf.metrics_kobj) {
  2588  DRM_DEBUG("OA metrics weren't advertised via sysfs\n");
  2589  return -EINVAL;
  2590  }
  2591  
  2592  if (!(props->sample_flags & SAMPLE_OA_REPORT)) {
  2593  DRM_DEBUG("Only OA report sampling supported\n");
  2594  return -EINVAL;
  2595  }
  2596  
  2597  if (!dev_priv->perf.ops.enable_metric_set) {
  2598  DRM_DEBUG("OA unit not supported\n");
  2599  return -ENODEV;
  2600  }
  2601  
  2602  /* To avoid the complexity of having to accurately filter
  2603   * counter reports and marshal to the appropriate client
  2604   * we currently only allow exclusive access
  2605   */
  2606  if (dev_priv->perf.exclusive_stream) {
  2607  DRM_DEBUG("OA unit already in use\n");
  2608  return -EBUSY;
  2609  }
  2610  
  2611  if (!props->oa_format) {
  2612  DRM_DEBUG("OA report format not specified\n");
  2613  return -EINVAL;
  2614  }
  2615  
  2616  mutex_init(&stream->config_mutex);
  2617  
  2618  stream->sample_size = sizeof(struct drm_i915_perf_record_header);
  2619  
  2620  format_size = dev_priv->perf.oa_formats[props->oa_format].size;
  2621  
  2622  stream->engine = props->engine;
  2623  
  2624  INIT_ACTIVE_REQUEST(&stream->active_config_rq,
  2625  &stream->config_mutex);
  2626  
  2627  stream->sample_flags |= SAMPLE_OA_REPORT;
  2628  stream->sample_size += format_size;
  2629  
  2630  stream->oa_buffer.format_size = format_size;
  2631  if (WARN_ON(stream->oa_buffer.format_size == 0))
  2632  return -EINVAL;
  2633  
  2634  stream->oa_buffer.format =
  2635  dev_priv->perf.oa_formats[props->oa_format].format;
  2636  
  2637  stream->periodic = props->oa_periodic;
  2638  if (stream->periodic)
  2639  stream->period_exponent = props->oa_period_exponent;
  2640  
  2641   

Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting

2019-09-09 Thread Tvrtko Ursulin


On 09/09/2019 10:23, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2019-09-09 10:19:08)


On 09/09/2019 08:12, Chris Wilson wrote:

And give up if we never even make it to the start.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
   tests/perf_pmu.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index d392a67d4..8a06e5d44 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
   while (!igt_spin_has_started(spin)) {
   unsigned long t = igt_nsec_elapsed();
   
+ igt_assert(gem_bo_busy(fd, spin->handle));

   if ((t - timeout) > 250e6) {
   timeout = t;
   igt_warn("Spinner not running after %.2fms\n",
(double)t / 1e6);
+ igt_assert(t < 2e9);
   }
   }
   } else {
@@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
   usleep(500e3); /* Better than nothing! */
   }
   
+ igt_assert(gem_bo_busy(fd, spin->handle));

   return igt_nsec_elapsed();
   }
   



The 2s timeout for batch to start executing sounds okay.

I'd pull up and consolidate the bo_busy checks into one at the top of
the function, since it is only telling us batch has been submitted. Or
you are thinking the second check brings value in checking batch is
still executing, hasn't failed or something?


The thinking is to catch if we terminate the batch via hangcheck before
writing the dword. I think there's value in knowing if we are slow vs
dead.
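
To make the "slow vs dead" distinction concrete, here is a small self-contained sketch of a wait loop that reports the two cases separately. It does not use the IGT API: `struct spin_probe`, `wait_for_start()` and the fake spinner are hypothetical, with the callbacks standing in for igt_spin_has_started(), gem_bo_busy() and igt_nsec_elapsed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum wait_result { WAIT_STARTED, WAIT_DEAD, WAIT_TIMEOUT };

/* Hypothetical probes; the real test would query the GPU instead. */
struct spin_probe {
	bool (*has_started)(void *ctx);
	bool (*is_busy)(void *ctx);
	uint64_t (*now_ns)(void *ctx);
	void *ctx;
};

static enum wait_result wait_for_start(const struct spin_probe *p,
				       uint64_t limit_ns)
{
	uint64_t begin;

	assert(p && p->has_started && p->is_busy && p->now_ns);
	begin = p->now_ns(p->ctx);

	while (!p->has_started(p->ctx)) {
		/* Idle before it ever started: it was killed, not slow. */
		if (!p->is_busy(p->ctx))
			return WAIT_DEAD;
		/* Still queued after the limit: give up waiting. */
		if (p->now_ns(p->ctx) - begin >= limit_ns)
			return WAIT_TIMEOUT;
	}

	return WAIT_STARTED;
}

/* Fake spinner for illustration: every clock read advances 1 ms. */
struct fake_spinner {
	uint64_t now_ns;
	uint64_t start_at_ns;	/* reports "started" from this time on */
	uint64_t die_at_ns;	/* stops being busy from this time on */
};

static bool fake_started(void *c)
{
	struct fake_spinner *s = c;
	return s->now_ns >= s->start_at_ns;
}

static bool fake_busy(void *c)
{
	struct fake_spinner *s = c;
	return s->now_ns < s->die_at_ns;
}

static uint64_t fake_now(void *c)
{
	struct fake_spinner *s = c;
	s->now_ns += 1000000;
	return s->now_ns;
}

static enum wait_result simulate(uint64_t start_at_ns, uint64_t die_at_ns,
				 uint64_t limit_ns)
{
	struct fake_spinner s = { 0, start_at_ns, die_at_ns };
	struct spin_probe p = { fake_started, fake_busy, fake_now, &s };

	return wait_for_start(&p, limit_ns);
}
```

The patch itself asserts rather than returning a status, which is the natural choice inside a test; the enum here only makes the two failure modes visible to a caller.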


Yeah as guessed then - agreed.

Regards,

Tvrtko

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting

2019-09-09 Thread Tvrtko Ursulin


On 09/09/2019 08:12, Chris Wilson wrote:

And give up if we never even make it to the start.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  tests/perf_pmu.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index d392a67d4..8a06e5d44 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
while (!igt_spin_has_started(spin)) {
unsigned long t = igt_nsec_elapsed();
  
+			igt_assert(gem_bo_busy(fd, spin->handle));

if ((t - timeout) > 250e6) {
timeout = t;
igt_warn("Spinner not running after %.2fms\n",
 (double)t / 1e6);
+				igt_assert(t < 2e9);
}
}
} else {
@@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
usleep(500e3); /* Better than nothing! */
}
  
+	igt_assert(gem_bo_busy(fd, spin->handle));

return igt_nsec_elapsed();
  }
  



The 2s timeout for batch to start executing sounds okay.

I'd pull up and consolidate the bo_busy checks into one at the top of 
the function, since it is only telling us batch has been submitted. Or 
you are thinking the second check brings value in checking batch is 
still executing, hasn't failed or something?


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

[Intel-gfx] [CI 04/13] drm/i915/perf: store the associated engine of a stream

2019-09-09 Thread Lionel Landwerlin
We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  | 5 +
 drivers/gpu/drm/i915/i915_perf.c | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 75607450ba00..274a1193d4f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1088,6 +1088,11 @@ struct i915_perf_stream {
 */
intel_wakeref_t wakeref;
 
+   /**
+* @engine: Engine associated with this performance stream.
+*/
+   struct intel_engine_cs *engine;
+
/**
 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 * properties given when opening a stream, representing the contents
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d18cd332afb7..9d5a3522aa35 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -363,6 +363,8 @@ struct perf_open_properties {
int oa_format;
bool oa_periodic;
int oa_period_exponent;
+
+   struct intel_engine_cs *engine;
 };
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
@@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream 
*stream,
 
format_size = dev_priv->perf.oa_formats[props->oa_format].size;
 
+   stream->engine = props->engine;
+
stream->sample_flags |= SAMPLE_OA_REPORT;
stream->sample_size += format_size;
 
@@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct 
drm_i915_private *dev_priv,
return -EINVAL;
}
 
+   /* At the moment we only support using i915-perf on the RCS. */
+   props->engine = dev_priv->engine[RCS0];
+
/* Considering that ID = 0 is reserved and assuming that we don't
 * (currently) expect any configurations to ever specify duplicate
 * values for a particular property ID then the last _PROP_MAX value is
-- 
2.23.0


[Intel-gfx] [CI 07/13] drm/i915/perf: allow for CS OA configs to be created lazily

2019-09-09 Thread Lionel Landwerlin
Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer is
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configurations without having to open/close the i915/perf
stream.
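
As a rough illustration of the "create lazily, keep on the stream" idea, the following userspace sketch looks up a per-stream cache and only allocates on a miss. All names here (`struct config_bo`, `struct stream`, `get_config_bo()`, `n_allocs`) are hypothetical simplifications; the real patch stores GEM objects, uses list_head and has its own locking and lifetime rules, none of which is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-config command buffer. */
struct config_bo {
	struct config_bo *next;
	int config_id;
};

/* Hypothetical stream owning the buffers it has built so far. */
struct stream {
	struct config_bo *configs;
	unsigned int n_allocs;	/* how many buffers were actually built */
};

/* Return the cached buffer for config_id, building it on first use. */
static struct config_bo *get_config_bo(struct stream *s, int config_id)
{
	struct config_bo *bo;

	assert(s);

	for (bo = s->configs; bo; bo = bo->next)
		if (bo->config_id == config_id)
			return bo;	/* hit: nothing to allocate */

	bo = calloc(1, sizeof(*bo));
	if (!bo)
		return NULL;

	bo->config_id = config_id;
	bo->next = s->configs;
	s->configs = bo;
	s->n_allocs++;

	return bo;
}

/* Everything built for the stream is released together with it. */
static void free_config_bos(struct stream *s)
{
	while (s->configs) {
		struct config_bo *bo = s->configs;

		s->configs = bo->next;
		free(bo);
	}
}
```

Tying the cache to the stream rather than a global table means teardown is a single walk when the stream closes, which matches the direction taken in v5 and v7 of the changelog.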

v2: No need for locking on OA config object creation (Chris)
Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
(Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)
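
The write_cs_mi_lri() helper in the diff below splits a register list into packets of at most MI_LOAD_REGISTER_IMM_MAX_REGS (126) entries. A standalone sketch of that chunking, with the buffer size computed up front, could look like this. `packet_header()` is a made-up encoding used only for illustration, not the real MI_LOAD_REGISTER_IMM opcode, and `struct reg_write`, `packed_dwords()` and `write_packets()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors MI_LOAD_REGISTER_IMM_MAX_REGS from the patch. */
#define MAX_REGS_PER_PACKET 126u

struct reg_write {
	uint32_t addr;
	uint32_t value;
};

/* Illustrative header only; NOT the hardware command encoding. */
static uint32_t packet_header(uint32_t n_regs)
{
	return 0x80000000u | n_regs;
}

/* One header dword per packet plus two dwords per register. */
static size_t packed_dwords(uint32_t n_regs)
{
	uint32_t n_packets = n_regs / MAX_REGS_PER_PACKET +
			     (n_regs % MAX_REGS_PER_PACKET != 0);

	return (size_t)n_packets + 2 * (size_t)n_regs;
}

/* Emit headers and (addr, value) pairs; returns the new write pointer. */
static uint32_t *write_packets(uint32_t *cs, const struct reg_write *regs,
			       uint32_t n_regs)
{
	uint32_t i;

	assert(cs && (regs || n_regs == 0));

	for (i = 0; i < n_regs; i++) {
		if (i % MAX_REGS_PER_PACKET == 0) {
			uint32_t left = n_regs - i;

			/* Start a new packet sized to what remains. */
			*cs++ = packet_header(left < MAX_REGS_PER_PACKET ?
					      left : MAX_REGS_PER_PACKET);
		}
		*cs++ = regs[i].addr;
		*cs++ = regs[i].value;
	}

	return cs;
}
```

For example, 130 registers need two packets (126 + 4), so 2 header dwords plus 260 payload dwords.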

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h  |   4 +-
 drivers/gpu/drm/i915/i915_perf.c | 270 ---
 drivers/gpu/drm/i915/i915_perf.h |  26 ++
 drivers/gpu/drm/i915/i915_perf_types.h   |  15 +-
 5 files changed, 273 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index fbad403ab7ac..b6373fbc927d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -135,6 +135,7 @@
 /* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */
 #define   MI_LRI_CS_MMIO   (1<<19)
 #define   MI_LRI_FORCE_POSTED  (1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT      (1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f4145ae6ab6e..7eb31923cde9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,8 +1363,8 @@ struct drm_i915_private {
struct mutex metrics_lock;
 
/*
-* List of dynamic configurations, you need to hold
-* dev_priv->perf.metrics_lock to access it.
+* List of dynamic configurations (struct i915_oa_config), you
+* need to hold dev_priv->perf.metrics_lock to access it.
 */
struct idr metrics_idr;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 40a1ec2bc96b..c9d0de3050fb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -367,11 +367,19 @@ struct perf_open_properties {
struct intel_engine_cs *engine;
 };
 
+struct i915_oa_config_bo {
+   struct list_head link;
+
+   struct i915_oa_config *oa_config;
+   struct drm_i915_gem_object *bo;
+};
+
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-  struct i915_oa_config *oa_config)
+void i915_oa_config_release(struct kref *ref)
 {
+   struct i915_oa_config *oa_config = container_of(ref, 
typeof(*oa_config), ref);
+
if (!PTR_ERR(oa_config->flex_regs))
kfree(oa_config->flex_regs);
if (!PTR_ERR(oa_config->b_counter_regs))
@@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private 
*dev_priv,
kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
- struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 
n_regs)
 {
-   if (!atomic_dec_and_test(&oa_config->ref_count))
-   return;
+   u32 i;
+
+   for (i = 0; i < n_regs; i++) {
+   if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+   u32 n_lri = min(n_regs - i,
+   (u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
 
-   free_oa_config(dev_priv, oa_config);
+   *cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+   }
+   *cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+   *cs++ = reg_data[i].value;
+   }
+
+   return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-int metrics_set,
-struct i915_oa_config **out_config)
+static struct i915_oa_config_bo* alloc_oa_config_buffer(struct 
drm_i915_private *i915,
+   struct i915_oa_config 
*oa_config)
 {
-   int ret;
+   struct i915_oa_config_bo *oa_bo;
+   size_t config_length = 0;
+   u32 *cs;
+   int err;
+
+   oa_bo = kzalloc(sizeof(*oa_bo), 

[Intel-gfx] [CI 02/13] drm/i915: add syncobj timeline support

2019-09-09 Thread Lionel Landwerlin
Introduce new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)
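
The array allocation in the diff below guards against multiplication overflow before sizing the fence array from a user-supplied count. A minimal userspace sketch of the same check follows. `struct fence_slot`, `count_fits()` and `alloc_fence_slots()` are hypothetical; the kernel code additionally validates the user pointers with access_ok() and uses kvmalloc_array(), which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-fence bookkeeping, loosely modelled on the patch. */
struct fence_slot {
	void *syncobj;
	void *fence;
	uint64_t value;
};

/* True when count * elem_size cannot overflow size_t. */
static bool count_fits(uint64_t count, size_t elem_size)
{
	assert(elem_size != 0);
	return count <= SIZE_MAX / elem_size;
}

/*
 * Allocate a zeroed array for an untrusted element count.
 * Returns NULL when the count is too large or memory is exhausted.
 */
static struct fence_slot *alloc_fence_slots(uint64_t count)
{
	if (!count_fits(count, sizeof(struct fence_slot)))
		return NULL;

	return calloc((size_t)count, sizeof(struct fence_slot));
}
```

Dividing the limit by the element size, instead of multiplying and then comparing, is what keeps the check itself from overflowing.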

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 307 ++
 drivers/gpu/drm/i915/i915_drv.c   |   3 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   1 +
 include/uapi/drm/i915_drm.h   |  39 +++
 4 files changed, 293 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4f5fd946ab28..46ad8d9642d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -214,6 +214,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
+struct i915_eb_fences {
+   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+   struct dma_fence *dma_fence;
+   u64 value;
+   struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -276,6 +283,7 @@ struct i915_execbuffer {
 
struct {
u64 flags; /** Available extensions parameters */
+   struct drm_i915_gem_execbuffer_ext_timeline_fences 
timeline_fences;
} extensions;
 };
 
@@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb,
 }
 
 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_eb_fences *fences, unsigned int n)
 {
-   while (n--)
-   drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+   while (n--) {
+   drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+   dma_fence_put(fences[n].dma_fence);
+   kfree(fences[n].chain_fence);
+   }
kvfree(fences);
 }
 
-static struct drm_syncobj **
-get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-   struct drm_file *file)
+static struct i915_eb_fences *
+get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences)
+{
+   struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences =
+   &eb->extensions.timeline_fences;
+   struct drm_i915_gem_exec_fence __user *user_fences;
+   struct i915_eb_fences *fences;
+   u64 __user *user_values;
+   u64 num_fences, num_user_fences = timeline_fences->fence_count;
+   unsigned long n;
+   int err;
+
+   /* Check multiplication overflow for access_ok() and kvmalloc_array() */
+   BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+   if (num_user_fences > min_t(unsigned long,
+   ULONG_MAX / sizeof(*user_fences),
+   SIZE_MAX / sizeof(*fences)))
+   return ERR_PTR(-EINVAL);
+
+   user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+   if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences)))
+   return ERR_PTR(-EFAULT);
+
+   user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+   if (!access_ok(user_values, num_user_fences * sizeof(*user_values)))
+   return ERR_PTR(-EFAULT);
+
+   fences = kvmalloc_array(num_user_fences, sizeof(*fences),
+   __GFP_NOWARN | GFP_KERNEL);
+   if (!fences)
+   return ERR_PTR(-ENOMEM);
+
+   BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+   for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) {
+   struct drm_i915_gem_exec_fence user_fence;
+   struct drm_syncobj *syncobj;
+   struct dma_fence *fence = NULL;
+   u64 point;
+
+   if (__copy_from_user(&user_fence, user_fences++, sizeof(user_fence))) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+   err = -EINVAL;
+   goto err;
+   }
+
+   if (__get_user(point, user_values++)) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   syncobj 

[Intel-gfx] [CI 09/13] drm/i915: add wait flags to i915_active_request_retire

2019-09-09 Thread Lionel Landwerlin
An upcoming change needs to wait on the active request without being
interrupted, so let callers pass the wait flags.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_active.c | 4 +++-
 drivers/gpu/drm/i915/i915_active.h | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c 
b/drivers/gpu/drm/i915/i915_active.c
index 6a447f1d0110..c808c28c9464 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref)
break;
}
 
-   err = i915_active_request_retire(&it->base, BKL(ref));
+   err = i915_active_request_retire(&it->base,
+I915_WAIT_INTERRUPTIBLE,
+BKL(ref));
if (err)
break;
}
diff --git a/drivers/gpu/drm/i915/i915_active.h 
b/drivers/gpu/drm/i915/i915_active.h
index f95058f99057..35a6089b44fd 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request 
*active)
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
+  unsigned int flags,
   struct mutex *mutex)
 {
struct i915_request *request;
@@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request 
*active,
if (!request)
return 0;
 
-   ret = i915_request_wait(request,
-   I915_WAIT_INTERRUPTIBLE,
-   MAX_SCHEDULE_TIMEOUT);
+   ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
 
-- 
2.23.0


[Intel-gfx] [CI 10/13] drm/i915/perf: execute OA configuration from command stream

2019-09-09 Thread Lionel Landwerlin
We haven't run into issues with programming the global OA/NOA
registers configuration from CPU so far, but HW engineers actually
recommend doing this from the command streamer. On TGL in particular
one of the clock domains in which some of that programming goes might
not be powered when we poke things from the CPU.

Since we have a command buffer prepared for the execbuffer side of
things, we can reuse that approach here too.

This also allows us to significantly reduce the amount of time we hold
the main lock.

v2: Drop the global lock as much as possible

v3: Take global lock to pin global

v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel)

v5: Move locking to the stream (Lionel)

v6: Move active reconfiguration request into i915_perf_stream (Lionel)

v7: Pin VMA outside request creation (Chris)
Lock VMA before move to active (Chris)

v8: Fix double free on stream->initial_oa_config_bo (Lionel)
Don't allow interruption when waiting on active config request
(Lionel)

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_perf.c   | 174 -
 drivers/gpu/drm/i915/i915_perf_types.h |  15 ++-
 2 files changed, 128 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f2b778d84b52..abbcf3ec654c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1558,18 +1558,23 @@ free_oa_configs(struct i915_perf_stream *stream)
 static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
struct drm_i915_private *dev_priv = stream->dev_priv;
+   int err;
 
BUG_ON(stream != dev_priv->perf.exclusive_stream);
 
-   /*
-* Unset exclusive_stream first, it will be checked while disabling
-* the metric set on gen8+.
-*/
mutex_lock(&dev_priv->drm.struct_mutex);
-   dev_priv->perf.exclusive_stream = NULL;
+   mutex_lock(&stream->config_mutex);
dev_priv->perf.ops.disable_metric_set(stream);
+   err = i915_active_request_retire(&stream->active_config_rq, 0,
+                                    &stream->config_mutex);
+   mutex_unlock(&stream->config_mutex);
+   dev_priv->perf.exclusive_stream = NULL;
mutex_unlock(&dev_priv->drm.struct_mutex);
 
+   if (err)
+   DRM_ERROR("Failed to disable perf stream\n");
+
+
free_oa_buffer(stream);
free_noa_wait(stream);
 
@@ -1795,6 +1800,10 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
return PTR_ERR(bo);
}
 
+   ret = i915_mutex_lock_interruptible(&i915->drm);
+   if (ret)
+   goto err_unref;
+
/*
 * We pin in GGTT because we jump into this buffer now because
 * multiple OA config BOs will have a jump to this address and it
@@ -1802,10 +1811,13 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
 */
vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0);
if (IS_ERR(vma)) {
+   mutex_unlock(&i915->drm.struct_mutex);
ret = PTR_ERR(vma);
goto err_unref;
}
 
+   mutex_unlock(&i915->drm.struct_mutex);
+
batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
if (IS_ERR(batch)) {
ret = PTR_ERR(batch);
@@ -1939,7 +1951,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
return 0;
 
 err_unpin:
-   __i915_vma_unpin(vma);
+   mutex_lock(&i915->drm.struct_mutex);
+   i915_vma_unpin_and_release(&vma, 0);
+   mutex_unlock(&i915->drm.struct_mutex);
 
 err_unref:
i915_gem_object_put(bo);
@@ -1947,50 +1961,73 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
return ret;
 }
 
-static void config_oa_regs(struct drm_i915_private *dev_priv,
-  const struct i915_oa_reg *regs,
-  u32 n_regs)
+static int emit_oa_config(struct drm_i915_private *i915,
+ struct i915_perf_stream *stream)
 {
-   u32 i;
+   struct i915_request *rq;
+   struct i915_vma *vma;
+   u32 *cs;
+   int err;
 
-   for (i = 0; i < n_regs; i++) {
-   const struct i915_oa_reg *reg = regs + i;
+   lockdep_assert_held(&stream->config_mutex);
+
+   vma = i915_vma_instance(stream->initial_oa_config_bo,
+   &stream->engine->gt->ggtt->vm, NULL);
+   if (unlikely(IS_ERR(vma)))
+   return PTR_ERR(vma);
+
+   err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+   if (err)
+   goto err_vma_unpin;
 
-   I915_WRITE(reg->addr, reg->value);
+   rq = i915_request_create(stream->engine->kernel_context);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto err_add_request;
}
-}
 
-static void delay_after_mux(void)
-{
-   /*
-* It apparently takes a fairly long time for a new MUX
-* configuration to be be applied after these register writes.
-

[Intel-gfx] [CI 01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Lionel Landwerlin
We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)
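
To show how a chained-extension mechanism of this kind is typically consumed, here is a simplified userspace model of walking the chain and dispatching on each node's name. It is a sketch under assumptions, not the kernel's i915_user_extensions() implementation: `struct ext_header` uses a plain pointer where the uAPI uses a u64 user address, the chain-length budget of 512 is an arbitrary choice for the example, and `walk_extensions()` and `count_handler()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an extension node at the base of a larger struct. */
struct ext_header {
	struct ext_header *next;	/* NULL terminates the chain */
	uint32_t name;			/* index into the handler table */
};

typedef int (*ext_handler)(struct ext_header *ext, void *data);

/* Walk the chain, calling the handler registered for each node. */
static int walk_extensions(struct ext_header *ext, const ext_handler *table,
			   size_t count, void *data)
{
	unsigned int budget = 512;	/* arbitrary cap so cycles terminate */

	assert(table || count == 0);

	while (ext) {
		int err;

		if (!budget--)
			return -E2BIG;

		/* Unknown or unimplemented extension: reject the call. */
		if (ext->name >= count || !table[ext->name])
			return -EINVAL;

		err = table[ext->name](ext, data);
		if (err)
			return err;

		ext = ext->next;
	}

	return 0;
}

/* Example handler: counts how many nodes it was invoked for. */
static int count_handler(struct ext_header *ext, void *data)
{
	(void)ext;
	++*(unsigned int *)data;
	return 0;
}
```

With an empty handler table, as in this first patch, any node in the chain is rejected with -EINVAL while an empty chain succeeds; later patches in the series add entries to the table.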

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v1)
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 ++-
 include/uapi/drm/i915_drm.h   | 26 +++--
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 27dbcb508055..4f5fd946ab28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 
 enum {
FORCE_CPU_RELOC = 1,
@@ -272,6 +273,10 @@ struct i915_execbuffer {
 */
int lut_size;
struct hlist_head *buckets; /** ht for relocation handles */
+
+   struct {
+   u64 flags; /** Available extensions parameters */
+   } extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct 
drm_i915_gem_execbuffer2 *exec)
return false;
 
/* Kernel clipping was a DRI1 misfeature */
-   if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+   if (!(exec->flags & (I915_EXEC_FENCE_ARRAY |
+I915_EXEC_USE_EXTENSIONS))) {
if (exec->num_cliprects || exec->cliprects_ptr)
return false;
}
@@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb,
}
 }
 
+static const i915_user_extension_fn execbuf_extensions[] = {
+};
+
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+ struct i915_execbuffer *eb)
+{
+   eb->extensions.flags = 0;
+
+   if (!(args->flags & I915_EXEC_USE_EXTENSIONS))
+   return 0;
+
+   /* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot
+* have another flag also using it at the same time.
+*/
+   if (eb->args->flags & I915_EXEC_FENCE_ARRAY)
+   return -EINVAL;
+
+   if (args->num_cliprects != 0)
+   return -EINVAL;
+
+   return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr),
+   execbuf_extensions,
+   ARRAY_SIZE(execbuf_extensions),
+   eb);
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
   struct drm_file *file,
@@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (args->flags & I915_EXEC_IS_PINNED)
eb.batch_flags |= I915_DISPATCH_PINNED;
 
+   err = parse_execbuf2_extensions(args, &eb);
+   if (err)
+   return err;
+
if (args->flags & I915_EXEC_FENCE_IN) {
in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 469dc512cca3..0a99c26730e1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence {
__u32 flags;
 };
 
+enum drm_i915_gem_execbuffer_ext {
+   DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */
+};
+
 struct drm_i915_gem_execbuffer2 {
/**
 * List of gem_exec_object2 structs
@@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 {
__u32 num_cliprects;
/**
 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-* is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
-* struct drm_i915_gem_exec_fence *fences.
+* & I915_EXEC_USE_EXTENSIONS are not set.
+*
+* If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+* of struct drm_i915_gem_exec_fence and num_cliprects is the length
+* of the array.
+*
+* If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
+* single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is
+* 0.
 */
__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK  (0x3f)
@@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT (1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_USE_EXTENSIONS implies that
+ * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to an linked
+ * list of i915_user_extension. Each i915_user_extension node is the base of a
+ * larger structure. The list of supported structures are listed in the
+ * drm_i915_gem_execbuffer_ext enum.
+ */
+#define 

[Intel-gfx] [CI 03/13] drm/i915/perf: drop list of streams

2019-09-09 Thread Lionel Landwerlin
At some point in time there was the idea that we could have multiple
streams from the same piece of HW, but that never materialized and given
the hard time we already have making everything work with the
submission side, there is no real point having this list of 1 element
around.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_perf.c | 16 +---
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 */
struct drm_i915_private *dev_priv;
 
-   /**
-* @link: Links the stream into ``&drm_i915_private->streams``
-*/
-   struct list_head link;
-
/**
 * @wakeref: As we keep the device awake while the perf stream is
 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 * except exclusive_stream.
 */
struct mutex lock;
-   struct list_head streams;
 
/*
 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream 
*stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream 
*stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /*
-* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct 
i915_perf_stream *stream)
if (stream->ops->destroy)
stream->ops->destroy(stream);
 
-   list_del(&stream->link);
-
if (stream->ctx)
i915_gem_context_put(stream->ctx);
 
@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
goto err_flags;
}
 
-   list_add(&stream->link, &dev_priv->perf.streams);
-
if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
f_flags |= O_CLOEXEC;
if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
stream_fd = anon_inode_getfd("[i915_perf]", &fops, stream, f_flags);
if (stream_fd < 0) {
ret = stream_fd;
-   goto err_open;
+   goto err_flags;
}
 
if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 
return stream_fd;
 
-err_open:
-   list_del(&stream->link);
 err_flags:
if (stream->ops->destroy)
stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
}
 
if (dev_priv->perf.ops.enable_metric_set) {
-   INIT_LIST_HEAD(&dev_priv->perf.streams);
mutex_init(&dev_priv->perf.lock);
 
oa_sample_rate_hard_limit = 1000 *
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [CI 05/13] drm/i915/perf: introduce a versioning of the i915-perf uapi

2019-09-09 Thread Lionel Landwerlin
Reporting this version will help applications figure out what level of
support the running kernel provides.

v2: Add i915_perf_ioctl_version() (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_getparam.c |  4 
 drivers/gpu/drm/i915/i915_perf.c | 10 ++
 drivers/gpu/drm/i915/i915_perf.h |  1 +
 include/uapi/drm/i915_drm.h  | 20 
 4 files changed, 35 insertions(+)

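A note for userspace readers: the revision is meant to be compared
against the minimum revision each feature needs. The sketch below shows
one way an application might do that. The enum, the table and the
function names are hypothetical and exist only for this illustration.
The fallback to revision 1 when the getparam call fails is an
assumption: a kernel that predates the parameter rejects it with
-EINVAL, and is taken here to offer only the baseline interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature list for illustration; not part of the uAPI. */
enum perf_feature {
	PERF_FEATURE_CTX_HANDLE,
	PERF_FEATURE_SAMPLE_OA,
	PERF_FEATURE_OA_EXPONENT,
	PERF_FEATURE_COUNT,
};

/* Assumed baseline when the kernel does not know the parameter. */
#define PERF_BASELINE_REVISION 1

/* Minimum revision per feature; the patch documents all three as revision 1. */
static const int perf_feature_min_revision[PERF_FEATURE_COUNT] = {
	[PERF_FEATURE_CTX_HANDLE] = 1,
	[PERF_FEATURE_SAMPLE_OA] = 1,
	[PERF_FEATURE_OA_EXPONENT] = 1,
};

/*
 * getparam_ret is the return value of the GETPARAM ioctl and value is
 * what the kernel wrote back. A failed call falls back to the baseline.
 */
static int perf_effective_revision(int getparam_ret, int value)
{
	if (getparam_ret != 0)
		return PERF_BASELINE_REVISION;
	return value;
}

/* True when the reported revision is high enough for the feature. */
static bool perf_has_feature(int revision, enum perf_feature feature)
{
	assert(feature >= 0 && feature < PERF_FEATURE_COUNT);
	return revision >= perf_feature_min_revision[feature];
}
```

An application would call perf_effective_revision() once after opening
the device and then gate each optional property on perf_has_feature().
Features added by later revisions would simply get a larger entry in
the table.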
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index da6faa84e5b8..bd41cc5ce906 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,7 @@
 #include "gt/intel_engine_user.h"
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 
 int i915_getparam_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
@@ -157,6 +158,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_MMAP_GTT_COHERENT:
value = INTEL_INFO(i915)->has_coherent_ggtt;
break;
+   case I915_PARAM_PERF_REVISION:
+   value = i915_perf_ioctl_version();
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9d5a3522aa35..40a1ec2bc96b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3697,3 +3697,13 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 
dev_priv->perf.initialized = false;
 }
+
+/**
+ * i915_perf_ioctl_version - Version of the i915-perf subsystem
+ *
+ * This version number is used by userspace to detect available features.
+ */
+int i915_perf_ioctl_version(void)
+{
+   return 1;
+}
diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
index a412b16d9ffc..95549de65212 100644
--- a/drivers/gpu/drm/i915/i915_perf.h
+++ b/drivers/gpu/drm/i915/i915_perf.h
@@ -18,6 +18,7 @@ void i915_perf_init(struct drm_i915_private *i915);
 void i915_perf_fini(struct drm_i915_private *i915);
 void i915_perf_register(struct drm_i915_private *i915);
 void i915_perf_unregister(struct drm_i915_private *i915);
+int i915_perf_ioctl_version(void);
 
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3d031e81648b..e98c9a7baa91 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -618,6 +618,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 54
 
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION   55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1903,23 +1909,31 @@ enum drm_i915_perf_property_id {
 * Open the stream for a specific context handle (as used with
 * execbuffer2). A stream opened for a specific context this way
 * won't typically require root privileges.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
/**
 * A value of 1 requests the inclusion of raw OA unit reports as
 * part of stream samples.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_SAMPLE_OA,
 
/**
 * The value specifies which set of OA unit metrics should be
 * be configured, defining the contents of any OA unit reports.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_METRICS_SET,
 
/**
 * The value specifies the size and layout of OA unit reports.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1929,6 +1943,8 @@ enum drm_i915_perf_property_id {
 * from this exponent as follows:
 *
 *   80ns * 2^(period_exponent + 1)
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1960,6 +1976,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE _IO('i', 0x0)
 
@@ -1967,6 +1985,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE _IO('i', 

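As a worked example of the sampling period formula quoted in the
DRM_I915_PERF_PROP_OA_EXPONENT comment, 80ns * 2^(period_exponent + 1),
here is a small helper. The function and macro names are made up for
this note, and the 80ns base is taken as given from the comment;
hardware with a different timestamp frequency would presumably need a
different base.

```c
#include <assert.h>
#include <stdint.h>

/* Base tick of 80ns, taken from the formula in the uAPI comment. */
#define OA_BASE_PERIOD_NS 80ULL

/* Sampling period in nanoseconds: 80ns * 2^(exponent + 1). */
static uint64_t oa_exponent_to_period_ns(unsigned int exponent)
{
	/* 80 fits in 7 bits, so a shift of up to 56 cannot overflow 64 bits. */
	assert(exponent < 56);
	return OA_BASE_PERIOD_NS << (exponent + 1);
}
```

With this, an exponent of 0 gives 160ns, and an exponent of 16 gives
10485760ns, roughly 10.5ms between samples.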
[Intel-gfx] [CI 06/13] drm/i915/perf: move perf types to their own header

2019-09-09 Thread Lionel Landwerlin
Following a pattern used throughout the driver.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h| 300 +-
 drivers/gpu/drm/i915/i915_perf.h   |   2 +
 drivers/gpu/drm/i915/i915_perf_types.h | 328 +
 3 files changed, 331 insertions(+), 299 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 274a1193d4f0..f4145ae6ab6e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -92,6 +92,7 @@
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_gtt.h"
 #include "i915_gpu_error.h"
+#include "i915_perf_types.h"
 #include "i915_request.h"
 #include "i915_scheduler.h"
 #include "gt/intel_timeline.h"
@@ -979,305 +980,6 @@ struct intel_wm_config {
bool sprites_scaled;
 };
 
-struct i915_oa_format {
-   u32 format;
-   int size;
-};
-
-struct i915_oa_reg {
-   i915_reg_t addr;
-   u32 value;
-};
-
-struct i915_oa_config {
-   char uuid[UUID_STRING_LEN + 1];
-   int id;
-
-   const struct i915_oa_reg *mux_regs;
-   u32 mux_regs_len;
-   const struct i915_oa_reg *b_counter_regs;
-   u32 b_counter_regs_len;
-   const struct i915_oa_reg *flex_regs;
-   u32 flex_regs_len;
-
-   struct attribute_group sysfs_metric;
-   struct attribute *attrs[2];
-   struct device_attribute sysfs_metric_id;
-
-   atomic_t ref_count;
-};
-
-struct i915_perf_stream;
-
-/**
- * struct i915_perf_stream_ops - the OPs to support a specific stream type
- */
-struct i915_perf_stream_ops {
-   /**
-* @enable: Enables the collection of HW samples, either in response to
-* `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
-* without `I915_PERF_FLAG_DISABLED`.
-*/
-   void (*enable)(struct i915_perf_stream *stream);
-
-   /**
-* @disable: Disables the collection of HW samples, either in response
-* to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
-* the stream.
-*/
-   void (*disable)(struct i915_perf_stream *stream);
-
-   /**
-* @poll_wait: Call poll_wait, passing a wait queue that will be woken
-* once there is something ready to read() for the stream
-*/
-   void (*poll_wait)(struct i915_perf_stream *stream,
- struct file *file,
- poll_table *wait);
-
-   /**
-* @wait_unlocked: For handling a blocking read, wait until there is
-* something to ready to read() for the stream. E.g. wait on the same
-* wait queue that would be passed to poll_wait().
-*/
-   int (*wait_unlocked)(struct i915_perf_stream *stream);
-
-   /**
-* @read: Copy buffered metrics as records to userspace
-* **buf**: the userspace, destination buffer
-* **count**: the number of bytes to copy, requested by userspace
-* **offset**: zero at the start of the read, updated as the read
-* proceeds, it represents how many bytes have been copied so far and
-* the buffer offset for copying the next record.
-*
-* Copy as many buffered i915 perf samples and records for this stream
-* to userspace as will fit in the given buffer.
-*
-* Only write complete records; returning -%ENOSPC if there isn't room
-* for a complete record.
-*
-* Return any error condition that results in a short read such as
-* -%ENOSPC or -%EFAULT, even though these may be squashed before
-* returning to userspace.
-*/
-   int (*read)(struct i915_perf_stream *stream,
-   char __user *buf,
-   size_t count,
-   size_t *offset);
-
-   /**
-* @destroy: Cleanup any stream specific resources.
-*
-* The stream will always be disabled before this is called.
-*/
-   void (*destroy)(struct i915_perf_stream *stream);
-};
-
-/**
- * struct i915_perf_stream - state for a single open stream FD
- */
-struct i915_perf_stream {
-   /**
-* @dev_priv: i915 drm device
-*/
-   struct drm_i915_private *dev_priv;
-
-   /**
-* @wakeref: As we keep the device awake while the perf stream is
-* active, we track our runtime pm reference for later release.
-*/
-   intel_wakeref_t wakeref;
-
-   /**
-* @engine: Engine associated with this performance stream.
-*/
-   struct intel_engine_cs *engine;
-
-   /**
-* @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
-* properties given when opening a stream, representing the contents
-* of a single sample as read() by userspace.
-*/
-   u32 sample_flags;
-
-   /**
-* @sample_size: Considering the configured contents of 

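The @read documentation being moved above describes a "complete records
only" contract: copy as many records as fit, and return -ENOSPC rather
than write a partial one. A minimal userspace sketch of that contract
follows. It uses memcpy() where the kernel would copy to a user
pointer, and the function names and the fixed record size are
assumptions made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Append one record to buf only if it fits entirely, advancing *offset
 * by the number of bytes copied. Never writes a partial record.
 */
static int append_record(char *buf, size_t count, size_t *offset,
			 const void *record, size_t record_size)
{
	assert(*offset <= count);
	if (record_size > count - *offset)
		return -ENOSPC;

	memcpy(buf + *offset, record, record_size);
	*offset += record_size;
	return 0;
}

/* Copy whole records until one no longer fits; returns 0 or the first error. */
static int copy_records(char *buf, size_t count, size_t *offset,
			const char *records, size_t n_records,
			size_t record_size)
{
	size_t i;

	for (i = 0; i < n_records; i++) {
		int ret = append_record(buf, count, offset,
					records + i * record_size,
					record_size);
		if (ret)
			return ret;
	}
	return 0;
}
```

A caller that sees -ENOSPC with a non-zero *offset can report the bytes
copied so far instead of the error, which matches the remark that such
errors may be squashed before returning to userspace.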
[Intel-gfx] [CI 03/13] drm/i915/perf: drop list of streams

2019-09-09 Thread Lionel Landwerlin
At some point there was the idea that we could have multiple streams
from the same piece of HW, but that never materialized. Given the hard
time we already have making everything work with the submission side,
there is no real point in keeping this list of 1 element around.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_perf.c | 16 +---
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 */
struct drm_i915_private *dev_priv;
 
-   /**
-* @link: Links the stream into ``&drm_i915_private->streams``
-*/
-   struct list_head link;
-
/**
 * @wakeref: As we keep the device awake while the perf stream is
 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 * except exclusive_stream.
 */
struct mutex lock;
-   struct list_head streams;
 
/*
 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /*
-* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
if (stream->ops->destroy)
stream->ops->destroy(stream);
 
-   list_del(&stream->link);
-
if (stream->ctx)
i915_gem_context_put(stream->ctx);
 
@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
goto err_flags;
}
 
-   list_add(&stream->link, &dev_priv->perf.streams);
-
if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
f_flags |= O_CLOEXEC;
if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
stream_fd = anon_inode_getfd("[i915_perf]", &fops, stream, f_flags);
if (stream_fd < 0) {
ret = stream_fd;
-   goto err_open;
+   goto err_flags;
}
 
if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 
return stream_fd;
 
-err_open:
-   list_del(&stream->link);
 err_flags:
if (stream->ops->destroy)
stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
}
 
if (dev_priv->perf.ops.enable_metric_set) {
-   INIT_LIST_HEAD(&dev_priv->perf.streams);
mutex_init(&dev_priv->perf.lock);
 
oa_sample_rate_hard_limit = 1000 *
-- 
2.23.0


[Intel-gfx] [CI 11/13] drm/i915: add a new perf configuration execbuf parameter

2019-09-09 Thread Lionel Landwerlin
We want the ability to dispatch a set of command buffers to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ
execbuf3?) to indicate which OA configuration should be used for a batch
buffer. This requires the process making the execbuf with this flag to
also own the perf fd at the time of execbuf.

v2: Add an emit_oa_config() vfunc in the intel_engine_cs (Chris)
Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
Move OA config vma to active before writing the ringbuffer (Chris)

v4: Reuse i915_user_extension_fn
Serialize requests with OA config updates

v5: Check that the chained extension is only present once (Chris)
Unpin oa_vma in main path (Chris)

v6: Use BIT_ULL (Chris)

v7: Hold drm.struct_mutex when serializing the request with OA config (Chris)

v8: Remove active request from engine (Lionel)

v9: Move fetching OA configuration past engine pinning (Lionel)
Lock VMA before moving to active (Chris)

v10: Fix leak on perf_fd (Lionel)

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 147 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   4 +
 include/uapi/drm/i915_drm.h   |  39 +
 3 files changed, 188 insertions(+), 2 deletions(-)

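For readers following the changelog, the v5 and v6 items (each chained
extension may appear only once, tracked with BIT_ULL) boil down to a
64-bit "seen" mask. The sketch below shows that bookkeeping in
isolation. The ids, macro and function names are hypothetical and do
not match the real definitions in i915_drm.h or the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical extension ids, for illustration only. */
enum sketch_ext_id {
	SKETCH_EXT_TIMELINE_FENCES = 1,
	SKETCH_EXT_PERF = 2,
};

/* 64-bit safe bit macro; a plain (1 << n) would overflow int for large n. */
#define SKETCH_BIT_ULL(n) (1ULL << (n))

/* Record that an extension was seen; reject a second occurrence in the chain. */
static int sketch_note_extension(uint64_t *seen, unsigned int id)
{
	assert(id < 64);
	if (*seen & SKETCH_BIT_ULL(id))
		return -EINVAL;

	*seen |= SKETCH_BIT_ULL(id);
	return 0;
}

/* Non-zero when the extension was present in the chain. */
static int sketch_has_extension(uint64_t seen, unsigned int id)
{
	assert(id < 64);
	return (seen & SKETCH_BIT_ULL(id)) != 0;
}
```

Later stages can then test the mask, in the same way eb_get_oa_config()
returns early when the perf bit is not set in eb->extensions.flags.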
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 46ad8d9642d1..d416b60c94bb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
@@ -284,7 +285,12 @@ struct i915_execbuffer {
struct {
u64 flags; /** Available extensions parameters */
struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
+   struct drm_i915_gem_execbuffer_ext_perf perf_config;
} extensions;
+
+   struct file *perf_file;
+   struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */
+   struct i915_vma *oa_vma;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
return err;
 }
 
+
+static int
+eb_get_oa_config(struct i915_execbuffer *eb)
+{
+   struct drm_i915_gem_object *oa_bo;
+   int err = 0;
+
+   eb->perf_file = NULL;
+   eb->oa_config = NULL;
+   eb->oa_vma = NULL;
+
+   if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) == 0)
+   return 0;
+
+   eb->perf_file = fget(eb->extensions.perf_config.perf_fd);
+   if (!eb->perf_file)
+   return -EINVAL;
+
+   err = i915_mutex_lock_interruptible(&eb->i915->drm);
+   if (err)
+   return err;
+
+   if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream)
+   err = -EINVAL;
+
+   mutex_unlock(&eb->i915->drm.struct_mutex);
+
+   if (err)
+   return err;
+
+   if (eb->i915->perf.exclusive_stream->engine != eb->engine)
+   return -EINVAL;
+
+   err = i915_perf_get_oa_config_and_bo(
+   eb->i915->perf.exclusive_stream,
+   eb->extensions.perf_config.oa_config,
+   &eb->oa_config, &oa_bo);
+   if (err)
+   return err;
+
+   eb->oa_vma = i915_vma_instance(oa_bo,
+  &eb->engine->gt->ggtt->vm, NULL);
+   i915_gem_object_put(oa_bo);
+   if (IS_ERR(eb->oa_vma)) {
+   err = PTR_ERR(eb->oa_vma);
+   eb->oa_vma = NULL;
+   return err;
+   }
+
+   return 0;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 struct i915_vma *vma,
 unsigned int len)
@@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file *file)
spin_unlock(&file_priv->mm.lock);
 }
 
+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+   struct i915_perf_stream *perf_stream;
+   int err;
+
+   if (!eb->oa_config)
+   return 0;
+
+   perf_stream = eb->perf_file->private_data;
+
+   err = mutex_lock_interruptible(&perf_stream->config_mutex);
+   if (err)
+   return err;
+
+   err = i915_active_request_set(&perf_stream->active_config_rq,
+ eb->request);
+   if (err)
+   goto out;
+
+   /*
+* If the config hasn't changed, skip reconfiguring the HW (this is
+* subject to a delay we want to avoid as much as possible).
+*/
+   if (eb->oa_config == perf_stream->oa_config)
+   goto out;
+
+   i915_vma_lock(eb->oa_vma);
+   

[Intel-gfx] [CI 02/13] drm/i915: add syncobj timeline support

2019-09-09 Thread Lionel Landwerlin
Introduce new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 307 ++
 drivers/gpu/drm/i915/i915_drv.c   |   3 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   1 +
 include/uapi/drm/i915_drm.h   |  39 +++
 4 files changed, 293 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4f5fd946ab28..46ad8d9642d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -214,6 +214,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
+struct i915_eb_fences {
+   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+   struct dma_fence *dma_fence;
+   u64 value;
+   struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -276,6 +283,7 @@ struct i915_execbuffer {
 
struct {
u64 flags; /** Available extensions parameters */
+   struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
} extensions;
 };
 
@@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb,
 }
 
 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_eb_fences *fences, unsigned int n)
 {
-   while (n--)
-   drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+   while (n--) {
+   drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+   dma_fence_put(fences[n].dma_fence);
+   kfree(fences[n].chain_fence);
+   }
kvfree(fences);
 }
 
-static struct drm_syncobj **
-get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-   struct drm_file *file)
+static struct i915_eb_fences *
+get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences)
+{
+   struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences =
+   &eb->extensions.timeline_fences;
+   struct drm_i915_gem_exec_fence __user *user_fences;
+   struct i915_eb_fences *fences;
+   u64 __user *user_values;
+   u64 num_fences, num_user_fences = timeline_fences->fence_count;
+   unsigned long n;
+   int err;
+
+   /* Check multiplication overflow for access_ok() and kvmalloc_array() */
+   BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+   if (num_user_fences > min_t(unsigned long,
+   ULONG_MAX / sizeof(*user_fences),
+   SIZE_MAX / sizeof(*fences)))
+   return ERR_PTR(-EINVAL);
+
+   user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+   if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences)))
+   return ERR_PTR(-EFAULT);
+
+   user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+   if (!access_ok(user_values, num_user_fences * sizeof(*user_values)))
+   return ERR_PTR(-EFAULT);
+
+   fences = kvmalloc_array(num_user_fences, sizeof(*fences),
+   __GFP_NOWARN | GFP_KERNEL);
+   if (!fences)
+   return ERR_PTR(-ENOMEM);
+
+   BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+   for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) {
+   struct drm_i915_gem_exec_fence user_fence;
+   struct drm_syncobj *syncobj;
+   struct dma_fence *fence = NULL;
+   u64 point;
+
+   if (__copy_from_user(&user_fence, user_fences++, sizeof(user_fence))) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+   err = -EINVAL;
+   goto err;
+   }
+
+   if (__get_user(point, user_values++)) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   syncobj 

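The "check multiplication overflow" step at the top of
get_timeline_fence_array() is worth spelling out: the user-supplied
count is rejected before it is ever multiplied by an element size. The
sketch below reproduces the shape of that check in plain userspace C.
The function name is hypothetical, and size_t is used for both limits
here, whereas the patch compares against ULONG_MAX and SIZE_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True when count elements of either size can be multiplied out without
 * overflowing size_t. The count is compared against the smaller of the
 * two quotients, so no multiplication happens before validation.
 */
static bool count_fits(uint64_t count, size_t user_elem_size,
		       size_t kernel_elem_size)
{
	size_t limit_user, limit_kernel, limit;

	assert(user_elem_size > 0 && kernel_elem_size > 0);
	limit_user = SIZE_MAX / user_elem_size;
	limit_kernel = SIZE_MAX / kernel_elem_size;
	limit = limit_user < limit_kernel ? limit_user : limit_kernel;

	return count <= limit;
}
```

Dividing the maximum by the element size keeps the comparison itself
free of overflow, which would not be the case if the product were
computed first and checked afterwards.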
[Intel-gfx] [CI 07/13] drm/i915/perf: allow for CS OA configs to be created lazily

2019-09-09 Thread Lionel Landwerlin
Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer is
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configurations without having to open/close the i915/perf
stream.

v2: No need for locking on OA config object creation (Chris)
Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
(Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h  |   4 +-
 drivers/gpu/drm/i915/i915_perf.c | 270 ---
 drivers/gpu/drm/i915/i915_perf.h |  26 ++
 drivers/gpu/drm/i915/i915_perf_types.h   |  15 +-
 5 files changed, 273 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index fbad403ab7ac..b6373fbc927d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -135,6 +135,7 @@
 /* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */
 #define   MI_LRI_CS_MMIO   (1<<19)
 #define   MI_LRI_FORCE_POSTED  (1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT      (1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f4145ae6ab6e..7eb31923cde9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,8 +1363,8 @@ struct drm_i915_private {
struct mutex metrics_lock;
 
/*
-* List of dynamic configurations, you need to hold
-* dev_priv->perf.metrics_lock to access it.
+* List of dynamic configurations (struct i915_oa_config), you
+* need to hold dev_priv->perf.metrics_lock to access it.
 */
struct idr metrics_idr;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 40a1ec2bc96b..c9d0de3050fb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -367,11 +367,19 @@ struct perf_open_properties {
struct intel_engine_cs *engine;
 };
 
+struct i915_oa_config_bo {
+   struct list_head link;
+
+   struct i915_oa_config *oa_config;
+   struct drm_i915_gem_object *bo;
+};
+
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-  struct i915_oa_config *oa_config)
+void i915_oa_config_release(struct kref *ref)
 {
+   struct i915_oa_config *oa_config = container_of(ref, typeof(*oa_config), ref);
+
if (!PTR_ERR(oa_config->flex_regs))
kfree(oa_config->flex_regs);
if (!PTR_ERR(oa_config->b_counter_regs))
@@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
- struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
 {
-   if (!atomic_dec_and_test(&oa_config->ref_count))
-   return;
+   u32 i;
+
+   for (i = 0; i < n_regs; i++) {
+   if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+   u32 n_lri = min(n_regs - i,
+   (u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
 
-   free_oa_config(dev_priv, oa_config);
+   *cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+   }
+   *cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+   *cs++ = reg_data[i].value;
+   }
+
+   return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-int metrics_set,
-struct i915_oa_config **out_config)
+static struct i915_oa_config_bo* alloc_oa_config_buffer(struct drm_i915_private *i915,
+   struct i915_oa_config *oa_config)
 {
-   int ret;
+   struct i915_oa_config_bo *oa_bo;
+   size_t config_length = 0;
+   u32 *cs;
+   int err;
+
+   oa_bo = kzalloc(sizeof(*oa_bo), 

[Intel-gfx] [CI 04/13] drm/i915/perf: store the associated engine of a stream

2019-09-09 Thread Lionel Landwerlin
We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  | 5 +
 drivers/gpu/drm/i915/i915_perf.c | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 75607450ba00..274a1193d4f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1088,6 +1088,11 @@ struct i915_perf_stream {
 */
intel_wakeref_t wakeref;
 
+   /**
+* @engine: Engine associated with this performance stream.
+*/
+   struct intel_engine_cs *engine;
+
/**
 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 * properties given when opening a stream, representing the contents
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d18cd332afb7..9d5a3522aa35 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -363,6 +363,8 @@ struct perf_open_properties {
int oa_format;
bool oa_periodic;
int oa_period_exponent;
+
+   struct intel_engine_cs *engine;
 };
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
@@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 
format_size = dev_priv->perf.oa_formats[props->oa_format].size;
 
+   stream->engine = props->engine;
+
stream->sample_flags |= SAMPLE_OA_REPORT;
stream->sample_size += format_size;
 
@@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
return -EINVAL;
}
 
+   /* At the moment we only support using i915-perf on the RCS. */
+   props->engine = dev_priv->engine[RCS0];
+
/* Considering that ID = 0 is reserved and assuming that we don't
 * (currently) expect any configurations to ever specify duplicate
 * values for a particular property ID then the last _PROP_MAX value is
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [CI 08/13] drm/i915/perf: implement active wait for noa configurations

2019-09-09 Thread Lionel Landwerlin
NOA configurations take some time to apply, and that time depends on
the size of the GT. There is no documented figure for it. For example,
past experiments with powergating configuration changes seem to
indicate a 60~70us delay. We go with 500us as the default for now,
which should exceed the required time (according to HW architects).
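
To give a feel for what that delay means on the command streamer, here is a minimal sketch that converts a delay into CS timestamp ticks. It assumes the delay is expressed in nanoseconds and the timestamp frequency in kHz; the helper name noa_delay_ticks and the 12 MHz figure used in the note below are illustrative assumptions, not values taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a delay in nanoseconds into command
 * streamer timestamp ticks, rounding up, for a timestamp frequency
 * given in kHz:
 *
 *   ticks = ceil(delay_ns * freq_khz / 1000000)
 */
uint64_t noa_delay_ticks(uint64_t delay_ns, uint32_t freq_khz)
{
	assert(freq_khz != 0);
	/* Make sure the multiplication and the round-up cannot overflow. */
	assert(delay_ns <= (UINT64_MAX - 999999) / freq_khz);

	return (delay_ns * freq_khz + 999999) / 1000000;
}
```

With an assumed 12000 kHz timestamp clock, the 500us default (500000 ns) comes out as 6000 ticks, comfortably inside the 32-bit difference the wait loop relies on.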

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  24 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h |   5 +
 drivers/gpu/drm/i915/i915_debugfs.c  |  31 +++
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_perf.c | 234 ++-
 drivers/gpu/drm/i915/i915_perf_types.h   |   6 +
 drivers/gpu/drm/i915/i915_reg.h  |   4 +-
 7 files changed, 302 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index b6373fbc927d..e8ce44841868 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -160,6 +160,7 @@
 #define   MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */
 #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1)
 #define   MI_BATCH_RESOURCE_STREAMER (1<<10)
+#define   MI_BATCH_PREDICATE (1 << 15) /* HSW+ on RCS only*/
 
 /*
  * 3D instructions used by the kernel
@@ -238,6 +239,29 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH   (1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define MI_MATH(x) MI_INSTR(0x1a, (x)-1)
+#define   MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2))
+/* operands */
+#define   MI_ALU_OP_NOOP     0
+#define   MI_ALU_OP_LOAD     128
+#define   MI_ALU_OP_LOADINV  1152
+#define   MI_ALU_OP_LOAD0    129
+#define   MI_ALU_OP_LOAD1    1153
+#define   MI_ALU_OP_ADD      256
+#define   MI_ALU_OP_SUB      257
+#define   MI_ALU_OP_AND      258
+#define   MI_ALU_OP_OR       259
+#define   MI_ALU_OP_XOR      260
+#define   MI_ALU_OP_STORE    384
+#define   MI_ALU_OP_STOREINV 1408
+/* sources */
+#define   MI_ALU_SRC_REG(x)  (x) /* 0 -> 15 */
+#define   MI_ALU_SRC_SRCA    32
+#define   MI_ALU_SRC_SRCB    33
+#define   MI_ALU_SRC_ACCU    49
+#define   MI_ALU_SRC_ZF      50
+#define   MI_ALU_SRC_CF      51
+
 /*
  * Commands used only by the command parser
  */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index dc295c196d11..f752b6cf9ea1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -97,6 +97,11 @@ enum intel_gt_scratch_field {
/* 8 bytes */
INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
 
+   /* 6 * 8 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048,
+
+   /* 4 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096,
 };
 
 #endif /* __INTEL_GT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 708855e051b5..cc17d5c2295f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3578,6 +3578,36 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
i915_wedged_get, i915_wedged_set,
"%llu\n");
 
+static int
+i915_perf_noa_delay_set(void *data, u64 val)
+{
+   struct drm_i915_private *i915 = data;
+
+   /* This would lead to infinite waits as we're doing timestamp
+* difference on the CS with only 32bits.
+*/
+   if (val > mul_u32_u32(U32_MAX, RUNTIME_INFO(i915)->cs_timestamp_frequency_khz))
+   return -EINVAL;
+
+   atomic64_set(&i915->perf.noa_programming_delay, val);
+   return 0;
+}
+
+static int
+i915_perf_noa_delay_get(void *data, u64 *val)
+{
+   struct drm_i915_private *i915 = data;
+
+   *val = atomic64_read(&i915->perf.noa_programming_delay);
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops,
+   i915_perf_noa_delay_get,
+   i915_perf_noa_delay_set,
+   "%llu\n");
+
+
 #define DROP_UNBOUND   BIT(0)
 #define DROP_BOUND BIT(1)
 #define DROP_RETIRE BIT(2)
@@ -4354,6 +4384,7 @@ static const struct i915_debugfs_files {
const char *name;
const struct file_operations *fops;
 } i915_debugfs_files[] = {
+   {"i915_perf_noa_delay", &i915_perf_noa_delay_fops},
{"i915_wedged", &i915_wedged_fops},
{"i915_cache_sharing", &i915_cache_sharing_fops},
{"i915_gem_drop_caches", &i915_drop_caches_fops},
diff --git 

[Intel-gfx] [CI 10/13] drm/i915/perf: execute OA configuration from command stream

2019-09-09 Thread Lionel Landwerlin
We haven't run into issues with programming the global OA/NOA
register configuration from the CPU so far, but HW engineers actually
recommend doing this from the command streamer. On TGL in particular,
one of the clock domains in which some of that programming goes might
not be powered when we poke things from the CPU.

Since we have a command buffer prepared for the execbuffer side of
things, we can reuse that approach here too.

This also allows us to significantly reduce the amount of time we hold
the main lock.

v2: Drop the global lock as much as possible

v3: Take global lock to pin global

v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel)

v5: Move locking to the stream (Lionel)

v6: Move active reconfiguration request into i915_perf_stream (Lionel)

v7: Pin VMA outside request creation (Chris)
Lock VMA before move to active (Chris)

v8: Fix double free on stream->initial_oa_config_bo (Lionel)
Don't allow interruption when waiting on active config request
(Lionel)

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_perf.c   | 170 -
 drivers/gpu/drm/i915/i915_perf_types.h |  15 ++-
 2 files changed, 124 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f2b778d84b52..8e3532518139 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1558,18 +1558,23 @@ free_oa_configs(struct i915_perf_stream *stream)
 static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
struct drm_i915_private *dev_priv = stream->dev_priv;
+   int err;
 
BUG_ON(stream != dev_priv->perf.exclusive_stream);
 
-   /*
-* Unset exclusive_stream first, it will be checked while disabling
-* the metric set on gen8+.
-*/
mutex_lock(&dev_priv->drm.struct_mutex);
-   dev_priv->perf.exclusive_stream = NULL;
+   mutex_lock(&stream->config_mutex);
dev_priv->perf.ops.disable_metric_set(stream);
+   err = i915_active_request_retire(&stream->active_config_rq, 0,
+   &stream->config_mutex);
+   mutex_unlock(&stream->config_mutex);
+   dev_priv->perf.exclusive_stream = NULL;
mutex_unlock(&dev_priv->drm.struct_mutex);
 
+   if (err)
+   DRM_ERROR("Failed to disable perf stream\n");
+
+
free_oa_buffer(stream);
free_noa_wait(stream);
 
@@ -1795,6 +1800,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
return PTR_ERR(bo);
}
 
+   ret = i915_mutex_lock_interruptible(&i915->drm);
+   if (ret)
+   goto err_unref;
+
/*
 * We pin in GGTT because we jump into this buffer now because
 * multiple OA config BOs will have a jump to this address and it
@@ -1802,10 +1811,13 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 */
vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0);
if (IS_ERR(vma)) {
+   mutex_unlock(&i915->drm.struct_mutex);
ret = PTR_ERR(vma);
goto err_unref;
}
 
+   mutex_unlock(&i915->drm.struct_mutex);
+
batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
if (IS_ERR(batch)) {
ret = PTR_ERR(batch);
@@ -1939,7 +1951,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
return 0;
 
 err_unpin:
-   __i915_vma_unpin(vma);
+   mutex_lock(&i915->drm.struct_mutex);
+   i915_vma_unpin_and_release(&vma, 0);
+   mutex_unlock(&i915->drm.struct_mutex);
 
 err_unref:
i915_gem_object_put(bo);
@@ -1947,50 +1961,73 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
return ret;
 }
 
-static void config_oa_regs(struct drm_i915_private *dev_priv,
-  const struct i915_oa_reg *regs,
-  u32 n_regs)
+static int emit_oa_config(struct drm_i915_private *i915,
+ struct i915_perf_stream *stream)
 {
-   u32 i;
+   struct i915_request *rq;
+   struct i915_vma *vma;
+   u32 *cs;
+   int err;
 
-   for (i = 0; i < n_regs; i++) {
-   const struct i915_oa_reg *reg = regs + i;
+   lockdep_assert_held(&stream->config_mutex);
+
+   vma = i915_vma_instance(stream->initial_oa_config_bo,
+   &stream->engine->gt->ggtt->vm, NULL);
+   if (unlikely(IS_ERR(vma)))
+   return PTR_ERR(vma);
 
-   I915_WRITE(reg->addr, reg->value);
+   err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+   if (err)
+   goto err_vma_unpin;
+
+   rq = i915_request_create(stream->engine->kernel_context);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto err_add_request;
}
-}
 
-static void delay_after_mux(void)
-{
-   /*
-* It apparently takes a fairly long time for a new MUX
-* configuration to be be applied after these register writes.
-

[Intel-gfx] [CI 01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Lionel Landwerlin
We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.
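
The general shape of such an extension mechanism can be sketched as a walk over a chained list of headers, each dispatched through a handler table. The types and names below (ext_node, ext_fn, walk_extensions, count_ext) are hypothetical stand-ins for illustration and do not reproduce the kernel's i915_user_extensions() helper; the 512-node cap is likewise an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chained extension header. */
struct ext_node {
	const struct ext_node *next; /* NULL terminates the chain */
	uint32_t name;               /* index into the handler table */
};

typedef int (*ext_fn)(const struct ext_node *ext, void *data);

/* Example handler: just counts how many extensions were seen. */
int count_ext(const struct ext_node *ext, void *data)
{
	(void)ext;
	(*(int *)data)++;
	return 0;
}

/*
 * Walk the chain and call the handler registered for each node.
 * Unknown names are rejected, and the walk is capped so that a
 * circular chain cannot loop forever.
 */
int walk_extensions(const struct ext_node *ext, const ext_fn *tbl,
		    size_t count, void *data)
{
	unsigned int budget = 512; /* arbitrary cap, an assumption */

	while (ext) {
		int err;

		if (!budget--)
			return -E2BIG;
		if (ext->name >= count || !tbl[ext->name])
			return -EINVAL;

		err = tbl[ext->name](ext, data);
		if (err)
			return err;

		ext = ext->next;
	}

	return 0;
}
```

Note that the handler table added by this patch is empty, so under this model any extension passed in at this point in the series would be rejected.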

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v1)
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 ++-
 include/uapi/drm/i915_drm.h   | 26 +++--
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 27dbcb508055..4f5fd946ab28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 
 enum {
FORCE_CPU_RELOC = 1,
@@ -272,6 +273,10 @@ struct i915_execbuffer {
 */
int lut_size;
struct hlist_head *buckets; /** ht for relocation handles */
+
+   struct {
+   u64 flags; /** Available extensions parameters */
+   } extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
return false;
 
/* Kernel clipping was a DRI1 misfeature */
-   if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+   if (!(exec->flags & (I915_EXEC_FENCE_ARRAY |
+I915_EXEC_USE_EXTENSIONS))) {
if (exec->num_cliprects || exec->cliprects_ptr)
return false;
}
@@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb,
}
 }
 
+static const i915_user_extension_fn execbuf_extensions[] = {
+};
+
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+ struct i915_execbuffer *eb)
+{
+   eb->extensions.flags = 0;
+
+   if (!(args->flags & I915_EXEC_USE_EXTENSIONS))
+   return 0;
+
+   /* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot
+* have another flag also using it at the same time.
+*/
+   if (eb->args->flags & I915_EXEC_FENCE_ARRAY)
+   return -EINVAL;
+
+   if (args->num_cliprects != 0)
+   return -EINVAL;
+
+   return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr),
+   execbuf_extensions,
+   ARRAY_SIZE(execbuf_extensions),
+   eb);
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
   struct drm_file *file,
@@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (args->flags & I915_EXEC_IS_PINNED)
eb.batch_flags |= I915_DISPATCH_PINNED;
 
+   err = parse_execbuf2_extensions(args, &eb);
+   if (err)
+   return err;
+
if (args->flags & I915_EXEC_FENCE_IN) {
in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 469dc512cca3..0a99c26730e1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence {
__u32 flags;
 };
 
+enum drm_i915_gem_execbuffer_ext {
+   DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */
+};
+
 struct drm_i915_gem_execbuffer2 {
/**
 * List of gem_exec_object2 structs
@@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 {
__u32 num_cliprects;
/**
 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-* is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
-* struct drm_i915_gem_exec_fence *fences.
+* & I915_EXEC_USE_EXTENSIONS are not set.
+*
+* If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+* of struct drm_i915_gem_exec_fence and num_cliprects is the length
+* of the array.
+*
+* If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
+* single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is
+* 0.
 */
__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK  (0x3f)
@@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT (1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_USE_EXTENSIONS implies that
+ * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to an linked
+ * list of i915_user_extension. Each i915_user_extension node is the base of a
+ * larger structure. The list of supported structures are listed in the
+ * drm_i915_gem_execbuffer_ext enum.
+ */
+#define 

[Intel-gfx] [CI 13/13] drm/i915: add support for perf configuration queries

2019-09-09 Thread Lionel Landwerlin
Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.
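
The query follows a "count first, then copy" convention, visible in copy_perf_config_registers_or_number() in the diff below. Here is a minimal userspace-style sketch of that convention; reg_pair and copy_regs_or_count are hypothetical names, and plain memory writes stand in for the kernel's user-copy helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (offset, value) register pair. */
struct reg_pair {
	uint32_t offset;
	uint32_t value;
};

/*
 * If *n_dst is 0, only report how many registers exist. Otherwise the
 * caller's buffer must hold at least n_src pairs, stored as two
 * consecutive u32 values (offset, then value) per register.
 */
int copy_regs_or_count(const struct reg_pair *src, uint32_t n_src,
		       uint32_t *dst, uint32_t *n_dst)
{
	uint32_t r;

	if (*n_dst == 0) {
		*n_dst = n_src;
		return 0;
	}

	if (*n_dst < n_src)
		return -EINVAL;

	*n_dst = n_src;

	for (r = 0; r < n_src; r++) {
		dst[2 * r] = src[r].offset;
		dst[2 * r + 1] = src[r].value;
	}

	return 0;
}
```

A caller would typically invoke it once with a zero count to size its buffer, then a second time with the buffer to receive the contents.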

v2: Fix sparse warnings (Lionel)
Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
Fix unlocking when not locked issue (Lionel)
Add debug messages (Lionel)

v4: Fix missing unlock (Dan)

v5: Drop lock when copying config content to userspace (Chris)

v6: Drop lock when copying config list to userspace (Chris)
Fix deadlock when calling i915_perf_get_oa_config() under
perf.metrics_lock (Lionel)
Add i915_oa_config_get() (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/i915_perf.c  |   3 +
 drivers/gpu/drm/i915/i915_query.c | 283 ++
 include/uapi/drm/i915_drm.h   |  65 ++-
 4 files changed, 354 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2c6f37219dff..eab42269fc5b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1368,6 +1368,12 @@ struct drm_i915_private {
 */
struct idr metrics_idr;
 
+   /*
+* Number of dynamic configurations, you need to hold
+* dev_priv->perf.metrics_lock to access it.
+*/
+   u32 n_metrics;
+
/*
 * Lock associated with anything below within this structure
 * except exclusive_stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 7adc518912bb..372cdf2e7ec8 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3915,6 +3915,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
goto sysfs_err;
}
 
+   dev_priv->perf.n_metrics++;
+
mutex_unlock(&dev_priv->perf.metrics_lock);
 
DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
@@ -3975,6 +3977,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
   &oa_config->sysfs_metric);
 
idr_remove(&dev_priv->perf.metrics_idr, *arg);
+   dev_priv->perf.n_metrics--;
 
mutex_unlock(&dev_priv->perf.metrics_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index abac5042da2b..89b2821be4a0 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -7,6 +7,7 @@
 #include 
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 #include "i915_query.h"
 #include 
 
@@ -140,10 +141,292 @@ query_engine_info(struct drm_i915_private *i915,
return len;
 }
 
+static int can_copy_perf_config_registers_or_number(u32 user_n_regs,
+   u64 user_regs_ptr,
+   u32 kernel_n_regs)
+{
+   /*
+* We'll just put the number of registers, and won't copy the
+* register.
+*/
+   if (user_n_regs == 0)
+   return 0;
+
+   if (user_n_regs < kernel_n_regs)
+   return -EINVAL;
+
+   if (!access_ok(u64_to_user_ptr(user_regs_ptr),
+  2 * sizeof(u32) * kernel_n_regs))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs,
+   u32 kernel_n_regs,
+   u64 user_regs_ptr,
+   u32 *user_n_regs)
+{
+   u32 r;
+
+   if (*user_n_regs == 0) {
+   *user_n_regs = kernel_n_regs;
+   return 0;
+   }
+
+   *user_n_regs = kernel_n_regs;
+
+   for (r = 0; r < kernel_n_regs; r++) {
+   u32 __user *user_reg_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2);
+   u32 __user *user_val_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 +
+   sizeof(u32));
+   int ret;
+
+   ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr),
+user_reg_ptr);
+   if (ret)
+   return -EFAULT;
+
+   ret = __put_user(kernel_regs[r].value, user_val_ptr);
+   if (ret)
+   return -EFAULT;
+   }
+
+   return 0;
+}
+
+static int query_perf_config_data(struct drm_i915_private *i915,
+ struct drm_i915_query_item *query_item,
+ bool use_uuid)

[Intel-gfx] [CI 09/13] drm/i915: add wait flags to i915_active_request_retire

2019-09-09 Thread Lionel Landwerlin
An upcoming change needs to wait on an active request without being
interrupted, so let the caller pass the wait flags.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_active.c | 4 +++-
 drivers/gpu/drm/i915/i915_active.h | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 6a447f1d0110..c808c28c9464 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref)
break;
}
 
-   err = i915_active_request_retire(&it->base, BKL(ref));
+   err = i915_active_request_retire(&it->base,
+I915_WAIT_INTERRUPTIBLE,
+BKL(ref));
if (err)
break;
}
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index f95058f99057..35a6089b44fd 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request *active)
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
+  unsigned int flags,
   struct mutex *mutex)
 {
struct i915_request *request;
@@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request *active,
if (!request)
return 0;
 
-   ret = i915_request_wait(request,
-   I915_WAIT_INTERRUPTIBLE,
-   MAX_SCHEDULE_TIMEOUT);
+   ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
 
-- 
2.23.0


[Intel-gfx] [CI 12/13] drm/i915/perf: allow holding preemption on filtered ctx

2019-09-09 Thread Lionel Landwerlin
We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_REPORT_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_REPORT_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_REPORT_PERF_COUNT.
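
The simplified scheme boils down to a per-counter subtraction of two snapshots. A minimal sketch follows, assuming each snapshot is a flat array of free-running 32-bit counters; the actual OA report layout and counter widths are not described in this message, and counter_deltas is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute per-counter deltas between a begin and an end snapshot.
 * Unsigned subtraction wraps modulo 2^32, so a counter that overflowed
 * once between the two snapshots still yields the right delta.
 */
void counter_deltas(const uint32_t *begin, const uint32_t *end,
		    uint32_t *delta, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		delta[i] = end[i] - begin[i];
}
```

This only holds if nothing unrelated runs between the two snapshots, which is what holding preemption on the filtered context is meant to ensure.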

Disabling preemption only applies to the single context for which we
want to query performance counters, and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

v4: Just modify the port priority reporting function

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +
 drivers/gpu/drm/i915/i915_perf.c  | 31 +--
 drivers/gpu/drm/i915/i915_perf_types.h|  8 +
 include/uapi/drm/i915_drm.h   | 11 +++
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d416b60c94bb..33df58e681fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb)
if (err)
goto out;
 
+   /*
+* If the perf stream was opened with hold preemption, flag the
+* request properly so that the priority of the request is bumped once
+* it reaches the execlist ports.
+*/
+   if (eb->i915->perf.exclusive_stream->hold_preemption)
+   eb->request->flags |= I915_REQUEST_NOPREEMPT;
+
/*
 * If the config hasn't changed, skip reconfiguring the HW (this is
 * subject to a delay we want to avoid has much as possible).
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 8e3532518139..7adc518912bb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -343,6 +343,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * struct perf_open_properties - for validated properties given to open a stream
  * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
  * @single_context: Whether a single or all gpu contexts should be monitored
+ * @hold_preemption: Whether the preemption is disabled for the filtered
+ *   context
  * @ctx_handle: A gem ctx handle for use with @single_context
  * @metrics_set: An ID for an OA unit metric set advertised via sysfs
  * @oa_format: An OA unit HW report format
@@ -357,6 +359,7 @@ struct perf_open_properties {
u32 sample_flags;
 
u64 single_context:1;
+   u64 hold_preemption:1;
u64 ctx_handle;
 
/* OA sampling state */
@@ -2632,6 +2635,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
if (WARN_ON(stream->oa_buffer.format_size == 0))
return -EINVAL;
 
+   stream->hold_preemption = props->hold_preemption;
+
stream->oa_buffer.format =
dev_priv->perf.oa_formats[props->oa_format].format;
 
@@ -3187,6 +3192,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
}
}
 
+   if (props->hold_preemption) {
+   if (!props->single_context) {
+   DRM_DEBUG("preemption disable with no context\n");
+   ret = -EINVAL;
+   goto err;
+   }
+   privileged_op = true;
+   }
+
/*
 * On Haswell the OA unit supports clock gating off for a specific
 * context and in this mode there's no visibility of metrics for the
@@ -3201,8 +3215,9 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 * enable the OA unit by default.
 */
-   if (IS_HASWELL(dev_priv) && specific_ctx)
+   if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) {
privileged_op = false;
+   }
 
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 * we check a 

[Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting

2019-09-09 Thread Chris Wilson
And give up if we never even make it to the start.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 tests/perf_pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index d392a67d4..8a06e5d44 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
while (!igt_spin_has_started(spin)) {
unsigned long t = igt_nsec_elapsed();
 
+   igt_assert(gem_bo_busy(fd, spin->handle));
if ((t - timeout) > 250e6) {
timeout = t;
igt_warn("Spinner not running after %.2fms\n",
 (double)t / 1e6);
+   igt_assert(t < 2e9);
}
}
} else {
@@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
usleep(500e3); /* Better than nothing! */
}
 
+   igt_assert(gem_bo_busy(fd, spin->handle));
return igt_nsec_elapsed();
 }
 
-- 
2.23.0


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66418/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
66b565b57b3f drm/i915: introduce a mechanism to extend execbuf2
-:141: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#141: FILE: include/uapi/drm/i915_drm.h:1165:
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_USE_EXTENSIONS<<1))
  ^

total: 0 errors, 0 warnings, 1 checks, 113 lines checked
503c88dc3bc0 drm/i915: add syncobj timeline support
-:25: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#25: 
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html

-:381: WARNING:TYPO_SPELLING: 'transfered' may be misspelled - perhaps 'transferred'?
#381: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2616:
+* The chain's ownership is transfered to the

-:412: ERROR:CODE_INDENT: code indent should use tabs where possible
#412: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2647:
+[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,$

-:412: WARNING:LEADING_SPACE: please, no spaces at the start of a line
#412: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2647:
+[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,$

total: 1 errors, 3 warnings, 0 checks, 541 lines checked
66b65143aa4d drm/i915/perf: drop list of streams
8aca4673ec28 drm/i915/perf: store the associated engine of a stream
8db92539084e drm/i915/perf: introduce a versioning of the i915-perf uapi
8bb8be52ca97 drm/i915/perf: move perf types to their own header
-:342: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#342: 
new file mode 100644

-:347: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#347: FILE: drivers/gpu/drm/i915/i915_perf_types.h:1:
+/*

-:348: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use 
line 1 instead
#348: FILE: drivers/gpu/drm/i915/i915_perf_types.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 3 warnings, 0 checks, 648 lines checked
da1c41cf2065 drm/i915/perf: allow for CS OA configs to be created lazily
-:103: CHECK:SPACING: No space is necessary after a cast
#103: FILE: drivers/gpu/drm/i915/i915_perf.c:399:
+   (u32) MI_LOAD_REGISTER_IMM_MAX_REGS);

-:118: ERROR:POINTER_LOCATION: "foo* bar" should be "foo *bar"
#118: FILE: drivers/gpu/drm/i915/i915_perf.c:410:
+static struct i915_oa_config_bo* alloc_oa_config_buffer(struct 
drm_i915_private *i915,

total: 1 errors, 0 warnings, 1 checks, 507 lines checked
4ef9530de87e drm/i915/perf: implement active wait for noa configurations
-:43: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#43: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:242:
+#define MI_MATH(x) MI_INSTR(0x1a, (x)-1)
  ^

-:122: CHECK:LINE_SPACING: Please don't use multiple blank lines
#122: FILE: drivers/gpu/drm/i915/i915_debugfs.c:3610:
+
+

-:181: CHECK:LINE_SPACING: Please don't use multiple blank lines
#181: FILE: drivers/gpu/drm/i915/i915_perf.c:460:
 
+

-:234: CHECK:PREFER_KERNEL_TYPES: Prefer kernel type 'u32' over 'uint32_t'
#234: FILE: drivers/gpu/drm/i915/i915_perf.c:1758:
+   uint32_t d;

-:260: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#260: FILE: drivers/gpu/drm/i915/i915_perf.c:1784:
+   DIV64_U64_ROUND_UP(

-:285: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#285: FILE: drivers/gpu/drm/i915/i915_perf.c:1809:
+   batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB);

-:293: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#293: FILE: drivers/gpu/drm/i915/i915_perf.c:1817:
+   cs = save_restore_register(

-:297: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#297: FILE: drivers/gpu/drm/i915/i915_perf.c:1821:
+   cs = save_restore_register(

-:397: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#397: FILE: drivers/gpu/drm/i915/i915_perf.c:1921:
+   cs = save_restore_register(

-:401: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#401: FILE: drivers/gpu/drm/i915/i915_perf.c:1925:
+   cs = save_restore_register(

total: 0 errors, 0 warnings, 10 checks, 420 lines checked
0432b1e0d15d drm/i915: add wait flags to i915_active_request_retire
647d2458b7c3 drm/i915/perf: execute OA configuration from command stream
-:66: CHECK:LINE_SPACING: Please don't use multiple blank lines
#66: FILE: drivers/gpu/drm/i915/i915_perf.c:1577:
+
+

total: 0 errors, 0 warnings, 1 checks, 311 lines checked
8a629929451c drm/i915: add a new perf configuration execbuf parameter
-:27: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit 

Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 1/2] tools/l3_parity: Unnest exit handlers

2019-09-09 Thread Petri Latvala
On Sat, Sep 07, 2019 at 07:12:56PM +0100, Chris Wilson wrote:
> The curse of using libigt where it is not wanted; in this case calling
> drop-caches while we hold the forcewake is a recipe for a long wait.
> 
> Signed-off-by: Chris Wilson 

For the series:
Reviewed-by: Petri Latvala 


> ---
>  tools/intel_l3_parity.c | 50 -
>  1 file changed, 29 insertions(+), 21 deletions(-)
> 
> diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c
> index 06a185c91..340f94b1a 100644
> --- a/tools/intel_l3_parity.c
> +++ b/tools/intel_l3_parity.c
> @@ -180,6 +180,7 @@ int main(int argc, char *argv[])
>   const char *path[REAL_MAX_SLICES] = {"l3_parity", "l3_parity_slice_1"};
>   int row = 0, bank = 0, sbank = 0;
>   int fd[REAL_MAX_SLICES] = {0}, ret, i;
> + int exitcode = EXIT_FAILURE;
>   int action = '0';
>   int daemonize = 0;
>   int device, dir;
> @@ -198,13 +199,13 @@ int main(int argc, char *argv[])
>   fd[i] = openat(dir, path[i], O_RDWR);
>   if (fd[i] < 0) {
>   if (i == 0) /* at least one slice must be supported */
> - exit(77);
> + goto skip;
>   continue;
>   }
>  
>   if (read(fd[i], l3logs[i], NUM_REGS * sizeof(uint32_t)) < 0) {
>   perror(path[i]);
> - exit(77);
> + goto skip;
>   }
>   assert(lseek(fd[i], 0, SEEK_SET) == 0);
>   }
> @@ -252,45 +253,45 @@ int main(int argc, char *argv[])
>   case '?':
>   case 'h':
>   usage(argv[0]);
> - exit(EXIT_SUCCESS);
> + goto success;
>   case 'H':
>   printf("Number of slices: %d\n", MAX_SLICES);
>   printf("Number of banks: %d\n", num_banks());
>   printf("Subbanks per bank: %d\n", NUM_SUBBANKS);
>   printf("Max L3 size: %dK\n", L3_SIZE >> 10);
>   printf("Has error injection: %s\n", 
> IS_HASWELL(devid) ? "yes" : "no");
> - exit(EXIT_SUCCESS);
> + goto success;
>   case 'r':
>   row = atoi(optarg);
>   if (row >= MAX_ROW)
> - exit(EXIT_FAILURE);
> + goto failure;
>   break;
>   case 'b':
>   bank = atoi(optarg);
>   if (bank >= num_banks() || bank >= 
> MAX_BANKS_PER_SLICE)
> - exit(EXIT_FAILURE);
> + goto failure;
>   break;
>   case 's':
>   sbank = atoi(optarg);
>   if (sbank >= NUM_SUBBANKS)
> - exit(EXIT_FAILURE);
> + goto failure;
>   break;
>   case 'w':
>   which_slice = atoi(optarg);
>   if (which_slice >= MAX_SLICES)
> - exit(EXIT_FAILURE);
> + goto failure;
>   break;
>   case 'i':
>   case 'u':
>   if (!IS_HASWELL(devid)) {
>   fprintf(stderr, "Error injection 
> supported on HSW+ only\n");
> - exit(EXIT_FAILURE);
> + goto failure;
>   }
>   case 'd':
>   if (optarg) {
>   ret = sscanf(optarg, "%d,%d,%d", , 
> , );
>   if (ret != 3)
> - exit(EXIT_FAILURE);
> + goto failure;
>   }
>   case 'a':
>   case 'l':
> @@ -298,24 +299,24 @@ int main(int argc, char *argv[])
>   case 'L':
>   if (action != '0') {
>   fprintf(stderr, "Only one action may be 
> specified\n");
> - exit(EXIT_FAILURE);
> + goto failure;
>   }
>   action = c;
>   break;
>   default:
> - abort();
> +   

[Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to 
extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66406/
State : failure

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/i915_perf.o
drivers/gpu/drm/i915/i915_perf.c: In function ‘i915_oa_stream_init’:
drivers/gpu/drm/i915/i915_perf.c:2703:3: error: ignoring return value of 
‘i915_active_request_retire’, declared with attribute warn_unused_result 
[-Werror=unused-result]
   i915_active_request_retire(>active_config_rq, 0,
   ^~~~
 >config_mutex);
 ~~
cc1: all warnings being treated as errors
scripts/Makefile.build:280: recipe for target 
'drivers/gpu/drm/i915/i915_perf.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_perf.o] Error 1
scripts/Makefile.build:497: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:497: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:497: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1083: recipe for target 'drivers' failed
make: *** [drivers] Error 2


[Intel-gfx] [CI 12/13] drm/i915/perf: allow holding preemption on filtered ctx

2019-09-09 Thread Lionel Landwerlin
We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.
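
As a rough illustration of the simplified scheme, the sketch below computes
per-counter deltas between two snapshots. It assumes, hypothetically, that
each snapshot is a flat array of free-running 32-bit counters; the real OA
report layout depends on the selected format and is not modeled here, and
snapshot_delta() is an illustrative name, not a function from this series.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical snapshot layout: n free-running 32-bit counters.
 * Unsigned subtraction yields the right delta across a single wraparound.
 */
static void snapshot_delta(const uint32_t *begin, const uint32_t *end,
                           uint32_t *delta, size_t n)
{
    for (size_t i = 0; i < n; i++)
        delta[i] = end[i] - begin[i];
}
```

With no other context able to run between the two snapshots, nothing has to
be filtered out of the delta afterwards.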

Disabling preemption only applies to the single context for which we
want to query performance counters and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

v4: Just modify the port priority reporting function

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +
 drivers/gpu/drm/i915/i915_perf.c  | 31 +--
 drivers/gpu/drm/i915/i915_perf_types.h|  8 +
 include/uapi/drm/i915_drm.h   | 11 +++
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d416b60c94bb..33df58e681fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb)
if (err)
goto out;
 
+   /*
+* If the perf stream was opened with hold preemption, flag the
+* request properly so that the priority of the request is bumped once
+* it reaches the execlist ports.
+*/
+   if (eb->i915->perf.exclusive_stream->hold_preemption)
+   eb->request->flags |= I915_REQUEST_NOPREEMPT;
+
/*
 * If the config hasn't changed, skip reconfiguring the HW (this is
 * subject to a delay we want to avoid has much as possible).
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index abbcf3ec654c..fd12318e7a90 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -343,6 +343,8 @@ static const struct i915_oa_format 
gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * struct perf_open_properties - for validated properties given to open a 
stream
  * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
  * @single_context: Whether a single or all gpu contexts should be monitored
+ * @hold_preemption: Whether the preemption is disabled for the filtered
+ *   context
  * @ctx_handle: A gem ctx handle for use with @single_context
  * @metrics_set: An ID for an OA unit metric set advertised via sysfs
  * @oa_format: An OA unit HW report format
@@ -357,6 +359,7 @@ struct perf_open_properties {
u32 sample_flags;
 
u64 single_context:1;
+   u64 hold_preemption:1;
u64 ctx_handle;
 
/* OA sampling state */
@@ -2632,6 +2635,8 @@ static int i915_oa_stream_init(struct i915_perf_stream 
*stream,
if (WARN_ON(stream->oa_buffer.format_size == 0))
return -EINVAL;
 
+   stream->hold_preemption = props->hold_preemption;
+
stream->oa_buffer.format =
dev_priv->perf.oa_formats[props->oa_format].format;
 
@@ -3191,6 +3196,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
}
}
 
+   if (props->hold_preemption) {
+   if (!props->single_context) {
+   DRM_DEBUG("preemption disable with no context\n");
+   ret = -EINVAL;
+   goto err;
+   }
+   privileged_op = true;
+   }
+
/*
 * On Haswell the OA unit supports clock gating off for a specific
 * context and in this mode there's no visibility of metrics for the
@@ -3205,8 +3219,9 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 * enable the OA unit by default.
 */
-   if (IS_HASWELL(dev_priv) && specific_ctx)
+   if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) {
privileged_op = false;
+   }
 
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 * we check a 

[Intel-gfx] [CI 08/13] drm/i915/perf: implement active wait for noa configurations

2019-09-09 Thread Lionel Landwerlin
NOA configurations take some amount of time to apply. That amount of
time depends on the size of the GT. There is no documented time for
this. For example, past experiments with powergating
configuration changes seem to indicate a 60~70us delay. We go with
500us as default for now which should be over the required amount of
time (according to HW architects).
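
To make the units concrete, here is a minimal sketch of converting such a
delay into command streamer timestamp ticks. It assumes the delay is
expressed in nanoseconds and the timestamp frequency in kHz; the helper name
noa_delay_ticks() and the 12000 kHz figure in the note below are
illustrative and not taken from the patch.

```c
#include <stdint.h>

/*
 * Convert a delay in nanoseconds to timestamp ticks, rounding up.
 * ticks = ns * (kHz * 1000 ticks/s) / (1e9 ns/s) = ns * kHz / 1e6
 * Assumes delay_ns * freq_khz fits in 64 bits.
 */
static uint64_t noa_delay_ticks(uint64_t delay_ns, uint32_t freq_khz)
{
    return (delay_ns * freq_khz + 999999ull) / 1000000ull;
}
```

Under these assumptions a 500us delay (500000 ns) at 12000 kHz comes out as
6000 ticks.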

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  24 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h |   5 +
 drivers/gpu/drm/i915/i915_debugfs.c  |  31 +++
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_perf.c | 234 ++-
 drivers/gpu/drm/i915/i915_perf_types.h   |   6 +
 drivers/gpu/drm/i915/i915_reg.h  |   4 +-
 7 files changed, 302 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index b6373fbc927d..e8ce44841868 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -160,6 +160,7 @@
 #define   MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */
 #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1)
 #define   MI_BATCH_RESOURCE_STREAMER (1<<10)
+#define   MI_BATCH_PREDICATE (1 << 15) /* HSW+ on RCS only*/
 
 /*
  * 3D instructions used by the kernel
@@ -238,6 +239,29 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH   (1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define MI_MATH(x) MI_INSTR(0x1a, (x)-1)
+#define   MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2))
+/* operands */
+#define   MI_ALU_OP_NOOP 0
+#define   MI_ALU_OP_LOAD 128
+#define   MI_ALU_OP_LOADINV  1152
+#define   MI_ALU_OP_LOAD0129
+#define   MI_ALU_OP_LOAD11153
+#define   MI_ALU_OP_ADD  256
+#define   MI_ALU_OP_SUB  257
+#define   MI_ALU_OP_AND  258
+#define   MI_ALU_OP_OR   259
+#define   MI_ALU_OP_XOR  260
+#define   MI_ALU_OP_STORE384
+#define   MI_ALU_OP_STOREINV 1408
+/* sources */
+#define   MI_ALU_SRC_REG(x)  (x) /* 0 -> 15 */
+#define   MI_ALU_SRC_SRCA32
+#define   MI_ALU_SRC_SRCB33
+#define   MI_ALU_SRC_ACCU49
+#define   MI_ALU_SRC_ZF  50
+#define   MI_ALU_SRC_CF  51
+
 /*
  * Commands used only by the command parser
  */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index dc295c196d11..f752b6cf9ea1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -97,6 +97,11 @@ enum intel_gt_scratch_field {
/* 8 bytes */
INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
 
+   /* 6 * 8 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048,
+
+   /* 4 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096,
 };
 
 #endif /* __INTEL_GT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 708855e051b5..cc17d5c2295f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3578,6 +3578,36 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
i915_wedged_get, i915_wedged_set,
"%llu\n");
 
+static int
+i915_perf_noa_delay_set(void *data, u64 val)
+{
+   struct drm_i915_private *i915 = data;
+
+   /* This would lead to infinite waits as we're doing timestamp
+* difference on the CS with only 32bits.
+*/
+   if (val > mul_u32_u32(U32_MAX, 
RUNTIME_INFO(i915)->cs_timestamp_frequency_khz))
+   return -EINVAL;
+
+   atomic64_set(>perf.noa_programming_delay, val);
+   return 0;
+}
+
+static int
+i915_perf_noa_delay_get(void *data, u64 *val)
+{
+   struct drm_i915_private *i915 = data;
+
+   *val = atomic64_read(>perf.noa_programming_delay);
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops,
+   i915_perf_noa_delay_get,
+   i915_perf_noa_delay_set,
+   "%llu\n");
+
+
 #define DROP_UNBOUND   BIT(0)
 #define DROP_BOUND BIT(1)
 #define DROP_RETIREBIT(2)
@@ -4354,6 +4384,7 @@ static const struct i915_debugfs_files {
const char *name;
const struct file_operations *fops;
 } i915_debugfs_files[] = {
+   {"i915_perf_noa_delay", _perf_noa_delay_fops},
{"i915_wedged", _wedged_fops},
{"i915_cache_sharing", _cache_sharing_fops},
{"i915_gem_drop_caches", _drop_caches_fops},
diff --git 

[Intel-gfx] [CI 11/13] drm/i915: add a new perf configuration execbuf parameter

2019-09-09 Thread Lionel Landwerlin
We want the ability to dispatch a set of command buffers to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ
execbuf3?) to notify what OA configuration should be used for a batch
buffer. This requires the process making the execbuf with this flag to
also own the perf fd at the time of execbuf.

v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris)
Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
Move OA config vma to active before writing the ringbuffer (Chris)

v4: Reuse i915_user_extension_fn
Serialize requests with OA config updates

v5: Check that the chained extension is only present once (Chris)
Unpin oa_vma in main path (Chris)

v6: Use BIT_ULL (Chris)

v7: Hold drm.struct_mutex when serializing the request with OA config (Chris)

v8: Remove active request from engine (Lionel)

v9: Move fetching OA configuration past engine pinning (Lionel)
Lock VMA before moving to active (Chris)

v10: Fix leak on perf_fd (Lionel)

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 147 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   4 +
 include/uapi/drm/i915_drm.h   |  39 +
 3 files changed, 188 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 46ad8d9642d1..d416b60c94bb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
@@ -284,7 +285,12 @@ struct i915_execbuffer {
struct {
u64 flags; /** Available extensions parameters */
struct drm_i915_gem_execbuffer_ext_timeline_fences 
timeline_fences;
+   struct drm_i915_gem_execbuffer_ext_perf perf_config;
} extensions;
+
+   struct file *perf_file;
+   struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is 
not needed. */
+   struct i915_vma *oa_vma;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, 
struct i915_vma *vma)
return err;
 }
 
+
+static int
+eb_get_oa_config(struct i915_execbuffer *eb)
+{
+   struct drm_i915_gem_object *oa_bo;
+   int err = 0;
+
+   eb->perf_file = NULL;
+   eb->oa_config = NULL;
+   eb->oa_vma = NULL;
+
+   if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) 
== 0)
+   return 0;
+
+   eb->perf_file = fget(eb->extensions.perf_config.perf_fd);
+   if (!eb->perf_file)
+   return -EINVAL;
+
+   err = i915_mutex_lock_interruptible(>i915->drm);
+   if (err)
+   return err;
+
+   if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream)
+   err = -EINVAL;
+
+   mutex_unlock(>i915->drm.struct_mutex);
+
+   if (err)
+   return err;
+
+   if (eb->i915->perf.exclusive_stream->engine != eb->engine)
+   return -EINVAL;
+
+   err = i915_perf_get_oa_config_and_bo(
+   eb->i915->perf.exclusive_stream,
+   eb->extensions.perf_config.oa_config,
+   >oa_config, _bo);
+   if (err)
+   return err;
+
+   eb->oa_vma = i915_vma_instance(oa_bo,
+  >engine->gt->ggtt->vm, NULL);
+   i915_gem_object_put(oa_bo);
+   if (IS_ERR(eb->oa_vma)) {
+   err = PTR_ERR(eb->oa_vma);
+   eb->oa_vma = NULL;
+   return err;
+   }
+
+   return 0;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 struct i915_vma *vma,
 unsigned int len)
@@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file 
*file)
spin_unlock(_priv->mm.lock);
 }
 
+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+   struct i915_perf_stream *perf_stream;
+   int err;
+
+   if (!eb->oa_config)
+   return 0;
+
+   perf_stream = eb->perf_file->private_data;
+
+   err = mutex_lock_interruptible(_stream->config_mutex);
+   if (err)
+   return err;
+
+   err = i915_active_request_set(_stream->active_config_rq,
+ eb->request);
+   if (err)
+   goto out;
+
+   /*
+* If the config hasn't changed, skip reconfiguring the HW (this is
+* subject to a delay we want to avoid has much as possible).
+*/
+   if (eb->oa_config == perf_stream->oa_config)
+   goto out;
+
+   i915_vma_lock(eb->oa_vma);
+   

[Intel-gfx] [CI 13/13] drm/i915: add support for perf configuration queries

2019-09-09 Thread Lionel Landwerlin
Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.
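
The "count first, then content" shape of such a query can be sketched in
plain userspace C as below. This is only a model under stated assumptions:
struct reg_pair and query_regs() are hypothetical names, and memcpy() stands
in for the user copy done by the kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct reg_pair {
    uint32_t addr;
    uint32_t value;
};

/*
 * Hypothetical model of a "size then data" query: with *user_n == 0 only
 * the count is reported; otherwise the buffer must hold every entry.
 */
static int query_regs(const struct reg_pair *kernel_regs, uint32_t kernel_n,
                      struct reg_pair *user_regs, uint32_t *user_n)
{
    if (*user_n == 0) {
        *user_n = kernel_n;
        return 0;
    }

    if (*user_n < kernel_n)
        return -EINVAL;

    memcpy(user_regs, kernel_regs, kernel_n * sizeof(*kernel_regs));
    *user_n = kernel_n;
    return 0;
}
```

A caller would typically issue the query once with a zero count to learn the
size, allocate, and then issue it again to fetch the content.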

v2: Fix sparse warnings (Lionel)
Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
Fix unlocking when not locked issue (Lionel)
Add debug messages (Lionel)

v4: Fix missing unlock (Dan)

v5: Drop lock when copying config content to userspace (Chris)

v6: Drop lock when copying config list to userspace (Chris)
Fix deadlock when calling i915_perf_get_oa_config() under
perf.metrics_lock (Lionel)
Add i915_oa_config_get() (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/i915_perf.c  |   3 +
 drivers/gpu/drm/i915/i915_query.c | 283 ++
 include/uapi/drm/i915_drm.h   |  65 ++-
 4 files changed, 354 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2c6f37219dff..eab42269fc5b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1368,6 +1368,12 @@ struct drm_i915_private {
 */
struct idr metrics_idr;
 
+   /*
+* Number of dynamic configurations, you need to hold
+* dev_priv->perf.metrics_lock to access it.
+*/
+   u32 n_metrics;
+
/*
 * Lock associated with anything below within this structure
 * except exclusive_stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index fd12318e7a90..40a02838b68c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3919,6 +3919,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, 
void *data,
goto sysfs_err;
}
 
+   dev_priv->perf.n_metrics++;
+
mutex_unlock(_priv->perf.metrics_lock);
 
DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
@@ -3979,6 +3981,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, 
void *data,
   _config->sysfs_metric);
 
idr_remove(_priv->perf.metrics_idr, *arg);
+   dev_priv->perf.n_metrics--;
 
mutex_unlock(_priv->perf.metrics_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index abac5042da2b..89b2821be4a0 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -7,6 +7,7 @@
 #include 
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 #include "i915_query.h"
 #include 
 
@@ -140,10 +141,292 @@ query_engine_info(struct drm_i915_private *i915,
return len;
 }
 
+static int can_copy_perf_config_registers_or_number(u32 user_n_regs,
+   u64 user_regs_ptr,
+   u32 kernel_n_regs)
+{
+   /*
+* We'll just put the number of registers, and won't copy the
+* register.
+*/
+   if (user_n_regs == 0)
+   return 0;
+
+   if (user_n_regs < kernel_n_regs)
+   return -EINVAL;
+
+   if (!access_ok(u64_to_user_ptr(user_regs_ptr),
+  2 * sizeof(u32) * kernel_n_regs))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int copy_perf_config_registers_or_number(const struct i915_oa_reg 
*kernel_regs,
+   u32 kernel_n_regs,
+   u64 user_regs_ptr,
+   u32 *user_n_regs)
+{
+   u32 r;
+
+   if (*user_n_regs == 0) {
+   *user_n_regs = kernel_n_regs;
+   return 0;
+   }
+
+   *user_n_regs = kernel_n_regs;
+
+   for (r = 0; r < kernel_n_regs; r++) {
+   u32 __user *user_reg_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2);
+   u32 __user *user_val_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 +
+   sizeof(u32));
+   int ret;
+
+   ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr),
+user_reg_ptr);
+   if (ret)
+   return -EFAULT;
+
+   ret = __put_user(kernel_regs[r].value, user_val_ptr);
+   if (ret)
+   return -EFAULT;
+   }
+
+   return 0;
+}
+
+static int query_perf_config_data(struct drm_i915_private *i915,
+ struct drm_i915_query_item *query_item,
+ bool use_uuid)

Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting

2019-09-09 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-09-09 10:19:08)
> 
> On 09/09/2019 08:12, Chris Wilson wrote:
> > And give up if we never even make it to the start.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
> > Signed-off-by: Chris Wilson 
> > Cc: Tvrtko Ursulin 
> > ---
> >   tests/perf_pmu.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> > index d392a67d4..8a06e5d44 100644
> > --- a/tests/perf_pmu.c
> > +++ b/tests/perf_pmu.c
> > @@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t 
> > *spin)
> >   while (!igt_spin_has_started(spin)) {
> >   unsigned long t = igt_nsec_elapsed();
> >   
> > + igt_assert(gem_bo_busy(fd, spin->handle));
> >   if ((t - timeout) > 250e6) {
> >   timeout = t;
> >   igt_warn("Spinner not running after %.2fms\n",
> > >(double)t / 1e6);
> > > + igt_assert(t < 2e9);
> >   }
> >   }
> >   } else {
> > @@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t 
> > *spin)
> >   usleep(500e3); /* Better than nothing! */
> >   }
> >   
> > + igt_assert(gem_bo_busy(fd, spin->handle));
> >   return igt_nsec_elapsed();
> >   }
> >   
> > 
> 
> The 2s timeout for batch to start executing sounds okay.
> 
> I'd pull up and consolidate the bo_busy checks into one at the top of 
> the function, since it is only telling us batch has been submitted. Or 
> you are thinking the second check brings value in checking batch is 
> still executing, hasn't failed or something?

The thinking is to catch if we terminate the batch via hangcheck before
writing the dword. I think there's value in knowing if we are slow vs
dead.
-Chris

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to 
extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66418/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915: introduce a mechanism to extend execbuf2
Okay!

Commit: drm/i915: add syncobj timeline support
Okay!

Commit: drm/i915/perf: drop list of streams
+drivers/gpu/drm/i915/i915_perf.c:1436:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1492:15: warning: memset with byte count of 
16777216
-O:drivers/gpu/drm/i915/i915_perf.c:1436:15: warning: memset with byte count of 
16777216
-O:drivers/gpu/drm/i915/i915_perf.c:1495:15: warning: memset with byte count of 
16777216

Commit: drm/i915/perf: store the associated engine of a stream
Okay!

Commit: drm/i915/perf: introduce a versioning of the i915-perf uapi
Okay!

Commit: drm/i915/perf: move perf types to their own header
Okay!

Commit: drm/i915/perf: allow for CS OA configs to be created lazily
Okay!

Commit: drm/i915/perf: implement active wait for noa configurations
Okay!

Commit: drm/i915: add wait flags to i915_active_request_retire
Okay!

Commit: drm/i915/perf: execute OA configuration from command stream
Okay!

Commit: drm/i915: add a new perf configuration execbuf parameter
Okay!

Commit: drm/i915/perf: allow holding preemption on filtered ctx
Okay!

Commit: drm/i915: add support for perf configuration queries
Okay!


[Intel-gfx] [PATCH 1/6] drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel

2019-09-09 Thread Chris Wilson
Being a "low-level" test, we opt to bypass the normal bind/unbind hooks
for the lower level insert_entries/clear_range. For ggtt, the
bind/unbind hooks provide the runtime wakeref and so we must also handle
this in exercising the low level hooks.
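
For readers unfamiliar with the scoped "with" idiom used in the diff below,
here is a minimal userspace sketch of how such a macro can be built on a for
loop. The names with_pm(), pm_get() and pm_put() are hypothetical stand-ins
for illustration, not the driver's helpers.

```c
/* Hypothetical stand-ins for acquiring and releasing a wakeref. */
static int wakerefs_held;

static int pm_get(void) { return ++wakerefs_held; }
static void pm_put(void) { --wakerefs_held; }

/*
 * Run the following statement exactly once with a reference held.
 * The loop's increment clause releases the reference and clears the
 * cookie, which ends the loop. A break or return inside the body would
 * skip the release, so the body must fall through.
 */
#define with_pm(cookie) \
    for ((cookie) = pm_get(); (cookie); pm_put(), (cookie) = 0)

/* Returns the number of references observed while inside the block. */
static int held_inside_block(void)
{
    int cookie, seen = 0;

    with_pm(cookie)
        seen = wakerefs_held;

    return seen;
}
```

The appeal of the idiom is that the get and put cannot drift apart when the
guarded statement is later edited.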

<4> [538.151672] RPM raw-wakeref not held
<4> [538.151825] WARNING: CPU: 0 PID: 11 at 
./drivers/gpu/drm/i915/intel_runtime_pm.h:107 fwtable_read32+0x1be/0x300 [i915]
<4> [538.151830] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem 
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp btusb 
btrtl btbcm x86_pkg_temp_thermal coretemp btintel crct10dif_pclmul bluetooth 
crc32_pclmul snd_intel_nhlt snd_hda_codec ecdh_generic ghash_clmulni_intel ecc 
snd_hwdep snd_hda_core lpc_ich r8169 realtek snd_pcm mei_me mei prime_numbers 
pinctrl_broxton pinctrl_intel [last unloaded: i915]
<4> [538.151861] CPU: 0 PID: 11 Comm: migration/0 Tainted: G U
5.3.0-rc7-CI-Trybot_4938+ #1
<4> [538.151864] Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS 
AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018
<4> [538.151960] RIP: 0010:fwtable_read32+0x1be/0x300 [i915]
<4> [538.151965] Code: e8 e7 f9 5f e0 e9 0b ff ff ff 80 3d d5 8d 26 00 00 0f 85 
81 fe ff ff 48 c7 c7 ef 01 bd a0 c6 05 c1 8d 26 00 01 e8 b2 e4 6a e0 <0f> 0b e9 
67 fe ff ff 80 3d ad 8d 26 00 00 0f 85 65 fe ff ff 48 c7
<4> [538.151969] RSP: 0018:c907be10 EFLAGS: 00010086
<4> [538.151972] RAX:  RBX: 88826be10d50 RCX: 
0002
<4> [538.151975] RDX: 8002 RSI:  RDI: 

<4> [538.151978] RBP:  R08:  R09: 

<4> [538.151981] R10:  R11: c907bcb0 R12: 
00101008
<4> [538.151984] R13:  R14: c936f638 R15: 
0002
<4> [538.151987] FS:  () GS:888277a0() 
knlGS:
<4> [538.151990] CS:  0010 DS:  ES:  CR0: 80050033
<4> [538.151993] CR2: 7fd48e7052f8 CR3: 0521 CR4: 
003406f0
<4> [538.151995] Call Trace:
<4> [538.152106]  bxt_vtd_ggtt_clear_range__cb+0x38/0x40 [i915]

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 31a51ca1ddcb..598c18d10640 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -293,18 +293,20 @@ static int lowlevel_hole(struct drm_i915_private *i915,
mock_vma.node.size = BIT_ULL(size);
mock_vma.node.start = addr;
 
-   wakeref = intel_runtime_pm_get(>runtime_pm);
-   vm->insert_entries(vm, _vma, I915_CACHE_NONE, 0);
-   intel_runtime_pm_put(>runtime_pm, wakeref);
+   with_intel_runtime_pm(>runtime_pm, wakeref)
+   vm->insert_entries(vm, _vma,
+  I915_CACHE_NONE, 0);
}
count = n;
 
i915_random_reorder(order, count, );
for (n = 0; n < count; n++) {
u64 addr = hole_start + order[n] * BIT_ULL(size);
+   intel_wakeref_t wakeref;
 
GEM_BUG_ON(addr + BIT_ULL(size) > vm->total);
-   vm->clear_range(vm, addr, BIT_ULL(size));
+   with_intel_runtime_pm(>runtime_pm, wakeref)
+   vm->clear_range(vm, addr, BIT_ULL(size));
}
 
i915_gem_object_unpin_pages(obj);
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] Enable iommu on gfx by default

2019-09-09 Thread Chris Wilson
Other than Broadwell being fubar (and Ironlake + g4x being special in
their own way), there appears to be little fallout from enabling iommu.
(The biggest open question is over performance: TLB misses are much more
expensive and that impacts media/CL/GL throughput.) Enabling iommu/dmar
makes our CI much more powerful: instead of a random GPU write causing
memory corruption which may or may not impact the system, we get a DMAR
fault. So once and for all we will be able to ascertain whether those
sporadic memory corruptions are truly our fault.
-Chris



[Intel-gfx] [PATCH 2/6] drm/i915/selftests: Tighten the timeout testing for partial mmaps

2019-09-09 Thread Chris Wilson
Currently, if there is time remaining before the start of the loop, we
do one full iteration over many possible different chunks within the
object. A full loop may take 50+s (depending on the speed of indirect GTT
mmappings) and we try separately with LINEAR, X and Y -- at which point
igt times out. If we check more frequently, we will interrupt the loop
upon our timeout -- it is hard to argue that this significantly reduces
the test coverage despite the dramatic contraction in runtime. In
practical terms, the coverage we should prioritise is using different
fence setups, forcing verification of the tile row computations, over the
current preference for checking extracted chunks. Though the exhaustive
search is great given an infinite timeout, to improve our current
coverage we also add a randomised smoketest of partial mmaps.
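
The idea of testing the deadline before every check, rather than once per full sweep, can be sketched in userspace as follows. This is only an illustration under stated assumptions: fake_clock, xorshift32() and check_one_page() are hypothetical stand-ins for jiffies, the kernel's rnd_state helpers and the real per-page check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick source; the selftest itself would use jiffies. */
struct fake_clock {
	uint64_t now;
};

/* Small xorshift PRNG standing in for the kernel's rnd_state helpers. */
static uint32_t xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return *state = x;
}

/* Pretend to check one page; each check costs a fixed number of ticks. */
static int check_one_page(struct fake_clock *clk, unsigned long page,
			  uint64_t cost)
{
	(void)page;
	clk->now += cost;
	return 0;
}

/*
 * Check randomly chosen pages until the deadline passes. The deadline is
 * tested before every check, so the overrun is at most one check.
 * Returns the number of pages checked, or a negative error.
 */
static long smoketest(struct fake_clock *clk, unsigned long npages,
		      uint64_t deadline, uint64_t cost, uint32_t seed)
{
	long count = 0;

	assert(npages > 0 && seed != 0);
	while (clk->now < deadline) {
		unsigned long page = xorshift32(&seed) % npages;
		int err = check_one_page(clk, page, cost);

		if (err)
			return err;
		count++;
	}
	return count;
}
```

With a cost of 7 ticks per check and a deadline of 100 ticks, the loop stops after 15 checks at tick 105, i.e. it overruns by less than one check.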

Signed-off-by: Chris Wilson 
---
 .../drm/i915/gem/selftests/i915_gem_mman.c| 253 +++---
 1 file changed, 222 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 1d27babff0ce..685726c85991 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -10,6 +10,7 @@
 #include "gt/intel_gt_pm.h"
 #include "huge_gem_object.h"
 #include "i915_selftest.h"
+#include "selftests/i915_random.h"
 #include "selftests/igt_flush_test.h"
 
 struct tile {
@@ -75,6 +76,96 @@ static u64 tiled_offset(const struct tile *tile, u64 v)
 }
 
 static int check_partial_mapping(struct drm_i915_gem_object *obj,
+const struct tile *tile,
+struct rnd_state *prng)
+{
+   const unsigned long npages = obj->base.size / PAGE_SIZE;
+   struct i915_ggtt_view view;
+   struct i915_vma *vma;
+   unsigned long page;
+   u32 __iomem *io;
+   struct page *p;
+   unsigned int n;
+   u64 offset;
+   u32 *cpu;
+   int err;
+
+   err = i915_gem_object_set_tiling(obj, tile->tiling, tile->stride);
+   if (err) {
+   pr_err("Failed to set tiling mode=%u, stride=%u, err=%d\n",
+  tile->tiling, tile->stride, err);
+   return err;
+   }
+
+   GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling);
+   GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride);
+
+   i915_gem_object_lock(obj);
+   err = i915_gem_object_set_to_gtt_domain(obj, true);
+   i915_gem_object_unlock(obj);
+   if (err) {
+   pr_err("Failed to flush to GTT write domain; err=%d\n", err);
+   return err;
+   }
+
+   page = i915_prandom_u32_max_state(npages, prng);
+   view = compute_partial_view(obj, page, MIN_CHUNK_PAGES);
+
+   vma = i915_gem_object_ggtt_pin(obj, , 0, 0, PIN_MAPPABLE);
+   if (IS_ERR(vma)) {
+   pr_err("Failed to pin partial view: offset=%lu; err=%d\n",
+  page, (int)PTR_ERR(vma));
+   return PTR_ERR(vma);
+   }
+
+   n = page - view.partial.offset;
+   GEM_BUG_ON(n >= view.partial.size);
+
+   io = i915_vma_pin_iomap(vma);
+   i915_vma_unpin(vma);
+   if (IS_ERR(io)) {
+   pr_err("Failed to iomap partial view: offset=%lu; err=%d\n",
+  page, (int)PTR_ERR(io));
+   err = PTR_ERR(io);
+   goto out;
+   }
+
+   iowrite32(page, io + n * PAGE_SIZE / sizeof(*io));
+   i915_vma_unpin_iomap(vma);
+
+   offset = tiled_offset(tile, page << PAGE_SHIFT);
+   if (offset >= obj->base.size)
+   goto out;
+
+   intel_gt_flush_ggtt_writes(_i915(obj->base.dev)->gt);
+
+   p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
+   cpu = kmap(p) + offset_in_page(offset);
+   drm_clflush_virt_range(cpu, sizeof(*cpu));
+   if (*cpu != (u32)page) {
+   pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, 
row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to 
page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n",
+  page, n,
+  view.partial.offset,
+  view.partial.size,
+  vma->size >> PAGE_SHIFT,
+  tile->tiling ? tile_row_pages(obj) : 0,
+  vma->fence ? vma->fence->id : -1, tile->tiling, 
tile->stride,
+  offset >> PAGE_SHIFT,
+  (unsigned int)offset_in_page(offset),
+  offset,
+  (u32)page, *cpu);
+   err = -EINVAL;
+   }
+   *cpu = 0;
+   drm_clflush_virt_range(cpu, sizeof(*cpu));
+   kunmap(p);
+
+out:
+   i915_vma_destroy(vma);
+   return err;
+}
+
+static int check_partial_mappings(struct drm_i915_gem_object *obj,
 const struct tile *tile,
 unsigned 

[Intel-gfx] [PATCH 4/6] drm/i915: Force compilation with intel-iommu for CI validation

2019-09-09 Thread Chris Wilson
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig.debug | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 00786a142ff0..ebcb6dbc2393 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -20,6 +20,8 @@ config DRM_I915_WERROR
 config DRM_I915_DEBUG
 bool "Enable additional driver debugging"
 depends on DRM_I915
+   select PCI_MSI
+   select INTEL_IOMMU
 select DEBUG_FS
 select PREEMPT_COUNT
 select REFCOUNT_FULL
-- 
2.23.0


[Intel-gfx] [PATCH 5/6] iommu/intel: Declare Broadwell igfx dmar support snafu

2019-09-09 Thread Chris Wilson
Despite the widespread and complete failure of Broadwell integrated
graphics when DMAR is enabled, known over the years, we have never been
able to root cause the issue. Instead, we let the failure undermine our
confidence in the iommu system itself when we should be pushing for it to
be always enabled. Quirk away Broadwell and remove the rotten apple.
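
As an illustration of what the quirk list in the diff below amounts to, here is a userspace sketch of a device-ID lookup. The function names are hypothetical; the G4x IDs are copied from the patch, and the Broadwell pattern is merely an observation about the 24 IDs the patch lists (0x16X2, 0x16X6, 0x16XA, 0x16XB, 0x16XD, 0x16XE for X = 0..3), not a statement about how the kernel matches them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* G4x/GM45 device IDs, copied from the quirk list in the patch. */
static const uint16_t g4x_ids[] = {
	0x2a40, 0x2e00, 0x2e10, 0x2e20, 0x2e30, 0x2e40, 0x2e90,
};

/* Matches the Broadwell IDs listed in the patch by their common pattern. */
static bool is_bdw_igfx(uint16_t id)
{
	static const uint8_t low_nibbles[] = { 0x2, 0x6, 0xa, 0xb, 0xd, 0xe };
	size_t i;

	/* 0x1600..0x163f only */
	if ((id & 0xffc0) != 0x1600)
		return false;

	for (i = 0; i < sizeof(low_nibbles) / sizeof(low_nibbles[0]); i++)
		if ((id & 0xf) == low_nibbles[i])
			return true;

	return false;
}

/* True if graphics DMA remapping would be disabled for this device ID. */
static bool igfx_dmar_quirked(uint16_t id)
{
	size_t i;

	for (i = 0; i < sizeof(g4x_ids) / sizeof(g4x_ids[0]); i++)
		if (g4x_ids[i] == id)
			return true;

	return is_bdw_igfx(id);
}
```

The explicit per-ID declarations in the patch are easier to audit and grep for; the pattern match here only serves to keep the sketch short.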

References: https://bugs.freedesktop.org/show_bug.cgi?id=89360
Signed-off-by: Chris Wilson 
Cc: Lu Baolu 
Cc: Martin Peres 
Cc: Joerg Roedel 
---
 drivers/iommu/intel-iommu.c | 44 +
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c4e0e4a9ee9e..34f6a3d93ae2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5690,20 +5690,46 @@ const struct iommu_ops intel_iommu_ops = {
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
 
-static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
+static void quirk_iommu_igfx(struct pci_dev *dev)
 {
-   /* G4x/GM45 integrated gfx dmar support is totally busted. */
pci_info(dev, "Disabling IOMMU for graphics on this chipset\n");
dmar_map_gfx = 0;
 }
 
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e10, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e20, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e30, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_g4x_gfx);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_g4x_gfx);
+/* G4x/GM45 integrated gfx dmar support is totally busted. */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e10, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e20, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e30, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_igfx);
+
+/* Broadwell igfx malfunctions with dmar */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1606, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160B, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160E, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1602, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160A, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160D, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1616, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161B, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161E, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1612, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161A, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161D, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1626, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162B, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162E, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1622, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162A, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162D, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1636, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163B, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163E, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1632, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163A, quirk_iommu_igfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163D, quirk_iommu_igfx);
 
 static void quirk_iommu_rwbf(struct pci_dev *dev)
 {
-- 
2.23.0


[Intel-gfx] [PATCH 3/6] drm/i915: Perform GGTT restore much earlier during resume

2019-09-09 Thread Chris Wilson
As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTE before then, they may read and even *write* random locations
in memory.

Detected by DMAR faults during resume.

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Martin Peres 
Cc: Joonas Lahtinen 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c| 3 ---
 drivers/gpu/drm/i915/i915_drv.c   | 5 +
 drivers/gpu/drm/i915/selftests/i915_gem.c | 6 ++
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index b3993d24b83d..9b1129aaacfe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -242,9 +242,6 @@ void i915_gem_resume(struct drm_i915_private *i915)
mutex_lock(>drm.struct_mutex);
intel_uncore_forcewake_get(>uncore, FORCEWAKE_ALL);
 
-   i915_gem_restore_gtt_mappings(i915);
-   i915_gem_restore_fences(i915);
-
if (i915_gem_init_hw(i915))
goto err_wedged;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7b2c81a8bbaa..1af4eba968c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1877,6 +1877,11 @@ static int i915_drm_resume(struct drm_device *dev)
if (ret)
DRM_ERROR("failed to re-enable GGTT\n");
 
+   mutex_lock(_priv->drm.struct_mutex);
+   i915_gem_restore_gtt_mappings(dev_priv);
+   i915_gem_restore_fences(dev_priv);
+   mutex_unlock(_priv->drm.struct_mutex);
+
intel_csr_ucode_resume(dev_priv);
 
i915_restore_state(dev_priv);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c 
b/drivers/gpu/drm/i915/selftests/i915_gem.c
index bb6dd54a6ff3..37593831b539 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -118,6 +118,12 @@ static void pm_resume(struct drm_i915_private *i915)
with_intel_runtime_pm(>runtime_pm, wakeref) {
intel_gt_sanitize(>gt, false);
i915_gem_sanitize(i915);
+
+   mutex_lock(>drm.struct_mutex);
+   i915_gem_restore_gtt_mappings(i915);
+   i915_gem_restore_fences(i915);
+   mutex_unlock(>drm.struct_mutex);
+
i915_gem_resume(i915);
}
 }
-- 
2.23.0


[Intel-gfx] [PATCH 6/6] iommu/intel: Ignore igfx_off

2019-09-09 Thread Chris Wilson
---
 drivers/iommu/intel-iommu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 34f6a3d93ae2..c98cdfd91691 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -439,8 +439,6 @@ static int __init intel_iommu_setup(char *str)
no_platform_optin = 1;
pr_info("IOMMU disabled\n");
} else if (!strncmp(str, "igfx_off", 8)) {
-   dmar_map_gfx = 0;
-   pr_info("Disable GFX device mapping\n");
} else if (!strncmp(str, "forcedac", 8)) {
pr_info("Forcing DAC for PCI devices\n");
dmar_forcedac = 1;
-- 
2.23.0


[Intel-gfx] [PATCH] drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI

2019-09-09 Thread Chris Wilson
We want to prevent our presentation worker from being blocked by normal
workloads so that we can maintain interactive frame updates.

Signed-off-by: Chris Wilson 
Cc: Ville Syrjälä 
Cc: Heinrich Fink 
---
 drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 4ee750fa3ef0..cb55ab834a07 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -16148,7 +16148,8 @@ int intel_modeset_init(struct drm_device *dev)
struct intel_crtc *crtc;
int ret;
 
-   dev_priv->modeset_wq = alloc_ordered_workqueue("i915_modeset", 0);
+   dev_priv->modeset_wq =
+   alloc_ordered_workqueue("i915_modeset", WQ_HIGHPRI);
 
drm_mode_config_init(dev);
 
-- 
2.23.0


[Intel-gfx] [v2][PATCH 2/3] drm/i915/display: Extract i965_read_luts()

2019-09-09 Thread Swati Sharma
For i965, add hw read out to create hw blob of gamma
lut values.

Review comments from old series:
https://patchwork.freedesktop.org/series/58039/

v4:  -No need to initialize *blob [Jani]
 -Removed right shifts [Jani]
 -Dropped dev local var [Jani]
v5:  -Returned blob instead of assigning it internally
  within the function [Ville]
 -Renamed i965_get_color_config() to i965_read_lut() [Ville]
 -Renamed i965_get_gamma_config_10p6() to i965_read_gamma_lut_10p6()
  [Ville]
v9:  -Typo and 80 character limit [Uma]
 -Made read func para as const [Ville, Uma]
 -Renamed i965_read_gamma_lut_10p6() to i965_read_lut_10p6() [Ville, Uma]
v10: -Swapped ldw and udw while creating hw blob [Jani]
 -Added last index rgb lut value from PIPEGCMAX to h/w blob [Jani]
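
To make the readout in the diff below easier to follow, here is a userspace sketch of how one LUT entry is assembled from a pair of palette registers: each register carries one byte per channel, and the odd-indexed register supplies the upper byte. The struct and function names are hypothetical; the bit positions follow the PALETTE_*_MASK definitions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wide channel fields, following the PALETTE_*_MASK layout. */
#define RED_SHIFT	16
#define GREEN_SHIFT	8
#define BLUE_SHIFT	0

struct lut_entry {
	uint16_t red, green, blue;
};

/* Join one channel: upper byte from the odd register, lower from the even. */
static uint16_t join_channel(uint32_t even, uint32_t odd, unsigned int shift)
{
	uint32_t lo = (even >> shift) & 0xff;
	uint32_t hi = (odd >> shift) & 0xff;

	return (uint16_t)(hi << 8 | lo);
}

/* even = value of PALETTE(pipe, 2 * i + 0), odd = PALETTE(pipe, 2 * i + 1) */
static struct lut_entry read_entry_10p6(uint32_t even, uint32_t odd)
{
	struct lut_entry e;

	e.red = join_channel(even, odd, RED_SHIFT);
	e.green = join_channel(even, odd, GREEN_SHIFT);
	e.blue = join_channel(even, odd, BLUE_SHIFT);
	return e;
}
```

For example, an even register of 0x00123456 and an odd register of 0x00abcdef yield red 0xab12, green 0xcd34 and blue 0xef56.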

Signed-off-by: Swati Sharma 
---
 drivers/gpu/drm/i915/display/intel_color.c | 50 ++
 drivers/gpu/drm/i915/i915_reg.h|  4 +++
 2 files changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index 4d9a568..765f858 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1570,6 +1570,55 @@ static void i9xx_read_luts(struct intel_crtc_state 
*crtc_state)
 }
 
 static struct drm_property_blob *
+i965_read_lut_10p6(const struct intel_crtc_state *crtc_state)
+{
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   u32 lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+   enum pipe pipe = crtc->pipe;
+   struct drm_property_blob *blob;
+   struct drm_color_lut *blob_data;
+   u32 i, val1, val2;
+
+   blob = drm_property_create_blob(_priv->drm,
+   sizeof(struct drm_color_lut) * lut_size,
+   NULL);
+   if (IS_ERR(blob))
+   return NULL;
+
+   blob_data = blob->data;
+
+   for (i = 0; i < lut_size - 1; i++) {
+   val1 = I915_READ(PALETTE(pipe, 2 * i + 0));
+   val2 = I915_READ(PALETTE(pipe, 2 * i + 1));
+
+   blob_data[i].red = REG_FIELD_GET(PALETTE_RED_MASK, val2) << 8 |
+
REG_FIELD_GET(PALETTE_RED_MASK, val1);
+   blob_data[i].green = REG_FIELD_GET(PALETTE_GREEN_MASK, val2) << 
8 |
+  
REG_FIELD_GET(PALETTE_GREEN_MASK, val1);
+   blob_data[i].blue = REG_FIELD_GET(PALETTE_BLUE_MASK, val2) << 8 
|
+ 
REG_FIELD_GET(PALETTE_BLUE_MASK, val1);
+   }
+
+   blob_data[i].red = REG_FIELD_GET(PIPEGCMAX_RGB_MASK,
+I915_READ(PIPEGCMAX(pipe, 0)));
+   blob_data[i].green = REG_FIELD_GET(PIPEGCMAX_RGB_MASK,
+  I915_READ(PIPEGCMAX(pipe, 1)));
+   blob_data[i].blue = REG_FIELD_GET(PIPEGCMAX_RGB_MASK,
+ I915_READ(PIPEGCMAX(pipe, 2)));
+
+   return blob;
+}
+
+static void i965_read_luts(struct intel_crtc_state *crtc_state)
+{
+   if (crtc_state->gamma_mode == GAMMA_MODE_MODE_8BIT)
+   crtc_state->base.gamma_lut = i9xx_read_lut_8(crtc_state);
+   else
+   crtc_state->base.gamma_lut = i965_read_lut_10p6(crtc_state);
+}
+
+static struct drm_property_blob *
 ilk_read_lut_10(const struct intel_crtc_state *crtc_state)
 {
struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
@@ -1672,6 +1721,7 @@ void intel_color_init(struct intel_crtc *crtc)
dev_priv->display.color_check = i9xx_color_check;
dev_priv->display.color_commit = i9xx_color_commit;
dev_priv->display.load_luts = i965_load_luts;
+   dev_priv->display.read_luts = i965_read_luts;
} else {
dev_priv->display.color_check = i9xx_color_check;
dev_priv->display.color_commit = i9xx_color_commit;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 45ed96d..5ac8a4d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3558,6 +3558,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define _PALETTE_A 0xa000
 #define _PALETTE_B 0xa800
 #define _CHV_PALETTE_C 0xc000
+#define PALETTE_RED_MASKREG_GENMASK(23, 16)
+#define PALETTE_GREEN_MASK  REG_GENMASK(15, 8)
+#define PALETTE_BLUE_MASK   REG_GENMASK(7, 0)
 #define PALETTE(pipe, i)   _MMIO(DISPLAY_MMIO_BASE(dev_priv) + \
  _PICK((pipe), _PALETTE_A, \
_PALETTE_B, _CHV_PALETTE_C) + \
@@ -5760,6 +5763,7 @@ enum {
 
 #define  _PIPEAGCMAX  

[Intel-gfx] [v2][PATCH 1/3] drm/i915/display: Add gamma precision function for CHV

2019-09-09 Thread Swati Sharma
intel_color_get_gamma_bit_precision() is extended for
cherryview by adding chv_gamma_precision(); i965 will continue to use
the existing i9xx_gamma_precision() function.

Signed-off-by: Swati Sharma 
Reviewed-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_color.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index 6d641e1..4d9a568 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1400,6 +1400,14 @@ static int ilk_gamma_precision(const struct 
intel_crtc_state *crtc_state)
}
 }
 
+static int chv_gamma_precision(const struct intel_crtc_state *crtc_state)
+{
+   if (crtc_state->cgm_mode & CGM_PIPE_MODE_GAMMA)
+   return 10;
+   else
+   return i9xx_gamma_precision(crtc_state);
+}
+
 static int glk_gamma_precision(const struct intel_crtc_state *crtc_state)
 {
switch (crtc_state->gamma_mode) {
@@ -1421,12 +1429,17 @@ int intel_color_get_gamma_bit_precision(const struct 
intel_crtc_state *crtc_stat
if (!crtc_state->gamma_enable)
return 0;
 
-   if (HAS_GMCH(dev_priv) && !IS_CHERRYVIEW(dev_priv))
-   return i9xx_gamma_precision(crtc_state);
-   else if (IS_CANNONLAKE(dev_priv) || IS_GEMINILAKE(dev_priv))
-   return glk_gamma_precision(crtc_state);
-   else if (IS_IRONLAKE(dev_priv))
-   return ilk_gamma_precision(crtc_state);
+   if (HAS_GMCH(dev_priv)) {
+   if (IS_CHERRYVIEW(dev_priv))
+   return chv_gamma_precision(crtc_state);
+   else
+   return i9xx_gamma_precision(crtc_state);
+   } else {
+   if (IS_CANNONLAKE(dev_priv) || IS_GEMINILAKE(dev_priv))
+   return glk_gamma_precision(crtc_state);
+   else if (IS_IRONLAKE(dev_priv))
+   return ilk_gamma_precision(crtc_state);
+   }
 
return 0;
 }
-- 
1.9.1


[Intel-gfx] [v2][PATCH 3/3] drm/i915/display: Extract chv_read_luts()

2019-09-09 Thread Swati Sharma
For cherryview, add hw read out to create hw blob of gamma
lut values.

Review comments from previous series:
https://patchwork.freedesktop.org/patch/328252

v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
 function [Ville]
-Renamed function cherryview_get_color_config() to chv_read_luts()
-Renamed cherryview_get_gamma_config() to chv_read_cgm_gamma_lut()
 [Ville]
v9: -80 character limit [Uma]
-Made read func para as const [Ville, Uma]
-Renamed chv_read_cgm_gamma_lut() to chv_read_cgm_lut()
 [Ville, Uma]
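
The readout in the diff below extracts 10-bit fields and widens them to the 16-bit channels of struct drm_color_lut. A userspace sketch of that step follows, assuming that intel_color_lut_pack() left-aligns an N-bit value within 16 bits; lut_pack() and unpack_cgm_entry() are hypothetical names, and the field positions follow the CGM_PIPE_GAMMA_*_MASK definitions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

struct lut_entry {
	uint16_t red, green, blue;
};

/* Clamp val to 'bits' bits, then left-align it within 16 bits. */
static uint16_t lut_pack(uint32_t val, unsigned int bits)
{
	uint32_t max;

	assert(bits >= 1 && bits <= 16);
	max = 0xffffu >> (16 - bits);
	if (val > max)
		val = max;
	return (uint16_t)(val << (16 - bits));
}

/*
 * ldw carries green in bits 25:16 and blue in bits 9:0;
 * udw carries red in bits 9:0.
 */
static struct lut_entry unpack_cgm_entry(uint32_t ldw, uint32_t udw)
{
	struct lut_entry e;

	e.green = lut_pack((ldw >> 16) & 0x3ff, 10);
	e.blue = lut_pack(ldw & 0x3ff, 10);
	e.red = lut_pack(udw & 0x3ff, 10);
	return e;
}
```

Left-aligning keeps the most significant bits meaningful, so a full-scale 10-bit value of 0x3ff becomes 0xffc0 rather than 0x03ff.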

Signed-off-by: Swati Sharma 
Reviewed-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_color.c | 43 ++
 drivers/gpu/drm/i915/i915_reg.h|  3 +++
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index 765f858..318308d 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1619,6 +1619,48 @@ static void i965_read_luts(struct intel_crtc_state 
*crtc_state)
 }
 
 static struct drm_property_blob *
+chv_read_cgm_lut(const struct intel_crtc_state *crtc_state)
+{
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   u32 lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+   enum pipe pipe = crtc->pipe;
+   struct drm_property_blob *blob;
+   struct drm_color_lut *blob_data;
+   u32 i, val;
+
+   blob = drm_property_create_blob(_priv->drm,
+   sizeof(struct drm_color_lut) * lut_size,
+   NULL);
+   if (IS_ERR(blob))
+   return NULL;
+
+   blob_data = blob->data;
+
+   for (i = 0; i < lut_size; i++) {
+   val = I915_READ(CGM_PIPE_GAMMA(pipe, i, 0));
+   blob_data[i].green = intel_color_lut_pack(REG_FIELD_GET(
+ 
CGM_PIPE_GAMMA_GREEN_MASK, val), 10);
+   blob_data[i].blue = intel_color_lut_pack(REG_FIELD_GET(
+
CGM_PIPE_GAMMA_BLUE_MASK, val), 10);
+
+   val = I915_READ(CGM_PIPE_GAMMA(pipe, i, 1));
+   blob_data[i].red = intel_color_lut_pack(REG_FIELD_GET(
+   
CGM_PIPE_GAMMA_RED_MASK, val), 10);
+   }
+
+   return blob;
+}
+
+static void chv_read_luts(struct intel_crtc_state *crtc_state)
+{
+   if (crtc_state->gamma_mode == GAMMA_MODE_MODE_8BIT)
+   crtc_state->base.gamma_lut = i9xx_read_lut_8(crtc_state);
+   else
+   crtc_state->base.gamma_lut = chv_read_cgm_lut(crtc_state);
+}
+
+static struct drm_property_blob *
 ilk_read_lut_10(const struct intel_crtc_state *crtc_state)
 {
struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
@@ -1717,6 +1759,7 @@ void intel_color_init(struct intel_crtc *crtc)
dev_priv->display.color_check = chv_color_check;
dev_priv->display.color_commit = i9xx_color_commit;
dev_priv->display.load_luts = chv_load_luts;
+   dev_priv->display.read_luts = chv_read_luts;
} else if (INTEL_GEN(dev_priv) >= 4) {
dev_priv->display.color_check = i9xx_color_check;
dev_priv->display.color_commit = i9xx_color_commit;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5ac8a4d..0241c9d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -10410,6 +10410,9 @@ enum skl_power_gate {
 #define   CGM_PIPE_MODE_GAMMA  (1 << 2)
 #define   CGM_PIPE_MODE_CSC(1 << 1)
 #define   CGM_PIPE_MODE_DEGAMMA(1 << 0)
+#define   CGM_PIPE_GAMMA_RED_MASK   REG_GENMASK(9, 0)
+#define   CGM_PIPE_GAMMA_GREEN_MASK REG_GENMASK(25, 16)
+#define   CGM_PIPE_GAMMA_BLUE_MASK  REG_GENMASK(9, 0)
 
 #define _CGM_PIPE_B_CSC_COEFF01(VLV_DISPLAY_BASE + 0x69900)
 #define _CGM_PIPE_B_CSC_COEFF23(VLV_DISPLAY_BASE + 0x69904)
-- 
1.9.1


[Intel-gfx] [v2][PATCH 0/3] adding gamma state checker for CHV and i965

2019-09-09 Thread Swati Sharma
This patch series adds a state checker to validate gamma LUT values
for the Cherryview and i965 platforms. It extends the patch series
https://patchwork.freedesktop.org/patch/328246/?series=58039
which enabled the basic infrastructure and state checker for a
few legacy platforms.

v2: Added last index rgb lut value from PIPEGCMAX to h/w blob [Jani]

Swati Sharma (3):
  drm/i915/display: Add gamma precision function for CHV
  drm/i915/display: Extract i965_read_luts()
  drm/i915/display: Extract chv_read_luts()

 drivers/gpu/drm/i915/display/intel_color.c | 118 +++--
 drivers/gpu/drm/i915/i915_reg.h|   7 ++
 2 files changed, 119 insertions(+), 6 deletions(-)

-- 
1.9.1


Re: [Intel-gfx] [PATCH v5 04/11] drm/i915/dsb: Indexed register write function for DSB.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

DSB can program a large set of data through an indexed register write
(opcode 0x9) in one shot. The DSB feature can be used for bulk register
programming, e.g. gamma LUT programming or HDR metadata programming.

v1: initial version.
v2: simplified code by using ALIGN(). (Chris)
v3: ascii table added as code comment. (Shashank)
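
A userspace sketch of the buffer layout that the quoted function builds may help when reading the review comments below. It follows the logic of the quoted patch, but the capacity, OPCODE_SHIFT and REG_MASK values are assumptions chosen for illustration, and the fake_dsb structure is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_DWORDS	32		/* hypothetical capacity */
#define OPCODE_INDEXED	0x9u		/* opcode named in the patch */
#define OPCODE_SHIFT	24		/* assumed position of the opcode */
#define REG_MASK	0xfffffu	/* assumed width of the offset field */

struct fake_dsb {
	uint32_t buf[BUF_DWORDS];
	uint32_t free_pos;	/* next free dword */
	uint32_t ins_start;	/* first dword of the current instruction */
};

/* Append one value for 'reg'; returns 0 on success, -1 if the buffer is full. */
static int indexed_write(struct fake_dsb *dsb, uint32_t reg, uint32_t val)
{
	uint32_t *buf = dsb->buf;

	/* Worst case: one pad, two header dwords, one value, one trailing zero. */
	if (dsb->free_pos + 5 > BUF_DWORDS)
		return -1;

	if ((buf[dsb->ins_start + 1] & REG_MASK) != reg) {
		/* New instruction: start on an even dword (8-byte aligned). */
		dsb->free_pos = (dsb->free_pos + 1) & ~1u;
		dsb->ins_start = dsb->free_pos;

		buf[dsb->free_pos++] = 1;	/* size */
		buf[dsb->free_pos++] = (OPCODE_INDEXED << OPCODE_SHIFT) | reg;
		buf[dsb->free_pos++] = val;
	} else {
		/* Same register again: append the value and bump the size. */
		buf[dsb->free_pos++] = val;
		buf[dsb->ins_start]++;
	}

	/* An odd number of dwords is padded with a zero dword. */
	if (dsb->free_pos & 1)
		buf[dsb->free_pos] = 0;

	return 0;
}
```

Three writes to the same register produce size = 3, the opcode/offset dword, three values and one zero pad, which matches the table in the quoted code comment.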

Cc: Shashank Sharma 
Cc: Imre Deak 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 64 
  drivers/gpu/drm/i915/display/intel_dsb.h |  8 +++
  2 files changed, 72 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index 150be81fdfb3..0f55ed683d41 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -15,6 +15,7 @@
  #define DSB_OPCODE_INDEXED_WRITE  0x9
  #define DSB_BYTE_EN   0xF
  #define DSB_BYTE_EN_SHIFT 20
+#define DSB_REG_VALUE_MASK 0xf
  
  struct intel_dsb *

  intel_dsb_get(struct intel_crtc *crtc)
@@ -77,6 +78,69 @@ void intel_dsb_put(struct intel_dsb *dsb)
}
  }
  
+void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg,

+u32 val)
+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   u32 *buf = dsb->cmd_buf;
+   u32 reg_val;
+
+   if (!buf) {
+   I915_WRITE(reg, val);
+   return;
+   }
+
+   if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) {
+   DRM_DEBUG_KMS("DSB buffer overflow.\n");

Again, the '.' at the end can be removed.

+   return;
+   }
+
+   /*
+* For example the buffer will look like below for 3 dwords for auto
+* increment register:
+* ++
+* | size = 3 | offset &| value1 | value2 | value3 | zero   |
+* |  | opcode  |||||
+* ++
+* +  + +++++
+* 0  4 812   16   20   24
+* Byte
+*
+* As every instruction is 8 byte aligned the index of dsb instruction
+* will start always from even number while dealing with u32 array and
+* zero to be added for odd number of dwords at the last.


Let's split this comment into two parts to make it even more useful, like:

"As every instruction ... array". "If we are writing an odd number of
dwords, zeros will be added at the end for padding."


- Shashank


+*/
+   reg_val = buf[dsb->ins_start_offset + 1] & DSB_REG_VALUE_MASK;
+   if (reg_val != i915_mmio_reg_offset(reg)) {
+   /* Every instruction should be 8 byte aligned. */
+   dsb->free_pos = ALIGN(dsb->free_pos, 2);
+
+   dsb->ins_start_offset = dsb->free_pos;
+
+   /* Update the size. */
+   buf[dsb->free_pos++] = 1;
+
+   /* Update the opcode and reg. */
+   buf[dsb->free_pos++] = (DSB_OPCODE_INDEXED_WRITE  <<
+   DSB_OPCODE_SHIFT) |
+   i915_mmio_reg_offset(reg);
+
+   /* Update the value. */
+   buf[dsb->free_pos++] = val;
+   } else {
+   /* Update the new value. */
+   buf[dsb->free_pos++] = val;
+
+   /* Update the size. */
+   buf[dsb->ins_start_offset]++;
+   }
+
+   /* if number of data words is odd, then the last dword should be 0.*/
+   if (dsb->free_pos & 0x1)
+   buf[dsb->free_pos] = 0;
+}
+
  void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val)
  {
struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h 
b/drivers/gpu/drm/i915/display/intel_dsb.h
index 31b87dcfe160..9b2522f20bfb 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.h
+++ b/drivers/gpu/drm/i915/display/intel_dsb.h
@@ -29,11 +29,19 @@ struct intel_dsb {
 * and help in calculating tail of command buffer.
 */
int free_pos;
+
+   /*
+* ins_start_offset will help to store start address
+* of the dsb instruction of auto-increment register.
+*/
+   u32 ins_start_offset;
  };
  
  struct intel_dsb *

  intel_dsb_get(struct intel_crtc *crtc);
  void intel_dsb_put(struct intel_dsb *dsb);
  void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val);
+void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg,
+u32 val);
  
  #endif


Re: [Intel-gfx] [PATCH v5 07/11] drm/i915/dsb: function to trigger workload execution of DSB.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

A batch buffer is built up through the dsb-reg-write function, which can
queue a single request or multiple requests depending on the use case.
Once the buffer is ready, the commit function triggers execution of the
batch buffer. All the registers will be updated simultaneously.

v1: Initial version.
v2: Optimized code in a few places. (Chris)
v3: Used DRM_ERROR for dsb head/tail programming failure. (Shashank)
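
The tail computation in the quoted commit function rounds the used size up to a cacheline and zeroes the gap. A userspace sketch of just that step follows, assuming a 64-byte cacheline and a hypothetical 64-dword buffer; pad_tail() is not a driver function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE	64u	/* assumed value of CACHELINE_BYTES */
#define BUF_DWORDS	64u	/* hypothetical buffer capacity */

/*
 * Round the used size (free_pos dwords) up to a whole cacheline and zero
 * the padding. Returns the tail offset in bytes.
 */
static uint32_t pad_tail(uint32_t *buf, uint32_t free_pos)
{
	uint32_t used = free_pos * 4;
	uint32_t tail = (used + CACHELINE - 1) & ~(CACHELINE - 1);

	assert(tail <= BUF_DWORDS * 4);
	if (tail > used)
		memset(&buf[free_pos], 0, tail - used);

	return tail;
}
```

With 5 dwords (20 bytes) used, the tail lands at byte 64 and dwords 5 to 15 are cleared; a buffer that already ends on a cacheline boundary is left untouched.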

Cc: Imre Deak 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 42 
  drivers/gpu/drm/i915/display/intel_dsb.h |  1 +
  drivers/gpu/drm/i915/i915_reg.h  |  2 ++
  3 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index 56bf41b00f62..853685751540 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -213,3 +213,45 @@ void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t 
reg, u32 val)
   (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
   i915_mmio_reg_offset(reg);
  }
+
+void intel_dsb_commit(struct intel_dsb *dsb)
+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_device *dev = crtc->base.dev;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   enum pipe pipe = crtc->pipe;
+   u32 tail;
+
+   if (!dsb->free_pos)


I see that we return nothing in either the success or the failure case. 
We have some error messages, but I would still like the caller to know, 
through a return value, whether the commit was successful. What do you 
think, Jani?


- Shashank


+   return;
+
+   if (!intel_dsb_enable_engine(dsb))
+   goto reset;
+
+   if (is_dsb_busy(dsb)) {
+   DRM_ERROR("HEAD_PTR write failed - dsb engine is busy.\n");
+   goto reset;
+   }
+   I915_WRITE(DSB_HEAD(pipe, dsb->id), i915_ggtt_offset(dsb->vma));
+
+   tail = ALIGN(dsb->free_pos * 4, CACHELINE_BYTES);
+   if (tail > dsb->free_pos * 4)
+   memset(&dsb->cmd_buf[dsb->free_pos], 0,
+  (tail - dsb->free_pos * 4));
+
+   if (is_dsb_busy(dsb)) {
+   DRM_ERROR("TAIL_PTR write failed - dsb engine is busy.\n");
+   goto reset;
+   }
+   DRM_DEBUG_KMS("DSB execution started - head 0x%x, tail 0x%x\n",
+ i915_ggtt_offset(dsb->vma), tail);
+   I915_WRITE(DSB_TAIL(pipe, dsb->id), i915_ggtt_offset(dsb->vma) + tail);
+   if (wait_for(!is_dsb_busy(dsb), 1)) {
+   DRM_ERROR("Timed out waiting for DSB workload completion.\n");
+   goto reset;
+   }
+
+reset:
+   dsb->free_pos = 0;
+   intel_dsb_disable_engine(dsb);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h 
b/drivers/gpu/drm/i915/display/intel_dsb.h
index 9b2522f20bfb..7389c8c5b665 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.h
+++ b/drivers/gpu/drm/i915/display/intel_dsb.h
@@ -43,5 +43,6 @@ void intel_dsb_put(struct intel_dsb *dsb);
  void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val);
  void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg,
 u32 val);
+void intel_dsb_commit(struct intel_dsb *dsb);
  
  #endif

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2df01386e3de..cfb78a2f94fe 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -11680,6 +11680,8 @@ enum skl_power_gate {
  #define _DSBSL_INSTANCE_BASE  0x70B00
  #define DSBSL_INSTANCE(pipe, id)  (_DSBSL_INSTANCE_BASE + \
 (pipe) * 0x1000 + (id) * 100)
+#define DSB_HEAD(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x0)
+#define DSB_TAIL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x4)
  #define DSB_CTRL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x8)
  #define   DSB_ENABLE  (1 << 31)
  #define   DSB_STATUS  (1 << 0)
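
To make the tail computation and the return-value suggestion in this thread concrete, here is a minimal userspace sketch in C. It is not the driver code: `sketch_dsb`, `sketch_dsb_commit`, the 64-byte cacheline and the 2048-dword capacity are assumptions made purely for illustration, and the hardware busy bit is modelled by a plain flag. It pads the used part of the buffer up to a cacheline boundary, as the patch does with ALIGN(), and reports the outcome to the caller instead of returning void.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CACHELINE_BYTES 64u   /* assumed cacheline size */
#define SKETCH_BUF_DWORDS      2048u /* assumed capacity, in dwords */

/* Round x up to the next multiple of a; a must be a power of two. */
static uint32_t sketch_align_up(uint32_t x, uint32_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Hypothetical stand-in for the DSB state; not the kernel structure. */
struct sketch_dsb {
    uint32_t cmd_buf[SKETCH_BUF_DWORDS];
    uint32_t free_pos; /* index of the first free dword */
    int busy;          /* models the hardware busy bit */
};

/*
 * Hypothetical commit that tells the caller what happened:
 * 0 on success (or when there is nothing to commit), -EBUSY when the
 * engine is busy. *tail_out receives the padded tail offset in bytes.
 */
static int sketch_dsb_commit(struct sketch_dsb *dsb, uint32_t *tail_out)
{
    uint32_t used, tail;
    int ret = 0;

    assert(dsb && tail_out);
    assert(dsb->free_pos <= SKETCH_BUF_DWORDS);

    *tail_out = 0;
    if (!dsb->free_pos)
        return 0;

    if (dsb->busy) {
        ret = -EBUSY;
        goto reset;
    }

    /* free_pos counts dwords, so the used size in bytes is free_pos * 4. */
    used = dsb->free_pos * 4;
    tail = sketch_align_up(used, SKETCH_CACHELINE_BYTES);

    /* Zero the padding between the last command and the aligned tail. */
    if (tail > used)
        memset(&dsb->cmd_buf[dsb->free_pos], 0, tail - used);

    *tail_out = tail;

reset:
    dsb->free_pos = 0;
    return ret;
}
```

With three dwords queued (12 bytes), the padded tail in this sketch is 64 bytes; with 17 dwords (68 bytes) it is 128 bytes. Whether the real function should return an errno-style int or a bool is a design choice for the patch author; the sketch only shows that the existing control flow maps onto a return value without restructuring.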


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Show the logical context ring state on dumping

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] drm/i915: Show the logical context ring 
state on dumping
URL   : https://patchwork.freedesktop.org/series/66422/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6853 -> Patchwork_14323


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/

New tests
-

  New tests have been introduced between CI_DRM_6853 and Patchwork_14323:

### New IGT tests (1) ###

  * igt@i915_selftest@live_gt_lrc:
- Statuses : 45 pass(s)
- Exec time: [0.38, 2.02] s

  

Known issues


  Here are the changes found in Patchwork_14323 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_exec@basic:
- fi-icl-u3:  [PASS][1] -> [DMESG-WARN][2] ([fdo#107724])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@gem_ctx_e...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@gem_ctx_e...@basic.html

  * igt@i915_module_load@reload:
- fi-icl-u3:  [PASS][3] -> [DMESG-WARN][4] ([fdo#107724] / 
[fdo#111214])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@i915_module_l...@reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@i915_module_l...@reload.html

  * igt@kms_chamelium@dp-edid-read:
- fi-cml-u2:  [PASS][5] -> [FAIL][6] ([fdo#109483])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-cml-u2/igt@kms_chamel...@dp-edid-read.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-cml-u2/igt@kms_chamel...@dp-edid-read.html

  * igt@prime_vgem@basic-fence-flip:
- fi-ilk-650: [PASS][7] -> [DMESG-WARN][8] ([fdo#106387]) +1 
similar issue
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-ilk-650/igt@prime_v...@basic-fence-flip.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-ilk-650/igt@prime_v...@basic-fence-flip.html

  
#### Possible fixes ####

  * igt@gem_mmap_gtt@basic-write-read:
- fi-icl-u3:  [DMESG-WARN][9] ([fdo#107724]) -> [PASS][10] +2 
similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@gem_mmap_...@basic-write-read.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@gem_mmap_...@basic-write-read.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#106387]: https://bugs.freedesktop.org/show_bug.cgi?id=106387
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#109483]: https://bugs.freedesktop.org/show_bug.cgi?id=109483
  [fdo#111214]: https://bugs.freedesktop.org/show_bug.cgi?id=111214
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593


Participating hosts (53 -> 47)
--

  Additional (1): fi-kbl-soraka 
  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y 
fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6853 -> Patchwork_14323

  CI-20190529: 20190529
  CI_DRM_6853: ad1a8a60aba111d2c186d19391d5a17bd09ab48b @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14323: 4fe867b7f2167bd9534b401eeba31c56b2ecaeed @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4fe867b7f216 drm/i915/selftests: Verify the LRC register layout between init 
and HW
3f29fa89a00e drm/i915: Show the logical context ring state on dumping

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/

Re: [Intel-gfx] system freeze on i915 system(s) due to commit aa56a292ce623734ddd30f52d73f527d1f3529b5

2019-09-09 Thread howaboutsynergy

‐‐‐ Original Message ‐‐‐
On Monday, September 9, 2019 10:38 AM,  wrote:

> With commit aa56a292ce623734ddd30f52d73f527d1f3529b5 (even on 5.3.0-rc8) I 
> can get a system freeze during chromium compilation (likely due to jumbo / 
> high memory usage). Sysrq still works and CPU/fan is low, so it seems like a 
> deadlock? and there's no disk reading. I can't read the dump gotten via kdump 
> for some reason, else I would've shown a stacktrace by causing kernel to 
> crash via sysrq+c.
>
> I can easily reproduce this freeze in a matter of seconds:
>
> please see https://bugzilla.kernel.org/show_bug.cgi?id=203317#c4
>
> Thanks.

Filed https://bugs.freedesktop.org/show_bug.cgi?id=111601


[Intel-gfx] [PATCH] drm/i915/ringbuffer: Flush writes before RING_TAIL update

2019-09-09 Thread Chris Wilson
Be paranoid and make sure we flush any and all writes out of the WCB
before performing the UC mmio to update the RING_TAIL. (A UC write
should itself be enough to do the flush, hence the paranoia here.) Quite
infrequently, we see problems where the GPU seems to overshoot the
RING_TAIL and so executes garbage, hence the speculation.

References: https://bugs.freedesktop.org/show_bug.cgi?id=111598
References: https://bugs.freedesktop.org/show_bug.cgi?id=111417
References: https://bugs.freedesktop.org/show_bug.cgi?id=111034
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index bbda85dcaa42..73c3ffc80218 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -930,6 +930,7 @@ static void cancel_requests(struct intel_engine_cs *engine)
 static void i9xx_submit_request(struct i915_request *request)
 {
i915_request_submit(request);
+   wmb(); /* paranoid flush writes out of the WCB before mmio */
 
ENGINE_WRITE(request->engine, RING_TAIL,
 intel_ring_set_tail(request->ring, request->tail));
-- 
2.23.0
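
The ordering the patch is after (all command writes visible before the tail moves) can be illustrated with a userspace analogy in C11. This is only a sketch under stated assumptions: `sketch_ring` and `sketch_submit` are hypothetical names, the "tail register" is an ordinary atomic variable, and `atomic_thread_fence(memory_order_release)` stands in for the role `wmb()` plays in the patch. A C11 fence does not model write-combining buffers or uncached MMIO, so this shows the ordering idea, not the hardware behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_DWORDS 16u /* assumed ring size for the sketch */

/* Hypothetical ring: 'tail' stands in for the RING_TAIL register. */
struct sketch_ring {
    uint32_t cmds[SKETCH_RING_DWORDS];
    _Atomic uint32_t tail;
};

/* Copy n commands into the ring, then publish the new tail. */
static void sketch_submit(struct sketch_ring *ring,
                          const uint32_t *cmds, uint32_t n)
{
    uint32_t tail = atomic_load_explicit(&ring->tail, memory_order_relaxed);
    uint32_t i;

    assert(n <= SKETCH_RING_DWORDS);

    for (i = 0; i < n; i++)
        ring->cmds[(tail + i) % SKETCH_RING_DWORDS] = cmds[i];

    /*
     * Order every command store above before the tail store below.
     * A consumer that observes the new tail (with acquire ordering)
     * must then also observe the commands.
     */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&ring->tail, (tail + n) % SKETCH_RING_DWORDS,
                          memory_order_relaxed);
}
```

Without the fence, the compiler or CPU would be free to make the tail update visible before some of the command stores, which is the software analogue of the consumer running past valid commands.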


[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to 
extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66418/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6852_full -> Patchwork_14322_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_14322_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14322_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_14322_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_universal_plane@cursor-fb-leak-pipe-a:
- shard-snb:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-snb4/igt@kms_universal_pl...@cursor-fb-leak-pipe-a.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-snb5/igt@kms_universal_pl...@cursor-fb-leak-pipe-a.html

  * igt@perf@enable-disable:
- shard-kbl:  [PASS][3] -> [DMESG-WARN][4] +3 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-kbl1/igt@p...@enable-disable.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-kbl1/igt@p...@enable-disable.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
- shard-apl:  [PASS][5] -> [DMESG-WARN][6] +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-apl5/igt@p...@gen8-unprivileged-single-ctx-counters.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-apl1/igt@p...@gen8-unprivileged-single-ctx-counters.html
- shard-glk:  [PASS][7] -> [DMESG-WARN][8] +2 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-glk2/igt@p...@gen8-unprivileged-single-ctx-counters.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-glk8/igt@p...@gen8-unprivileged-single-ctx-counters.html
- shard-iclb: [PASS][9] -> [DMESG-WARN][10] +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb8/igt@p...@gen8-unprivileged-single-ctx-counters.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb5/igt@p...@gen8-unprivileged-single-ctx-counters.html

  * igt@perf@mi-rpc:
- shard-skl:  NOTRUN -> [DMESG-WARN][11]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-skl3/igt@p...@mi-rpc.html

  
Known issues


  Here are the changes found in Patchwork_14322_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_schedule@preempt-other-chain-bsd:
- shard-iclb: [PASS][12] -> [SKIP][13] ([fdo#111325]) +5 similar 
issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb3/igt@gem_exec_sched...@preempt-other-chain-bsd.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb4/igt@gem_exec_sched...@preempt-other-chain-bsd.html

  * igt@i915_pm_rpm@system-suspend-modeset:
- shard-iclb: [PASS][14] -> [INCOMPLETE][15] ([fdo#107713] / 
[fdo#108840])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb6/igt@i915_pm_...@system-suspend-modeset.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb2/igt@i915_pm_...@system-suspend-modeset.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
- shard-apl:  [PASS][16] -> [DMESG-WARN][17] ([fdo#108566]) +6 
similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-apl8/igt@i915_susp...@fence-restore-tiled2untiled.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-apl7/igt@i915_susp...@fence-restore-tiled2untiled.html

  * igt@kms_cursor_crc@pipe-c-cursor-64x21-sliding:
- shard-iclb: [PASS][18] -> [INCOMPLETE][19] ([fdo#107713])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb6/igt@kms_cursor_...@pipe-c-cursor-64x21-sliding.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb7/igt@kms_cursor_...@pipe-c-cursor-64x21-sliding.html

  * igt@kms_cursor_legacy@cursor-vs-flip-toggle:
- shard-hsw:  [PASS][20] -> [INCOMPLETE][21] ([fdo#103540])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-hsw8/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-hsw6/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html

  * igt@kms_draw_crc@draw-method-rgb565-pwrite-untiled:
- shard-snb:  [PASS][22] -> [SKIP][23] ([fdo#109271]) +3 similar 
issues
   [22]: 

Re: [Intel-gfx] [PATCH v5 03/11] drm/i915/dsb: single register write function for DSB.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

DSB supports single register writes through opcode 0x1. A generic API is
added which accumulates all single register writes in a batch buffer; once
the DSB is triggered, it programs all the registers at the same time.

v1: Initial version.
v2: Unused macro removed and cosmetic changes done. (Shashank)

Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 30 
  drivers/gpu/drm/i915/display/intel_dsb.h |  9 +++
  2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index cba5c8d37659..150be81fdfb3 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -9,6 +9,13 @@
  
  #define DSB_BUF_SIZE (2 * PAGE_SIZE)
  
+/* DSB opcodes. */

+#define DSB_OPCODE_SHIFT   24
+#define DSB_OPCODE_MMIO_WRITE  0x1
+#define DSB_OPCODE_INDEXED_WRITE   0x9
+#define DSB_BYTE_EN    0xF
+#define DSB_BYTE_EN_SHIFT  20
+
  struct intel_dsb *
  intel_dsb_get(struct intel_crtc *crtc)
  {
@@ -46,6 +53,7 @@ intel_dsb_get(struct intel_crtc *crtc)
goto err;
}
dsb->vma = vma;
+   dsb->free_pos = 0;

This should be done in dsb_put();
  
  err:

intel_runtime_pm_put(&i915->runtime_pm, wakeref);
@@ -68,3 +76,25 @@ void intel_dsb_put(struct intel_dsb *dsb)
mutex_unlock(&i915->drm.struct_mutex);
}
  }
+
+void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val)
+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   u32 *buf = dsb->cmd_buf;
+
+   if (!buf) {
+   I915_WRITE(reg, val);
+   return;
+   }
+
+   if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) {
+   DRM_DEBUG_KMS("DSB buffer overflow.\n");


Let's remove this '.' at the end, to keep the log messages consistent.

- Shashank


+   return;
+   }
+
+   buf[dsb->free_pos++] = val;
+   buf[dsb->free_pos++] = (DSB_OPCODE_MMIO_WRITE  << DSB_OPCODE_SHIFT) |
+  (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
+  i915_mmio_reg_offset(reg);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h 
b/drivers/gpu/drm/i915/display/intel_dsb.h
index 27eb68eb5392..31b87dcfe160 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.h
+++ b/drivers/gpu/drm/i915/display/intel_dsb.h
@@ -6,6 +6,8 @@
  #ifndef _INTEL_DSB_H
  #define _INTEL_DSB_H
  
+#include "i915_reg.h"

+
  struct intel_crtc;
  struct i915_vma;
  
@@ -21,10 +23,17 @@ struct intel_dsb {

enum dsb_id id;
u32 *cmd_buf;
struct i915_vma *vma;
+
+   /*
+* free_pos will point the first free entry position
+* and help in calculating tail of command buffer.
+*/
+   int free_pos;
  };
  
  struct intel_dsb *

  intel_dsb_get(struct intel_crtc *crtc);
  void intel_dsb_put(struct intel_dsb *dsb);
+void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val);
  
  #endif
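
As a worked example of the instruction layout used by intel_dsb_reg_write() above, the following C sketch packs one MMIO write into two dwords: the value first, then a header built from the opcode, the byte-enable mask and the register offset. The four macro values are copied from the patch; the function names, the explicit capacity parameter (counted in dwords) and the example offset 0x70B08 are assumptions of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values as quoted in the patch. */
#define DSB_OPCODE_SHIFT      24
#define DSB_OPCODE_MMIO_WRITE 0x1
#define DSB_BYTE_EN           0xF
#define DSB_BYTE_EN_SHIFT     20

/* Build the header dword: opcode | byte enables | register offset. */
static uint32_t sketch_dsb_mmio_header(uint32_t reg_offset)
{
    return ((uint32_t)DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) |
           ((uint32_t)DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
           reg_offset;
}

/*
 * Append one register write to buf. cap_dwords is the buffer capacity
 * in dwords (an assumption of this sketch). Returns 0 on success and
 * -1 when fewer than two free dwords remain.
 */
static int sketch_dsb_emit(uint32_t *buf, uint32_t cap_dwords,
                           uint32_t *free_pos,
                           uint32_t reg_offset, uint32_t val)
{
    if (*free_pos + 2 > cap_dwords)
        return -1;

    buf[(*free_pos)++] = val;                                /* data dword */
    buf[(*free_pos)++] = sketch_dsb_mmio_header(reg_offset); /* header dword */
    return 0;
}
```

For offset 0x70B08 the header works out as 0x01000000 | 0x00F00000 | 0x00070B08 = 0x01F70B08. The sketch checks for room for both dwords before writing either, so a full buffer is never left with half an instruction.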


Re: [Intel-gfx] [PATCH 3/3] drm/i915: cleanup cache-coloring

2019-09-09 Thread Chris Wilson
Quoting Matthew Auld (2019-09-09 13:40:52)
> Try to tidy up the cache-coloring such that we rid the code of any
> mm.color_adjust assumptions, this should hopefully make it more obvious
> in the code when we need to actually use the cache-level as the color,
> and as a bonus should make adding a different color-scheme simpler.
> 
> Signed-off-by: Matthew Auld 
> Cc: Chris Wilson 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 

Series is
Reviewed-by: Chris Wilson 
-Chris

Re: [Intel-gfx] [PATCH] drm/i915/tgl: Add sysfs interface to control class-of-service

2019-09-09 Thread Joonas Lahtinen
Quoting Prathap Kumar Valsan (2019-08-26 01:48:01)
> To provide shared last-level-cache isolation to cpu workloads running
> concurrently with gpu workloads, the gpu allocation of cache lines needs
> to be restricted to certain ways. Currently GPU hardware supports four
> class-of-service(CLOS) levels and there is an associated way-mask for
> each CLOS.
> 
> Hardware supports reading supported way-mask configuration for GPU using
> a bios pcode interface. The supported way-masks and the one currently
> active is communicated to userspace via a sysfs file--closctrl. Admin user
> can then select a new mask by writing the mask value to the file.
> 
> Note of Caution: Restricting cache ways using this mechanism presents a
> larger attack surface for side-channel attacks.

I wonder if this is enough to justify some further protection before
enabling?

> Example usage:
> The active way-mask is highlighted within square brackets.
> > cat /sys/class/drm/card0/closctrl
> [0x] 0xff00 0xc000 0x8000

How about two files for easier scripting interface?

/sys/class/drm/card0/llc_clos
/sys/class/drm/card0/llc_clos_modes

Regards, Joonas

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Remove incorrect BUG_ON for schedule-out

2019-09-09 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-09-09 11:23:56)
> 
> On 07/09/2019 11:50, Chris Wilson wrote:
> > As we may unwind incomplete requests (for preemption) prior to
> > processing the CSB and the schedule-out events, we may update rq->engine
> > (resetting it to point back to the parent virtual engine) prior to
> > calling execlists_schedule_out(), invalidating the assertion that the
> > request still points to the inflight engine. (The likelihood of this is
> > increased if the CSB interrupt processing is pushed to the ksoftirqd for
> > being too slow and direct submission overtakes it.)
> > 
> > Reported-by: Vinay Belgaumkar 
> > Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the 
> > irq-off spinlock")
> > Signed-off-by: Chris Wilson 
> > Cc: Mika Kuoppala 
> > Cc: Tvrtko Ursulin 
> > Cc: Vinay Belgaumkar 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c | 1 -
> >   1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
> > b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 3aad35b570d4..16f226349525 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -631,7 +631,6 @@ execlists_schedule_out(struct i915_request *rq)
> >   struct intel_engine_cs *cur, *old;
> >   
> >   trace_i915_request_out(rq);
> > - GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
> >   
> >   old = READ_ONCE(ce->inflight);
> >   do
> > 
> 
> So unwind from direct submission resets rq->engine and races with 
> process_csb from the tasklet which notices request has actually 
> completed?

Yup. That's nice and succinct compared to my waffle.
-Chris

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915: Show the logical context ring state on dumping

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] drm/i915: Show the logical context ring 
state on dumping
URL   : https://patchwork.freedesktop.org/series/66422/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
3f29fa89a00e drm/i915: Show the logical context ring state on dumping
4fe867b7f216 drm/i915/selftests: Verify the LRC register layout between init 
and HW
-:60: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'x' - possible side-effects?
#60: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:473:
+#define REG(x) (((x) >> 2) | BUILD_BUG_ON_ZERO(x >= 0x200))

-:61: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in 
parentheses
#61: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:474:
+#define REG16(x) \
+   (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \
+   (((x) >> 2) & 0x7f)

-:61: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'x' - possible side-effects?
#61: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:474:
+#define REG16(x) \
+   (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \
+   (((x) >> 2) & 0x7f)

total: 1 errors, 0 warnings, 2 checks, 1085 lines checked
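
For readers unfamiliar with the MACRO_ARG_REUSE check reported above, here is a small C sketch of the hazard it warns about: a macro whose argument appears twice in its expansion evaluates that argument twice. The names `SKETCH_REG_MACRO`, `sketch_reg_func` and `sketch_next_offset` are hypothetical and only loosely modelled on the quoted REG(x) macro. checkpatch flags the pattern generically; it cannot tell whether the argument is always a side-effect-free constant, as it would have to be for BUILD_BUG_ON_ZERO() in the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: x appears twice, so it is evaluated twice. */
#define SKETCH_REG_MACRO(x) \
    (((x) >> 2) | (((x) >= 0x200) ? 0x100u : 0u))

/* Same computation as a function: the argument is evaluated once. */
static inline uint32_t sketch_reg_func(uint32_t x)
{
    return (x >> 2) | ((x >= 0x200) ? 0x100u : 0u);
}

/* Counts how often the argument expression is actually evaluated. */
static unsigned int sketch_calls;

static uint32_t sketch_next_offset(void)
{
    sketch_calls++;
    return 0x40;
}
```

Passing `sketch_next_offset()` to the macro increments `sketch_calls` twice, while passing it to the function increments it once; both yield 0x10. With a constant argument the two forms behave identically, which is why such a CHECK is often judged acceptable in review.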


Re: [Intel-gfx] [PATCH] drm/i915/execlists: Remove incorrect BUG_ON for schedule-out

2019-09-09 Thread Tvrtko Ursulin


On 07/09/2019 11:50, Chris Wilson wrote:

As we may unwind incomplete requests (for preemption) prior to
processing the CSB and the schedule-out events, we may update rq->engine
(resetting it to point back to the parent virtual engine) prior to
calling execlists_schedule_out(), invalidating the assertion that the
request still points to the inflight engine. (The likelihood of this is
increased if the CSB interrupt processing is pushed to the ksoftirqd for
being too slow and direct submission overtakes it.)

Reported-by: Vinay Belgaumkar 
Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off 
spinlock")
Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
Cc: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_lrc.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3aad35b570d4..16f226349525 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -631,7 +631,6 @@ execlists_schedule_out(struct i915_request *rq)
struct intel_engine_cs *cur, *old;
  
  	trace_i915_request_out(rq);

-   GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
  
  	old = READ_ONCE(ce->inflight);

do



So unwind from direct submission resets rq->engine and races with 
process_csb from the tasklet which notices request has actually 
completed? Seems to hold true in code.


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

[Intel-gfx] system freeze on i915 system(s) due to commit aa56a292ce623734ddd30f52d73f527d1f3529b5

2019-09-09 Thread howaboutsynergy
With commit aa56a292ce623734ddd30f52d73f527d1f3529b5 (even on 5.3.0-rc8) I can 
get a system freeze during chromium compilation (likely due to jumbo / high 
memory usage). Sysrq still works and CPU/fan is low, so it seems like a 
deadlock? and there's no disk reading. I can't read the dump gotten via kdump 
for some reason, else I would've shown a stacktrace by causing kernel to crash 
via sysrq+c.

I can easily reproduce this freeze in a matter of seconds:

please see https://bugzilla.kernel.org/show_bug.cgi?id=203317#c4

Thanks.

Re: [Intel-gfx] [PATCH v5 05/11] drm/i915/dsb: Check DSB engine status.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

As per bspec, check the DSB status before programming any of its
registers. An inline function is added to check the DSB status.

Cc: Michel Thierry 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 9 +
  drivers/gpu/drm/i915/i915_reg.h  | 7 +++
  2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index 0f55ed683d41..2c8415518c65 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -17,6 +17,15 @@
  #define DSB_BYTE_EN_SHIFT 20
  #define DSB_REG_VALUE_MASK 0xf
  
+static inline bool is_dsb_busy(struct intel_dsb *dsb)

+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   enum pipe pipe = crtc->pipe;
+
+   return DSB_STATUS & I915_READ(DSB_CTRL(pipe, dsb->id));
+}
+
  struct intel_dsb *
  intel_dsb_get(struct intel_crtc *crtc)
  {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 006cffd56be2..a3099f712ae6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -11676,4 +11676,11 @@ enum skl_power_gate {
  #define PORT_TX_DFLEXDPCSSS(fia)  _MMIO_FIA((fia), 0x00894)
  #define   DP_PHY_MODE_STATUS_NOT_SAFE(tc_port)(1 << (tc_port))
  
+/* This register controls the Display State Buffer (DSB) engines. */

+#define _DSBSL_INSTANCE_BASE   0x70B00
+#define DSBSL_INSTANCE(pipe, id)   (_DSBSL_INSTANCE_BASE + \
+(pipe) * 0x1000 + (id) * 100)


Why is pipe in () ?

- Shashank


+#define DSB_CTRL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x8)
+#define   DSB_STATUS   (1 << 0)
+
  #endif /* _I915_REG_H_ */
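
On the question of why `pipe` is wrapped in parentheses: this is the usual defensive habit for function-like macros, since the argument is substituted as text and operator precedence applies after substitution. The C sketch below shows the difference with an expression argument. The macro names and the 0x100 stride for `id` are assumptions chosen for the illustration and are not meant as a statement about the correct register layout.

```c
#include <assert.h>

#define SKETCH_BASE 0x70B00

/* Arguments not parenthesized: precedence can change the result. */
#define SKETCH_INSTANCE_BAD(pipe, id) \
    (SKETCH_BASE + pipe * 0x1000 + id * 0x100)

/* Arguments parenthesized: each argument is evaluated as a unit. */
#define SKETCH_INSTANCE_GOOD(pipe, id) \
    (SKETCH_BASE + (pipe) * 0x1000 + (id) * 0x100)

/* Helper showing a call site that passes an expression, not a literal. */
static int sketch_next_pipe_instance(int pipe)
{
    return SKETCH_INSTANCE_GOOD(pipe + 1, 0);
}
```

With the argument `1 + 1`, the unparenthesized form expands to `0x70B00 + 1 + 1 * 0x1000 + 0 * 0x100`, which is 0x71B01, while the parenthesized form gives the intended 0x72B00.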


Re: [Intel-gfx] [PATCH v5 06/11] drm/i915/dsb: functions to enable/disable DSB engine.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

DSB will be used to improve performance in some special scenarios. The
DSB engine is enabled when needed and disabled again after it completes
its work. An API is added for the enable/disable operation using the
DSB_CTRL register.

v1: Initial version.
v2: POSTING_READ added after writing control register. (Shashank)

Cc: Michel Thierry 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 42 
  drivers/gpu/drm/i915/i915_reg.h  |  1 +
  2 files changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index 2c8415518c65..56bf41b00f62 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -26,6 +26,48 @@ static inline bool is_dsb_busy(struct intel_dsb *dsb)
return DSB_STATUS & I915_READ(DSB_CTRL(pipe, dsb->id));
  }
  
+static inline bool intel_dsb_enable_engine(struct intel_dsb *dsb)

+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   enum pipe pipe = crtc->pipe;
+   u32 dsb_ctrl;
+
+   dsb_ctrl = I915_READ(DSB_CTRL(pipe, dsb->id));
+

This blank line is not required.

+   if (DSB_STATUS & dsb_ctrl) {
+   DRM_DEBUG_KMS("DSB engine is busy.\n");
+   return false;
+   }
+
+   dsb_ctrl |= DSB_ENABLE;
+   I915_WRITE(DSB_CTRL(pipe, dsb->id), dsb_ctrl);
+
+   POSTING_READ(DSB_CTRL(pipe, dsb->id));
+   return true;
+}
+
+static inline bool intel_dsb_disable_engine(struct intel_dsb *dsb)
+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   enum pipe pipe = crtc->pipe;
+   u32 dsb_ctrl;
+
+   dsb_ctrl = I915_READ(DSB_CTRL(pipe, dsb->id));
+

Same here.

+   if (DSB_STATUS & dsb_ctrl) {
+   DRM_DEBUG_KMS("DSB engine is busy.\n");
+   return false;
+   }
+
+   dsb_ctrl &= ~DSB_ENABLE;
+   I915_WRITE(DSB_CTRL(pipe, dsb->id), dsb_ctrl);
+
+   POSTING_READ(DSB_CTRL(pipe, dsb->id));
+   return true;
+}
+
  struct intel_dsb *
  intel_dsb_get(struct intel_crtc *crtc)
  {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a3099f712ae6..2df01386e3de 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -11681,6 +11681,7 @@ enum skl_power_gate {
  #define DSBSL_INSTANCE(pipe, id)  (_DSBSL_INSTANCE_BASE + \
 (pipe) * 0x1000 + (id) * 100)
  #define DSB_CTRL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x8)
+#define   DSB_ENABLE   (1 << 31)
  #define   DSB_STATUS  (1 << 0)
  
  #endif /* _I915_REG_H_ */
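
The enable and disable helpers in this patch share one pattern: read the control register, refuse if the status bit says the engine is busy, otherwise flip the enable bit and write the value back. A compact userspace sketch of that read-modify-write is shown below. The register is modelled as a plain variable and `sketch_dsb_set_enable` is a hypothetical name; the bit positions follow the patch, written with an unsigned constant so that shifting into bit 31 is well defined in standard C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DSB_ENABLE (1u << 31) /* bit 31, as in the patch */
#define SKETCH_DSB_STATUS (1u << 0)  /* bit 0: engine busy */

/*
 * Set or clear the enable bit in a simulated control register.
 * Returns false and leaves the register untouched if the engine is busy.
 */
static bool sketch_dsb_set_enable(uint32_t *ctrl, bool enable)
{
    uint32_t val = *ctrl; /* read */

    if (val & SKETCH_DSB_STATUS)
        return false;

    if (enable)           /* modify */
        val |= SKETCH_DSB_ENABLE;
    else
        val &= ~SKETCH_DSB_ENABLE;

    *ctrl = val;          /* write */
    return true;
}
```

Folding both directions into one helper is just one possible arrangement; the patch keeps two separate functions, which reads more explicitly at the call sites.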


Re: [Intel-gfx] [PATCH v5 08/11] drm/i915/dsb: added dsb refcount to synchronize between get/put.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

The lifetime of the command buffer can be controlled by the dsb user
through a refcount. Added a refcount mechanism in the dsb get/put calls,
which create/destroy the dsb context.

Cc: Jani Nikula 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 22 --
  drivers/gpu/drm/i915/display/intel_dsb.h |  1 +
  2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
index 853685751540..b951a6b5264a 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -78,7 +78,12 @@ intel_dsb_get(struct intel_crtc *crtc)
struct intel_dsb *dsb = &crtc->dsb;
intel_wakeref_t wakeref;
  
-	if ((!HAS_DSB(i915)) || dsb->cmd_buf)

+   if (!HAS_DSB(i915))
+   return dsb;
+
+   atomic_inc(&dsb->refcount);
+


As discussed, we are not solving any problem with reference counting; 
rather, we are adding complexity here. It may be useful when we extend 
the single DSB instance to a DSB pool, but not right now.


I would say we drop this patch altogether and just keep the simple 
implementation for now.


- Shashank


+   if (dsb->cmd_buf)
return dsb;
  
  	dsb->id = DSB1;

@@ -94,6 +99,7 @@ intel_dsb_get(struct intel_crtc *crtc)
if (IS_ERR(vma)) {
DRM_ERROR("Vma creation failed.\n");
i915_gem_object_put(obj);
+   atomic_dec(&dsb->refcount);
goto err;
}
  
@@ -102,6 +108,7 @@ intel_dsb_get(struct intel_crtc *crtc)

DRM_ERROR("Command buffer creation failed.\n");
i915_vma_unpin_and_release(&vma, 0);
dsb->cmd_buf = NULL;
+   atomic_dec(&dsb->refcount);
goto err;
}
dsb->vma = vma;
@@ -121,11 +128,14 @@ void intel_dsb_put(struct intel_dsb *dsb)
return;
  
  	if (dsb->cmd_buf) {

-   mutex_lock(&i915->drm.struct_mutex);
-   i915_gem_object_unpin_map(dsb->vma->obj);
-   i915_vma_unpin_and_release(&dsb->vma, 0);
-   dsb->cmd_buf = NULL;
-   mutex_unlock(&i915->drm.struct_mutex);
+   atomic_dec(&dsb->refcount);
+   if (!atomic_read(&dsb->refcount)) {
+   mutex_lock(&i915->drm.struct_mutex);
+   i915_gem_object_unpin_map(dsb->vma->obj);
+   i915_vma_unpin_and_release(&dsb->vma, 0);
+   dsb->cmd_buf = NULL;
+   mutex_unlock(&i915->drm.struct_mutex);
+   }
}
  }
  
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h

index 7389c8c5b665..dca4e632dd3c 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.h
+++ b/drivers/gpu/drm/i915/display/intel_dsb.h
@@ -20,6 +20,7 @@ enum dsb_id {
  };
  
  struct intel_dsb {

+   atomic_t refcount;
enum dsb_id id;
u32 *cmd_buf;
struct i915_vma *vma;
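
For illustration, the get/put lifetime that this patch describes can be sketched in userspace C with C11 atomics. This is only a sketch under assumptions: `demo_dsb`, `demo_dsb_get`, `demo_dsb_put` and `DEMO_BUF_WORDS` are hypothetical names, `calloc`/`free` stand in for the GEM object and vma handling, and the get side is not made thread-safe. What it shows is the put side, where the decrement and the zero check happen in one atomic step, whereas the hunk above does them as two separate calls (`atomic_dec()` followed by `atomic_read()`).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_BUF_WORDS 2048 /* arbitrary size chosen for the sketch */

/* Hypothetical stand-in for struct intel_dsb; not the driver's type. */
struct demo_dsb {
	atomic_int refcount;
	unsigned int *cmd_buf;
};

/* Take a reference; allocate the buffer if it does not exist yet. */
static bool demo_dsb_get(struct demo_dsb *dsb)
{
	atomic_fetch_add(&dsb->refcount, 1);
	if (dsb->cmd_buf)
		return true;

	dsb->cmd_buf = calloc(DEMO_BUF_WORDS, sizeof(*dsb->cmd_buf));
	if (!dsb->cmd_buf) {
		/* Undo the reference on failure, as the patch does. */
		atomic_fetch_sub(&dsb->refcount, 1);
		return false;
	}
	return true;
}

/* Drop a reference; returns true if this call freed the buffer. */
static bool demo_dsb_put(struct demo_dsb *dsb)
{
	/* atomic_fetch_sub returns the old value: 1 means last user. */
	if (atomic_fetch_sub(&dsb->refcount, 1) != 1)
		return false;

	free(dsb->cmd_buf);
	dsb->cmd_buf = NULL;
	return true;
}
```

In kernel code the same single-step pattern is usually written with `atomic_dec_and_test()`. Whether refcounting is wanted here at all is the open question raised in the review above.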


Re: [Intel-gfx] [PATCH v4 3/7] drm: Add DisplayPort colorspace property

2019-09-09 Thread Ville Syrjälä
On Sat, Sep 07, 2019 at 11:19:55PM +, Mun, Gwan-gyeong wrote:
> On Fri, 2019-09-06 at 09:24 -0400, Ilia Mirkin wrote:
> > On Fri, Sep 6, 2019 at 7:43 AM Ville Syrjälä
> >  wrote:
> > > On Fri, Sep 06, 2019 at 11:31:55AM +, Shankar, Uma wrote:
> > > > 
> > > > > -Original Message-
> > > > > From: Ilia Mirkin 
> > > > > Sent: Tuesday, September 3, 2019 6:12 PM
> > > > > To: Mun, Gwan-gyeong 
> > > > > Cc: Intel Graphics Development  > > > > >; Shankar, Uma
> > > > > ; dri-devel <
> > > > > dri-de...@lists.freedesktop.org>
> > > > > Subject: Re: [PATCH v4 3/7] drm: Add DisplayPort colorspace
> > > > > property
> > > > > 
> > > > > So how would this work with a DP++ connector? Should it list
> > > > > the HDMI or DP
> > > > > properties? Or do we need a custom property checker which is
> > > > > aware of what is
> > > > > currently plugged in to validate the values?
> > > > 
> > > > AFAIU, for DP++ cases we detect what kind of sink it's driving: DP
> > > > or HDMI (with a passive dongle).
> > > > Based on the type of sink detected, we should expose DP or HDMI
> > > > colorspaces to userspace.
> > > 
> > > For i915 DP connector always drives DP mode, HDMI connector always
> > > drives
> > > HDMI mode, even when the physical connector is DP++.
> > 
> > Right, i915 creates 2 connectors, while nouveau, radeon, and amdgpu
> > create 1 connector (not sure about other drivers) for a single
> > physical DP++ socket. Since we supply the list of valid values at the
> > time of creating the connector, we can't know at that point whether
> > in
> > the future a HDMI or DP will be plugged into it.
> > 
> >   -ilia
> Ilia, does it mean that the drm_connector type is
> DRM_MODE_CONNECTOR_DisplayPort and protocol is DP++ mode?
> 
> And Ville and Uma, when we are using a DP active dongle (DP to HDMI
> dongle, where the DP branch device is HDMI), should we expose the HDMI
> colorspace?

We still set it up via DP MSA/VSC no? In that case it should follow the
DP spec I think. LSPCON is probably different because we manually generate
the AVI infoframe for it. But I'm not sure how we're going to reconcile
that with the DP stuff we also set up for it.

-- 
Ville Syrjälä
Intel

[Intel-gfx] [PATCH 2/3] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust

2019-09-09 Thread Matthew Auld
Make it clear that the color adjust callback applies to the ggtt.

Signed-off-by: Matthew Auld 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 10 +-
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 095f5e358a58..48688d683e95 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2547,10 +2547,10 @@ static int ggtt_set_pages(struct i915_vma *vma)
return 0;
 }
 
-static void i915_gtt_color_adjust(const struct drm_mm_node *node,
- unsigned long color,
- u64 *start,
- u64 *end)
+static void i915_ggtt_color_adjust(const struct drm_mm_node *node,
+  unsigned long color,
+  u64 *start,
+  u64 *end)
 {
if (i915_node_color_differs(node, color))
*start += I915_GTT_PAGE_SIZE;
@@ -3206,7 +3206,7 @@ static int ggtt_init_hw(struct i915_ggtt *ggtt)
ggtt->vm.has_read_only = IS_VALLEYVIEW(i915);
 
if (!HAS_LLC(i915) && !HAS_PPGTT(i915))
-   ggtt->vm.mm.color_adjust = i915_gtt_color_adjust;
+   ggtt->vm.mm.color_adjust = i915_ggtt_color_adjust;
 
if (!io_mapping_init_wc(&ggtt->iomap,
ggtt->gmadr.start,
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index cb30c669b1b7..fca38167bdce 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -276,7 +276,7 @@ static int igt_evict_for_cache_color(void *arg)
 
/* Currently the use of color_adjust is limited to cache domains within
 * the ggtt, and so the presence of mm.color_adjust is assumed to be
-* i915_gtt_color_adjust throughout our driver, so using a mock color
+* i915_ggtt_color_adjust throughout our driver, so using a mock color
 * adjust will work just fine for our purposes.
 */
ggtt->vm.mm.color_adjust = mock_color_adjust;
-- 
2.20.1


[Intel-gfx] [PATCH 3/3] drm/i915: cleanup cache-coloring

2019-09-09 Thread Matthew Auld
Try to tidy up the cache-coloring so that we rid the code of any
mm.color_adjust assumptions. This should hopefully make it more obvious
in the code when we actually need to use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.

Signed-off-by: Matthew Auld 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  6 +++--
 drivers/gpu/drm/i915/i915_drv.h   |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c | 12 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  6 +
 drivers/gpu/drm/i915/i915_vma.c   | 22 +--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   | 10 +
 7 files changed, 34 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 6af740a5e3db..da3e7cf12aa1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -294,8 +294,10 @@ int i915_gem_object_set_cache_level(struct 
drm_i915_gem_object *obj,
}
}
 
-   list_for_each_entry(vma, &obj->vma.list, obj_link)
-   vma->node.color = cache_level;
+   list_for_each_entry(vma, &obj->vma.list, obj_link) {
+   if (i915_vm_has_cache_coloring(vma->vm))
+   vma->node.color = cache_level;
+   }
i915_gem_object_set_cache_coherency(obj, cache_level);
obj->cache_dirty = true; /* Always invalidate stale cachelines */
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..e289b4ffd34b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2364,7 +2364,7 @@ i915_gem_context_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
  u64 min_size, u64 alignment,
- unsigned cache_level,
+ unsigned long color,
  u64 start, u64 end,
  unsigned flags);
 int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 52c86c6e0673..e76c9da9992d 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -70,7 +70,7 @@ mark_free(struct drm_mm_scan *scan,
  * @vm: address space to evict from
  * @min_size: size of the desired free space
  * @alignment: alignment constraint of the desired free space
- * @cache_level: cache_level for the desired space
+ * @color: color for the desired space
  * @start: start (inclusive) of the range from which to evict objects
  * @end: end (exclusive) of the range from which to evict objects
  * @flags: additional flags to control the eviction algorithm
@@ -91,7 +91,7 @@ mark_free(struct drm_mm_scan *scan,
 int
 i915_gem_evict_something(struct i915_address_space *vm,
 u64 min_size, u64 alignment,
-unsigned cache_level,
+unsigned long color,
 u64 start, u64 end,
 unsigned flags)
 {
@@ -124,7 +124,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
if (flags & PIN_MAPPABLE)
mode = DRM_MM_INSERT_LOW;
drm_mm_scan_init_with_range(&scan, &vm->mm,
-   min_size, alignment, cache_level,
+   min_size, alignment, color,
start, end, mode);
 
/*
@@ -266,7 +266,6 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
u64 start = target->start;
u64 end = start + target->size;
struct i915_vma *vma, *next;
-   bool check_color;
int ret = 0;
 
lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -283,8 +282,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
if (!(flags & PIN_NONBLOCK))
i915_retire_requests(vm->i915);
 
-   check_color = vm->mm.color_adjust;
-   if (check_color) {
+   if (i915_vm_has_cache_coloring(vm)) {
/* Expand search to cover neighbouring guard pages (or lack!) */
if (start)
start -= I915_GTT_PAGE_SIZE;
@@ -310,7 +308,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 * abutt and conflict. If they are in conflict, then we evict
 * those as well to make room for our guard pages.
 */
-   if (check_color) {
+   if (i915_vm_has_cache_coloring(vm)) {
if (node->start + node->size == 

[Intel-gfx] [PATCH 1/3] drm/i915: export color_differs

2019-09-09 Thread Matthew Auld
Export color_differs so that we can use it elsewhere.

Signed-off-by: Matthew Auld 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  2 +-
 drivers/gpu/drm/i915/i915_vma.c | 11 ---
 drivers/gpu/drm/i915/i915_vma.h |  6 ++
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 906dc6fff383..095f5e358a58 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2552,7 +2552,7 @@ static void i915_gtt_color_adjust(const struct 
drm_mm_node *node,
  u64 *start,
  u64 *end)
 {
-   if (node->allocated && node->color != color)
+   if (i915_node_color_differs(node, color))
*start += I915_GTT_PAGE_SIZE;
 
/* Also leave a space between the unallocated reserved node after the
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e0e677b2a3a9..a90bd2678353 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -477,11 +477,6 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma *vma)
vma->flags &= ~I915_VMA_CAN_FENCE;
 }
 
-static bool color_differs(struct drm_mm_node *node, unsigned long color)
-{
-   return node->allocated && node->color != color;
-}
-
 bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level)
 {
struct drm_mm_node *node = &vma->node;
@@ -502,11 +497,13 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, 
unsigned long cache_level)
GEM_BUG_ON(list_empty(&node->node_list));
 
other = list_prev_entry(node, node_list);
-   if (color_differs(other, cache_level) && !drm_mm_hole_follows(other))
+   if (i915_node_color_differs(other, cache_level) &&
+   !drm_mm_hole_follows(other))
return false;
 
other = list_next_entry(node, node_list);
-   if (color_differs(other, cache_level) && !drm_mm_hole_follows(node))
+   if (i915_node_color_differs(other, cache_level) &&
+   !drm_mm_hole_follows(node))
return false;
 
return true;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 889fc7cb910a..5b1e0cf7669d 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -373,6 +373,12 @@ static inline bool i915_vma_is_bound(const struct i915_vma 
*vma,
return vma->flags & where;
 }
 
+static inline bool i915_node_color_differs(const struct drm_mm_node *node,
+  unsigned long color)
+{
+   return node->allocated && node->color != color;
+}
+
 /**
  * i915_vma_pin_iomap - calls ioremap_wc to map the GGTT VMA via the aperture
  * @vma: VMA to iomap
-- 
2.20.1
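
As a rough illustration of what the exported helper is used for, the sketch below shrinks a hole by one guard page on each side where the neighbouring node is allocated with a different color. It is a simplified model under assumptions, not the driver's code: `demo_node`, `demo_color_adjust` and `DEMO_GTT_PAGE_SIZE` are hypothetical names, the neighbours are passed in explicitly instead of being found via the drm_mm node list, and the page size is assumed to be 4 KiB. Only the predicate `node->allocated && node->color != color` is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_GTT_PAGE_SIZE 4096u /* assumed 4 KiB guard page */

/* Hypothetical stand-in for struct drm_mm_node. */
struct demo_node {
	bool allocated;
	unsigned long color;
};

/* Same predicate as i915_node_color_differs() in the patch. */
static bool demo_node_color_differs(const struct demo_node *node,
				    unsigned long color)
{
	return node->allocated && node->color != color;
}

/*
 * Shrink the hole [*start, *end) so that a node of the given color never
 * directly abuts a neighbour of a different color.
 */
static void demo_color_adjust(const struct demo_node *prev,
			      const struct demo_node *next,
			      unsigned long color,
			      uint64_t *start, uint64_t *end)
{
	if (demo_node_color_differs(prev, color))
		*start += DEMO_GTT_PAGE_SIZE;
	if (demo_node_color_differs(next, color))
		*end -= DEMO_GTT_PAGE_SIZE;
}
```

With both neighbours allocated in color 1, a request for color 2 loses one page at each end of the hole; a request for color 1 leaves the hole untouched.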


Re: [Intel-gfx] [PATCH v5 02/11] drm/i915/dsb: DSB context creation.

2019-09-09 Thread Sharma, Shashank


On 9/7/2019 4:37 PM, Animesh Manna wrote:

This patch adds a function which internally gets the GEM buffer
for the DSB engine. The GEM buffer comes from the global GTT, is mapped
into the CPU domain, and contains the data + opcode to be fed to the DSB engine.

v1: Initial version.

v2:
- removed some unwanted code. (Chris)
- Used i915_gem_object_create_internal instead of _shmem. (Chris)
- cmd_buf_tail removed and can be derived through vma object. (Chris)

v3: vma released if i915_gem_object_pin_map() failed. (Shashank)

v4: for simplification and based on current usage added single dsb
object in intel_crtc. (Shashank)

Cc: Imre Deak 
Cc: Michel Thierry 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/Makefile |  1 +
  .../drm/i915/display/intel_display_types.h|  3 +
  drivers/gpu/drm/i915/display/intel_dsb.c  | 70 +++
  drivers/gpu/drm/i915/display/intel_dsb.h  | 30 
  drivers/gpu/drm/i915/i915_drv.h   |  1 +
  5 files changed, 105 insertions(+)
  create mode 100644 drivers/gpu/drm/i915/display/intel_dsb.c
  create mode 100644 drivers/gpu/drm/i915/display/intel_dsb.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 658b930d34a8..6313e7b4bd78 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -172,6 +172,7 @@ i915-y += \
display/intel_display_power.o \
display/intel_dpio_phy.o \
display/intel_dpll_mgr.o \
+   display/intel_dsb.o \
display/intel_fbc.o \
display/intel_fifo_underrun.o \
display/intel_frontbuffer.o \
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index d5cc4b810d9e..49c902b00484 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -1033,6 +1033,9 @@ struct intel_crtc {
  
  	/* scalers available on this crtc */

int num_scalers;
+
+   /* per pipe DSB related info */
+   struct intel_dsb dsb;
  };
  
  struct intel_plane {

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c
new file mode 100644
index ..cba5c8d37659
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ */
+
+#include "i915_drv.h"
+#include "intel_display_types.h"
+
+#define DSB_BUF_SIZE        (2 * PAGE_SIZE)
+
+struct intel_dsb *
+intel_dsb_get(struct intel_crtc *crtc)
+{
+   struct drm_device *dev = crtc->base.dev;
+   struct drm_i915_private *i915 = to_i915(dev);
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   struct intel_dsb *dsb = &crtc->dsb;
+   intel_wakeref_t wakeref;
+
+   if ((!HAS_DSB(i915)) || dsb->cmd_buf)
+   return dsb;
+
+   dsb->id = DSB1;
+   wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+
+   obj = i915_gem_object_create_internal(i915, DSB_BUF_SIZE);
+   if (IS_ERR(obj))
+   goto err;
+
+   mutex_lock(&i915->drm.struct_mutex);
+   vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE);
+   mutex_unlock(&i915->drm.struct_mutex);
+   if (IS_ERR(vma)) {
+   DRM_ERROR("Vma creation failed.\n");
+   i915_gem_object_put(obj);
+   goto err;
+   }
+
+   dsb->cmd_buf = i915_gem_object_pin_map(vma->obj, I915_MAP_WC);
+   if (IS_ERR(dsb->cmd_buf)) {
+   DRM_ERROR("Command buffer creation failed.\n");
+   i915_vma_unpin_and_release(&vma, 0);
+   dsb->cmd_buf = NULL;
+   goto err;
+   }
+   dsb->vma = vma;
+
+err:
+   intel_runtime_pm_put(&i915->runtime_pm, wakeref);
+   return dsb;
+}
+
+void intel_dsb_put(struct intel_dsb *dsb)
+{
+   struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+   struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+
+   if (!dsb)
+   return;
+
+   if (dsb->cmd_buf) {
+   mutex_lock(&i915->drm.struct_mutex);
+   i915_gem_object_unpin_map(dsb->vma->obj);
+   i915_vma_unpin_and_release(&dsb->vma, 0);
+   dsb->cmd_buf = NULL;


This can be done outside mutex_unlock();

- Shashank


+   mutex_unlock(&i915->drm.struct_mutex);
+   }
+}
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h 
b/drivers/gpu/drm/i915/display/intel_dsb.h
new file mode 100644
index ..27eb68eb5392
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_dsb.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef _INTEL_DSB_H
+#define _INTEL_DSB_H
+
+struct intel_crtc;
+struct i915_vma;
+
+enum dsb_id {
+   INVALID_DSB = -1,
+   DSB1,
+   DSB2,
+   DSB3,
+   MAX_DSB_PER_PIPE
+};
+
+struct intel_dsb {
+   enum 

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI

2019-09-09 Thread Patchwork
== Series Details ==

Series: drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI
URL   : https://patchwork.freedesktop.org/series/66439/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6854_full -> Patchwork_14330_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_14330_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_reloc@basic-cpu-active:
- shard-skl:  [PASS][1] -> [DMESG-WARN][2] ([fdo#106107])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl7/igt@gem_exec_re...@basic-cpu-active.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl4/igt@gem_exec_re...@basic-cpu-active.html

  * igt@gem_exec_schedule@preempt-queue-bsd1:
- shard-iclb: [PASS][3] -> [SKIP][4] ([fdo#109276]) +11 similar 
issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb1/igt@gem_exec_sched...@preempt-queue-bsd1.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb8/igt@gem_exec_sched...@preempt-queue-bsd1.html

  * igt@gem_exec_schedule@preemptive-hang-bsd:
- shard-iclb: [PASS][5] -> [SKIP][6] ([fdo#111325]) +3 similar 
issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb5/igt@gem_exec_sched...@preemptive-hang-bsd.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb1/igt@gem_exec_sched...@preemptive-hang-bsd.html

  * igt@i915_suspend@forcewake:
- shard-skl:  [PASS][7] -> [INCOMPLETE][8] ([fdo#104108])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl7/igt@i915_susp...@forcewake.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl1/igt@i915_susp...@forcewake.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
- shard-iclb: [PASS][9] -> [INCOMPLETE][10] ([fdo#107713] / 
[fdo#109507])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_f...@flip-vs-suspend-interruptible.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb7/igt@kms_f...@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-render:
- shard-iclb: [PASS][11] -> [FAIL][12] ([fdo#103167]) +1 similar 
issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb7/igt@kms_frontbuffer_track...@fbc-1p-primscrn-spr-indfb-draw-render.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb2/igt@kms_frontbuffer_track...@fbc-1p-primscrn-spr-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@psr-suspend:
- shard-skl:  [PASS][13] -> [INCOMPLETE][14] ([fdo#104108] / 
[fdo#106978])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl9/igt@kms_frontbuffer_track...@psr-suspend.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl6/igt@kms_frontbuffer_track...@psr-suspend.html

  * igt@kms_plane@pixel-format-pipe-b-planes:
- shard-hsw:  [PASS][15] -> [INCOMPLETE][16] ([fdo#103540])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-hsw7/igt@kms_pl...@pixel-format-pipe-b-planes.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-hsw4/igt@kms_pl...@pixel-format-pipe-b-planes.html

  * igt@kms_psr2_su@frontbuffer:
- shard-iclb: [PASS][17] -> [SKIP][18] ([fdo#109642] / [fdo#111068])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_psr2...@frontbuffer.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb6/igt@kms_psr2...@frontbuffer.html

  * igt@kms_psr@psr2_sprite_mmap_gtt:
- shard-iclb: [PASS][19] -> [SKIP][20] ([fdo#109441]) +1 similar 
issue
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_gtt.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb5/igt@kms_psr@psr2_sprite_mmap_gtt.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
- shard-apl:  [PASS][21] -> [DMESG-WARN][22] ([fdo#108566]) +1 
similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-apl2/igt@kms_vbl...@pipe-a-ts-continuation-suspend.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-apl8/igt@kms_vbl...@pipe-a-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-b-ts-continuation-suspend:
- shard-kbl:  [PASS][23] -> [INCOMPLETE][24] ([fdo#103665])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-kbl2/igt@kms_vbl...@pipe-b-ts-continuation-suspend.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-kbl6/igt@kms_vbl...@pipe-b-ts-continuation-suspend.html

  
 Possible fixes 

  

Re: [Intel-gfx] [PATCH 9/9] drm/i915: Expand subslice mask

2019-09-09 Thread Summers, Stuart
On Fri, 2019-09-06 at 19:13 +0100, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-02 14:42:44)
> > 
> > On 24/07/2019 14:05, Tvrtko Ursulin wrote:
> > > 
> > > On 23/07/2019 16:49, Stuart Summers wrote:
> > > > +u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu,
> > > > u8 slice)
> > > > +{
> > > > +int i, offset = slice * sseu->ss_stride;
> > > > +u32 mask = 0;
> > > > +
> > > > +if (slice >= sseu->max_slices) {
> > > > +DRM_ERROR("%s: invalid slice %d, max: %d\n",
> > > > +  __func__, slice, sseu->max_slices);
> > > > +return 0;
> > > > +}
> > > > +
> > > > +if (sseu->ss_stride > sizeof(mask)) {
> > > > +DRM_ERROR("%s: invalid subslice stride %d, max:
> > > > %lu\n",
> > > > +  __func__, sseu->ss_stride, sizeof(mask));
> > > > +return 0;
> > > > +}
> > > > +
> > > > +for (i = 0; i < sseu->ss_stride; i++)
> > > > +mask |= (u32)sseu->subslice_mask[offset + i] <<
> > > > +i * BITS_PER_BYTE;
> > > > +
> > > > +return mask;
> > > > +}
> > > 
> > > Why do you actually need these complications when the plan from
> > > the 
> > > start was that the driver and user sseu representation structures
> > > can be 
> > > different?
> > > 
> > > I only gave it a quick look so I might be wrong, but why not just
> > > expand 
> > > the driver representations of subslice mask up from u8? Userspace
> > > API 
> > > should be able to cope with strides already.
> > 
> > I never got an answer to this and the series was merged in the
> > meantime.

Thanks for the note here Tvrtko and sorry for the missed response! For
some reason I hadn't caught this comment earlier :(

> > 
> > Maybe not much harm but I still don't understand why all the 
> > complications seemingly just to avoid bumping the *internal* ss
> > mask up 
> > from u8. As long as the internal and abi sseu info struct are well 
> > separated and access point few and well controlled (I think they
> > are) 
> > then I don't see why the internal side had to be converted to u8
> > and 
> > strides. But maybe I am missing something.
> 
> I looked at it and thought it was open-coding bitmap.h as well. I
> accepted it in good faith that it improved certain use cases and
> should
> even make tidying up the code without regressing those easier.

The goal here is to make sure we have an infrastructure in place that
always provides a consistent bit layout to userspace regardless of
underlying architecture endianness. Perhaps this could have been made
more clear in the commit message here.

Thanks,
Stuart

> -Chris
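
To make the endianness point concrete, here is a minimal standalone sketch of the byte-array-with-stride lookup quoted above. The names `demo_get_subslices` and `DEMO_BITS_PER_BYTE` are hypothetical and the error logging is dropped; the logic otherwise follows the quoted `intel_sseu_get_subslices()`. Because the mask is assembled with shifts rather than by reinterpreting memory, byte `i` of a slice always lands in bits `8*i .. 8*i+7`, whatever the host byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_BYTE 8

/*
 * Assemble the subslice mask of one slice from a flat byte array that
 * stores ss_stride bytes per slice. Returns 0 for invalid arguments.
 */
static uint32_t demo_get_subslices(const uint8_t *subslice_mask,
				   size_t ss_stride, size_t max_slices,
				   size_t slice)
{
	size_t offset = slice * ss_stride;
	uint32_t mask = 0;
	size_t i;

	if (slice >= max_slices || ss_stride > sizeof(mask))
		return 0;

	/* Byte i contributes bits [8*i, 8*i + 7] of the result. */
	for (i = 0; i < ss_stride; i++)
		mask |= (uint32_t)subslice_mask[offset + i] <<
			(i * DEMO_BITS_PER_BYTE);

	return mask;
}
```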



Re: [Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled

2019-09-09 Thread Daniele Ceraolo Spurio



On 9/7/19 1:39 AM, Chris Wilson wrote:

Quoting Daniele Ceraolo Spurio (2019-09-06 23:28:05)



On 9/5/19 2:09 AM, Janusz Krzysztofik wrote:

When trying to reset a device with reset capability disabled or not
supported while rings are full of requests, it has been observed when
running in execlists submission mode that command stream buffer tail
tends to be incremented by apparently still running GPU regardless of
all requests being already cancelled and command stream buffer pointers
reset.  As a result, kernel panic on NULL pointer dereference occurs
when a trace_ports() helper is called with command stream buffer tail
incremented but request pointers being NULL during final
__intel_gt_set_wedged() operation called from intel_gt_reset().

Skip actual reset procedure if reset is disabled or not supported.


This last sentence is a bit confusing. You're not skipping the reset
procedure, you're skipping the attempt of unwedging and resetting again
after a reset & wedge already happened.


Loss of email over the last week, so jumping in at the end. My gut
response is that this is still just papering over the bug, as what you
say above makes no sense.
-Chris



The issue here is that if we don't reset the HW when we wedge, whatever 
was running on the engines might complete at any point after that, which 
generates an unexpected post-wedge CSB event that we don't handle 
gracefully when we unwedge. The CSB event might arrive at any time (even 
after the unwedge) or cause weird behavior on the first re-submission, 
so trying to handle it is not worth the effort IMO since having reset 
disabled is a debug-only use-case.


Daniele

Re: [Intel-gfx] [PATCH v5 03/11] drm/i915/dsb: single register write function for DSB.

2019-09-09 Thread Animesh Manna



On 9/9/2019 6:28 PM, Sharma, Shashank wrote:


On 9/7/2019 4:37 PM, Animesh Manna wrote:

DSB supports single register writes through opcode 0x1. A generic
API is created which accumulates all single register writes in a batch
buffer; once DSB is triggered, it programs all the registers
at the same time.

v1: Initial version.
v2: Unused macro removed and cosmetic changes done. (Shashank)

Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Cc: Shashank Sharma 
Signed-off-by: Animesh Manna 
---
  drivers/gpu/drm/i915/display/intel_dsb.c | 30 
  drivers/gpu/drm/i915/display/intel_dsb.h |  9 +++
  2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c 
b/drivers/gpu/drm/i915/display/intel_dsb.c

index cba5c8d37659..150be81fdfb3 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -9,6 +9,13 @@
 #define DSB_BUF_SIZE        (2 * PAGE_SIZE)
 
+/* DSB opcodes. */
+#define DSB_OPCODE_SHIFT            24
+#define DSB_OPCODE_MMIO_WRITE       0x1
+#define DSB_OPCODE_INDEXED_WRITE    0x9
+#define DSB_BYTE_EN                 0xF
+#define DSB_BYTE_EN_SHIFT           20
+
+
  struct intel_dsb *
  intel_dsb_get(struct intel_crtc *crtc)
  {
@@ -46,6 +53,7 @@ intel_dsb_get(struct intel_crtc *crtc)
  goto err;
  }
  dsb->vma = vma;
+dsb->free_pos = 0;

This should be done in dsb_put();

err:
  intel_runtime_pm_put(&i915->runtime_pm, wakeref);
@@ -68,3 +76,25 @@ void intel_dsb_put(struct intel_dsb *dsb)
  mutex_unlock(&i915->drm.struct_mutex);
  }
  }
+
+void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 
val)

+{
+struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
+struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+u32 *buf = dsb->cmd_buf;
+
+if (!buf) {
+I915_WRITE(reg, val);
+return;
+}
+
+if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) {
+DRM_DEBUG_KMS("DSB buffer overflow.\n");


Let's remove this '.' at the end, to maintain consistency in the log.


Sure.

Regards,
Animesh
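
As a sketch of the batching idea in the commit message, the snippet below queues one register write as two dwords in a command buffer. The macro values are the ones from the patch; everything else is an assumption made for illustration, since the rest of the function is cut off in the quote above. In particular, the two-dword layout (value first, then a header holding opcode, byte enables and register offset), the names `demo_dsb_buf` and `demo_dsb_reg_write`, and a capacity counted in dwords are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the patch. */
#define DSB_OPCODE_SHIFT	24
#define DSB_OPCODE_MMIO_WRITE	0x1
#define DSB_BYTE_EN		0xF
#define DSB_BYTE_EN_SHIFT	20

/* Hypothetical stand-in for the command buffer state in struct intel_dsb. */
struct demo_dsb_buf {
	uint32_t *cmd_buf;
	size_t free_pos;	/* index of the next free dword */
	size_t capacity;	/* buffer size in dwords */
};

/* Queue one register write; returns false if the buffer is full. */
static bool demo_dsb_reg_write(struct demo_dsb_buf *dsb,
			       uint32_t reg_offset, uint32_t val)
{
	/* Assumed: the offset must fit below the byte-enable field. */
	assert(reg_offset < (1u << DSB_BYTE_EN_SHIFT));

	if (dsb->free_pos + 2 > dsb->capacity)
		return false;

	dsb->cmd_buf[dsb->free_pos++] = val;
	dsb->cmd_buf[dsb->free_pos++] =
		((uint32_t)DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) |
		((uint32_t)DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
		reg_offset;
	return true;
}
```

Nothing reaches the hardware at queue time in this model; the writes only take effect once the buffer is handed to the engine, which matches the "program all the registers at the same time" description.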


[Intel-gfx] [PATCH] drm/i915: include GTT page-size info in error state

2019-09-09 Thread Matthew Auld
It might prove useful in the future to know if the vma is utilising
huge-GTT-pages. Related to this is the GTT cache, where there is some HW
"quirkiness" where it must be disabled if using 2M pages, so include
that for good measure.

Suggested-by: Chris Wilson 
Signed-off-by: Matthew Auld 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h |  1 -
 drivers/gpu/drm/i915/i915_gpu_error.c| 10 ++
 drivers/gpu/drm/i915/i915_gpu_error.h|  2 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 13b9dc0e1a89..a558edf15ec8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -160,7 +160,6 @@ struct drm_i915_gem_object {
struct sg_table *pages;
void *mapping;
 
-   /* TODO: whack some of this into the error state */
struct i915_page_sizes {
/**
 * The sg mask of the pages sg_table. i.e the mask of
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 3ccf7fd9307f..6384a06aa5bf 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -575,6 +575,9 @@ static void print_error_obj(struct drm_i915_error_state_buf 
*m,
   lower_32_bits(obj->gtt_offset));
}
 
+   if (obj->gtt_page_sizes > I915_GTT_PAGE_SIZE_4K)
+   err_printf(m, "gtt_page_sizes = 0x%08x\n", obj->gtt_page_sizes);
+
err_compression_marker(m);
for (page = 0; page < obj->page_count; page++) {
int i, len;
@@ -735,6 +738,9 @@ static void __err_print_to_sgl(struct 
drm_i915_error_state_buf *m,
if (IS_GEN(m->i915, 7))
err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
 
+   if (IS_GEN_RANGE(m->i915, 8, 11))
+   err_printf(m, "GTT_CACHE_EN: 0x%08x\n", error->gtt_cache);
+
for (ee = error->engine; ee; ee = ee->next)
error_print_engine(m, ee, error->epoch);
 
@@ -985,6 +991,7 @@ i915_error_object_create(struct drm_i915_private *i915,
 
dst->gtt_offset = vma->node.start;
dst->gtt_size = vma->node.size;
+   dst->gtt_page_sizes = vma->page_sizes.gtt;
dst->num_pages = num_pages;
dst->page_count = 0;
dst->unused = 0;
@@ -1554,6 +1561,9 @@ static void capture_reg_state(struct i915_gpu_state 
*error)
error->gac_eco = intel_uncore_read(uncore, GAC_ECO_BITS);
}
 
+   if (IS_GEN_RANGE(i915, 8, 11))
+   error->gtt_cache = intel_uncore_read(uncore, HSW_GTT_CACHE_EN);
+
/* 4: Everything else */
if (INTEL_GEN(i915) >= 11) {
error->ier = intel_uncore_read(uncore, GEN8_DE_MISC_IER);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h 
b/drivers/gpu/drm/i915/i915_gpu_error.h
index df9f57766626..63cf387411e0 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -74,6 +74,7 @@ struct i915_gpu_state {
u32 gam_ecochk;
u32 gab_ctl;
u32 gfx_mode;
+   u32 gtt_cache;
 
u32 nfence;
u64 fence[I915_MAX_NUM_FENCES];
@@ -127,6 +128,7 @@ struct i915_gpu_state {
struct drm_i915_error_object {
u64 gtt_offset;
u64 gtt_size;
+   u32 gtt_page_sizes;
int num_pages;
int page_count;
int unused;
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to 
extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66418/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6852 -> Patchwork_14322


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/

Known issues


  Here are the changes found in Patchwork_14322 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_switch@legacy-render:
- fi-bxt-dsi: [PASS][1] -> [INCOMPLETE][2] ([fdo#103927] / 
[fdo#111381])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-bxt-dsi/igt@gem_ctx_swi...@legacy-render.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-bxt-dsi/igt@gem_ctx_swi...@legacy-render.html

  * igt@i915_selftest@live_gem_contexts:
- fi-skl-guc: [PASS][3] -> [INCOMPLETE][4] ([fdo#111519])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-skl-guc/igt@i915_selftest@live_gem_contexts.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-skl-guc/igt@i915_selftest@live_gem_contexts.html

  * igt@prime_vgem@basic-fence-flip:
- fi-ilk-650: [PASS][5] -> [DMESG-WARN][6] ([fdo#106387]) +1 
similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-ilk-650/igt@prime_v...@basic-fence-flip.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-ilk-650/igt@prime_v...@basic-fence-flip.html

  
#### Possible fixes ####

  * igt@gem_ctx_create@basic-files:
- fi-icl-u2:  [INCOMPLETE][7] ([fdo#107713] / [fdo#109100]) -> 
[PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u2/igt@gem_ctx_cre...@basic-files.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u2/igt@gem_ctx_cre...@basic-files.html

  * igt@gem_exec_fence@nb-await-default:
- fi-icl-u3:  [DMESG-WARN][9] ([fdo#107724]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u3/igt@gem_exec_fe...@nb-await-default.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u3/igt@gem_exec_fe...@nb-await-default.html

  * igt@kms_frontbuffer_tracking@basic:
- fi-icl-u3:  [FAIL][11] ([fdo#103167]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u3/igt@kms_frontbuffer_track...@basic.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u3/igt@kms_frontbuffer_track...@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#106387]: https://bugs.freedesktop.org/show_bug.cgi?id=106387
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381
  [fdo#111519]: https://bugs.freedesktop.org/show_bug.cgi?id=111519


Participating hosts (52 -> 46)
--

  Additional (2): fi-skl-6770hq fi-skl-6700k2 
  Missing    (8): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks 
fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6852 -> Patchwork_14322

  CI-20190529: 20190529
  CI_DRM_6852: d45d78ff950be956657e1236785714509a7d43be @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5173: 3fb0f227d8856008f89a797879e27094745ce97e @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14322: 668a68776eddfd3b529fe98c2babe5d7ce2da381 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

668a68776edd drm/i915: add support for perf configuration queries
fc2e0e5e7126 drm/i915/perf: allow holding preemption on filtered ctx
8a629929451c drm/i915: add a new perf configuration execbuf parameter
647d2458b7c3 drm/i915/perf: execute OA configuration from command stream
0432b1e0d15d drm/i915: add wait flags to i915_active_request_retire
4ef9530de87e drm/i915/perf: implement active wait for noa configurations
da1c41cf2065 drm/i915/perf: allow for CS OA configs to be created lazily
8bb8be52ca97 drm/i915/perf: move perf types to their own header
8db92539084e drm/i915/perf: introduce a versioning of the i915-perf uapi
8aca4673ec28 drm/i915/perf: store the associated engine of a stream
66b65143aa4d drm/i915/perf: drop list of streams
503c88dc3bc0 drm/i915: add syncobj timeline support
66b565b57b3f drm/i915: introduce a mechanism to extend execbuf2

== Logs ==

For more details see: 

[Intel-gfx] [PATCH v16 04/13] drm/i915/perf: store the associated engine of a stream

2019-09-09 Thread Lionel Landwerlin
We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  | 5 +
 drivers/gpu/drm/i915/i915_perf.c | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 75607450ba00..274a1193d4f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1088,6 +1088,11 @@ struct i915_perf_stream {
 */
intel_wakeref_t wakeref;
 
+   /**
+* @engine: Engine associated with this performance stream.
+*/
+   struct intel_engine_cs *engine;
+
/**
 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 * properties given when opening a stream, representing the contents
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d18cd332afb7..9d5a3522aa35 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -363,6 +363,8 @@ struct perf_open_properties {
int oa_format;
bool oa_periodic;
int oa_period_exponent;
+
+   struct intel_engine_cs *engine;
 };
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
@@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream 
*stream,
 
format_size = dev_priv->perf.oa_formats[props->oa_format].size;
 
+   stream->engine = props->engine;
+
stream->sample_flags |= SAMPLE_OA_REPORT;
stream->sample_size += format_size;
 
@@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct 
drm_i915_private *dev_priv,
return -EINVAL;
}
 
+   /* At the moment we only support using i915-perf on the RCS. */
+   props->engine = dev_priv->engine[RCS0];
+
/* Considering that ID = 0 is reserved and assuming that we don't
 * (currently) expect any configurations to ever specify duplicate
 * values for a particular property ID then the last _PROP_MAX value is
-- 
2.23.0


[Intel-gfx] [PATCH v16 09/13] drm/i915: add wait flags to i915_active_request_retire

2019-09-09 Thread Lionel Landwerlin
An upcoming change needs to wait without being interrupted, so let
callers pass the wait flags to i915_active_request_retire().

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_active.c | 4 +++-
 drivers/gpu/drm/i915/i915_active.h | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c 
b/drivers/gpu/drm/i915/i915_active.c
index 6a447f1d0110..c808c28c9464 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref)
break;
}
 
-   err = i915_active_request_retire(>base, BKL(ref));
+   err = i915_active_request_retire(>base,
+I915_WAIT_INTERRUPTIBLE,
+BKL(ref));
if (err)
break;
}
diff --git a/drivers/gpu/drm/i915/i915_active.h 
b/drivers/gpu/drm/i915/i915_active.h
index f95058f99057..35a6089b44fd 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request 
*active)
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
+  unsigned int flags,
   struct mutex *mutex)
 {
struct i915_request *request;
@@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request 
*active,
if (!request)
return 0;
 
-   ret = i915_request_wait(request,
-   I915_WAIT_INTERRUPTIBLE,
-   MAX_SCHEDULE_TIMEOUT);
+   ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
if (ret < 0)
return ret;
 
-- 
2.23.0


[Intel-gfx] [PATCH v16 08/13] drm/i915/perf: implement active wait for noa configurations

2019-09-09 Thread Lionel Landwerlin
NOA configurations take some time to apply, and that time depends on
the size of the GT. There is no documented value for it. For example,
past experiments with powergating configuration changes seem to
indicate a 60~70us delay. We go with 500us as the default for now,
which should be over the required amount of time (according to HW
architects).
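
As a rough illustration of what a delay like this means for the command
streamer, the sketch below converts a delay into CS timestamp ticks and
checks that it fits the 32-bit timestamp comparison mentioned in the
debugfs hunk of this patch. It is only a sketch under assumptions: the
helper names are hypothetical, the delay is assumed to be in nanoseconds,
the frequency is assumed to be in kHz, and the 12 MHz figure in the usage
note is an arbitrary example value, not something stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a delay in nanoseconds (assumed unit) into
 * command streamer timestamp ticks, rounding up so the wait is never
 * shorter than requested. freq_khz is ticks per millisecond.
 */
static uint64_t sk_noa_delay_ticks(uint64_t delay_ns, uint32_t freq_khz)
{
	/* Guard against division by zero and 64-bit multiplication overflow. */
	assert(freq_khz != 0 && delay_ns <= (UINT64_MAX - 999999u) / freq_khz);

	/* ns * (ticks per ms) / (ns per ms) = ticks; 1 ms = 1,000,000 ns */
	return (delay_ns * freq_khz + 999999u) / 1000000u;
}

/* The CS compares 32-bit timestamps, so the tick count must fit in 32 bits. */
static int sk_noa_delay_fits(uint64_t delay_ns, uint32_t freq_khz)
{
	return sk_noa_delay_ticks(delay_ns, freq_khz) <= UINT32_MAX;
}
```

With the 500us default and an assumed 12000 kHz timestamp clock,
sk_noa_delay_ticks(500000, 12000) gives 6000 ticks, comfortably inside
the 32-bit range.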

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  24 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h |   5 +
 drivers/gpu/drm/i915/i915_debugfs.c  |  30 +++
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_perf.c | 233 ++-
 drivers/gpu/drm/i915/i915_perf_types.h   |   6 +
 drivers/gpu/drm/i915/i915_reg.h  |   4 +-
 7 files changed, 300 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index b6373fbc927d..fab318c71d24 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -160,6 +160,7 @@
 #define   MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */
 #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1)
 #define   MI_BATCH_RESOURCE_STREAMER (1<<10)
+#define   MI_BATCH_PREDICATE (1 << 15) /* HSW+ on RCS only*/
 
 /*
  * 3D instructions used by the kernel
@@ -238,6 +239,29 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH   (1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define MI_MATH(x) MI_INSTR(0x1a, (x) - 1)
+#define   MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2))
+/* operands */
+#define   MI_ALU_OP_NOOP     0
+#define   MI_ALU_OP_LOAD     128
+#define   MI_ALU_OP_LOADINV  1152
+#define   MI_ALU_OP_LOAD0    129
+#define   MI_ALU_OP_LOAD1    1153
+#define   MI_ALU_OP_ADD      256
+#define   MI_ALU_OP_SUB      257
+#define   MI_ALU_OP_AND      258
+#define   MI_ALU_OP_OR       259
+#define   MI_ALU_OP_XOR      260
+#define   MI_ALU_OP_STORE    384
+#define   MI_ALU_OP_STOREINV 1408
+/* sources */
+#define   MI_ALU_SRC_REG(x)  (x) /* 0 -> 15 */
+#define   MI_ALU_SRC_SRCA    32
+#define   MI_ALU_SRC_SRCB    33
+#define   MI_ALU_SRC_ACCU    49
+#define   MI_ALU_SRC_ZF      50
+#define   MI_ALU_SRC_CF      51
+
 /*
  * Commands used only by the command parser
  */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index dc295c196d11..f752b6cf9ea1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -97,6 +97,11 @@ enum intel_gt_scratch_field {
/* 8 bytes */
INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
 
+   /* 6 * 8 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048,
+
+   /* 4 bytes */
+   INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096,
 };
 
 #endif /* __INTEL_GT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 708855e051b5..b00b1a6f8d68 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3578,6 +3578,35 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
i915_wedged_get, i915_wedged_set,
"%llu\n");
 
+static int
+i915_perf_noa_delay_set(void *data, u64 val)
+{
+   struct drm_i915_private *i915 = data;
+
+   /* This would lead to infinite waits as we're doing timestamp
+* difference on the CS with only 32bits.
+*/
+   if (val > mul_u32_u32(U32_MAX, 
RUNTIME_INFO(i915)->cs_timestamp_frequency_khz))
+   return -EINVAL;
+
+   atomic64_set(&i915->perf.noa_programming_delay, val);
+   return 0;
+}
+
+static int
+i915_perf_noa_delay_get(void *data, u64 *val)
+{
+   struct drm_i915_private *i915 = data;
+
+   *val = atomic64_read(&i915->perf.noa_programming_delay);
+   return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops,
+   i915_perf_noa_delay_get,
+   i915_perf_noa_delay_set,
+   "%llu\n");
+
 #define DROP_UNBOUND   BIT(0)
 #define DROP_BOUND     BIT(1)
 #define DROP_RETIRE    BIT(2)
@@ -4354,6 +4383,7 @@ static const struct i915_debugfs_files {
const char *name;
const struct file_operations *fops;
 } i915_debugfs_files[] = {
+   {"i915_perf_noa_delay", &i915_perf_noa_delay_fops},
{"i915_wedged", &i915_wedged_fops},
{"i915_cache_sharing", &i915_cache_sharing_fops},
{"i915_gem_drop_caches", &i915_drop_caches_fops},
diff --git 

[Intel-gfx] [PATCH 2/2] drm/i915/selftests: Verify the LRC register layout between init and HW

2019-09-09 Thread Chris Wilson
Before we submit the first context to HW, we need to construct a valid
image of the register state. This layout is defined by the HW and should
match the layout generated by HW when it saves the context image.
Asserting that this should be equivalent should help avoid any undefined
behaviour and verify that we haven't missed anything important!

Of course, having insisted that the initial register state within the
LRC should match that returned by HW, we need to ensure that it does.

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 656 --
 drivers/gpu/drm/i915/gt/intel_lrc_reg.h   |  62 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 140 
 drivers/gpu/drm/i915/i915_perf.c  |  35 +-
 drivers/gpu/drm/i915/i915_perf.h  |   5 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 7 files changed, 638 insertions(+), 263 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f1c0e5d958f3..3eb3c4fab110 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1115,7 +1115,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq,
 
offset = i915_ggtt_offset(ce->state) +
 LRC_STATE_PN * PAGE_SIZE +
-(CTX_R_PWR_CLK_STATE + 1) * 4;
+CTX_R_PWR_CLK_STATE * 4;
 
*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
*cs++ = lower_32_bits(offset);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0ddfbebbcbbc..e369dba3c06a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -230,9 +230,9 @@ static int __execlists_context_alloc(struct intel_context 
*ce,
 struct intel_engine_cs *engine);
 
 static void execlists_init_reg_state(u32 *reg_state,
-struct intel_context *ce,
-struct intel_engine_cs *engine,
-struct intel_ring *ring);
+const struct intel_context *ce,
+const struct intel_engine_cs *engine,
+const struct intel_ring *ring);
 
 static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine)
 {
@@ -464,6 +464,411 @@ lrc_descriptor(struct intel_context *ce, struct 
intel_engine_cs *engine)
return desc;
 }
 
+static u32 *set_offsets(u32 *regs,
+   const u8 *data,
+   const struct intel_engine_cs *engine)
+#define NOP(x) (BIT(7) | (x))
+#define LRI(count, flags) ((flags) << 6 | (count))
+#define POSTED BIT(0)
+#define REG(x) (((x) >> 2) | BUILD_BUG_ON_ZERO(x >= 0x200))
+#define REG16(x) \
+   (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \
+   (((x) >> 2) & 0x7f)
+#define END() 0
+{
+   const u32 base = engine->mmio_base;
+
+   while (*data) {
+   u8 count, flags;
+
+   if (*data & BIT(7)) { /* skip */
+   regs += *data++ & ~BIT(7);
+   continue;
+   }
+
+   count = *data & 0x3f;
+   flags = *data >> 6;
+   data++;
+
+   *regs = MI_LOAD_REGISTER_IMM(count);
+   if (flags & POSTED)
+   *regs |= MI_LRI_FORCE_POSTED;
+   if (INTEL_GEN(engine->i915) >= 11)
+   *regs |= MI_LRI_CS_MMIO;
+   regs++;
+
+   GEM_BUG_ON(!count);
+   do {
+   u32 offset = 0;
+   u8 v;
+
+   do {
+   v = *data++;
+   offset <<= 7;
+   offset |= v & ~BIT(7);
+   } while (v & BIT(7));
+
+   *regs = base + (offset << 2);
+   regs += 2;
+   } while (--count);
+   }
+
+   return regs;
+}
+
+static const u8 gen8_xcs_offsets[] = {
+   NOP(1),
+   LRI(11, 0),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x11c),
+   REG(0x114),
+   REG(0x118),
+
+   NOP(9),
+   LRI(9, 0),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   NOP(13),
+   LRI(2, 0),
+   REG16(0x200),
+   REG(0x028),
+
+   END(),
+};
+
+static const u8 gen9_xcs_offsets[] = {
+   NOP(1),
+   LRI(14, POSTED),
+   REG16(0x244),
+   
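
The compact byte tables consumed by set_offsets() above can be
illustrated outside the kernel. The sketch below re-creates the encoding
macros under hypothetical SK_ names and decodes a table back into
absolute register addresses. It is a simplified model, not the patch's
code: it ignores the NOP skip counts and the LRI flags, and only collects
the register offsets, which are stored as 7-bit groups with BIT(7)
marking "more bytes follow".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_BIT7 0x80u

/* Encoders mirroring the patch's macros (SK_ names are hypothetical). */
#define SK_NOP(x)            (SK_BIT7 | (x))
#define SK_LRI(count, flags) ((flags) << 6 | (count))
#define SK_REG(x)            ((x) >> 2)
#define SK_REG16(x)          (((x) >> 9) | SK_BIT7), (((x) >> 2) & 0x7f)
#define SK_END()             0

/*
 * Decode a table into absolute register addresses (base + offset).
 * Returns the number of registers found; at most max are written to out.
 */
static size_t sk_decode_offsets(const uint8_t *data, uint32_t base,
				uint32_t *out, size_t max)
{
	size_t n = 0;

	while (*data) {
		uint8_t count;

		if (*data & SK_BIT7) {	/* NOP(x): dwords to skip, not needed here */
			data++;
			continue;
		}

		count = *data++ & 0x3f;	/* LRI header: low 6 bits hold the count */
		assert(count != 0);
		do {
			uint32_t offset = 0;
			uint8_t v;

			/* Accumulate 7 bits per byte until BIT(7) is clear. */
			do {
				v = *data++;
				offset = (offset << 7) | (v & 0x7f);
			} while (v & SK_BIT7);

			if (n < max)
				out[n] = base + (offset << 2);
			n++;
		} while (--count);
	}

	return n;
}
```

For instance, SK_REG16(0x244) encodes to the two bytes 0x81, 0x11, which
decode back to dword offset 0x91, i.e. byte offset 0x244 from the
engine's MMIO base. The base value used when trying this out is the
caller's choice.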

[Intel-gfx] [PATCH v16 05/13] drm/i915/perf: introduce a versioning of the i915-perf uapi

2019-09-09 Thread Lionel Landwerlin
Reporting this version will help applications figure out what level of
support the running kernel provides.
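
A minimal sketch of how userspace might consume this, assuming a wrapper
around the GETPARAM ioctl: since the getparam handler in this patch
returns -EINVAL for unknown parameters, a failed query can be read as
"kernel without perf versioning". The sk_ names, the function-pointer
wrapper, the two mock kernels, and the choice to report that case as
revision 0 are all assumptions made for illustration, not part of the
uAPI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_PARAM_PERF_REVISION 55 /* I915_PARAM_PERF_REVISION in this patch */

/* Hypothetical stand-in for the GETPARAM ioctl: 0 on success, -errno on failure. */
typedef int (*sk_getparam_fn)(int param, int *value);

/* Returns the perf revision, or 0 when the kernel does not know the parameter. */
static int sk_perf_revision(sk_getparam_fn getparam)
{
	int value = 0;

	assert(getparam != NULL);
	if (getparam(SK_PARAM_PERF_REVISION, &value) != 0)
		return 0;	/* parameter unknown: kernel predates the versioning */
	return value;
}

/* Mock of a kernel without this patch: unknown parameters get -EINVAL. */
static int sk_getparam_old(int param, int *value)
{
	(void)param;
	(void)value;
	return -EINVAL;
}

/* Mock of a kernel with this patch: i915_perf_ioctl_version() returns 1. */
static int sk_getparam_new(int param, int *value)
{
	if (param != SK_PARAM_PERF_REVISION)
		return -EINVAL;
	*value = 1;
	return 0;
}
```

An application could then gate optional features on the returned value,
e.g. only using a property documented as "available in perf revision N"
when sk_perf_revision() reports at least N.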

v2: Add i915_perf_ioctl_version() (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_getparam.c |  4 
 drivers/gpu/drm/i915/i915_perf.c | 10 ++
 drivers/gpu/drm/i915/i915_perf.h |  1 +
 include/uapi/drm/i915_drm.h  | 20 
 4 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index da6faa84e5b8..bd41cc5ce906 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,7 @@
 #include "gt/intel_engine_user.h"
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 
 int i915_getparam_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
@@ -157,6 +158,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_MMAP_GTT_COHERENT:
value = INTEL_INFO(i915)->has_coherent_ggtt;
break;
+   case I915_PARAM_PERF_REVISION:
+   value = i915_perf_ioctl_version();
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9d5a3522aa35..40a1ec2bc96b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3697,3 +3697,13 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 
dev_priv->perf.initialized = false;
 }
+
+/**
+ * i915_perf_ioctl_version - Version of the i915-perf subsystem
+ *
+ * This version number is used by userspace to detect available features.
+ */
+int i915_perf_ioctl_version(void)
+{
+   return 1;
+}
diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
index a412b16d9ffc..95549de65212 100644
--- a/drivers/gpu/drm/i915/i915_perf.h
+++ b/drivers/gpu/drm/i915/i915_perf.h
@@ -18,6 +18,7 @@ void i915_perf_init(struct drm_i915_private *i915);
 void i915_perf_fini(struct drm_i915_private *i915);
 void i915_perf_register(struct drm_i915_private *i915);
 void i915_perf_unregister(struct drm_i915_private *i915);
+int i915_perf_ioctl_version(void);
 
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3d031e81648b..e98c9a7baa91 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -618,6 +618,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 54
 
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION   55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1903,23 +1909,31 @@ enum drm_i915_perf_property_id {
 * Open the stream for a specific context handle (as used with
 * execbuffer2). A stream opened for a specific context this way
 * won't typically require root privileges.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
/**
 * A value of 1 requests the inclusion of raw OA unit reports as
 * part of stream samples.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_SAMPLE_OA,
 
/**
 * The value specifies which set of OA unit metrics should be
 * be configured, defining the contents of any OA unit reports.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_METRICS_SET,
 
/**
 * The value specifies the size and layout of OA unit reports.
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1929,6 +1943,8 @@ enum drm_i915_perf_property_id {
 * from this exponent as follows:
 *
 *   80ns * 2^(period_exponent + 1)
+*
+* This property is available in perf revision 1.
 */
DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1960,6 +1976,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE _IO('i', 0x0)
 
@@ -1967,6 +1985,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE _IO('i', 

[Intel-gfx] [PATCH v16 13/13] drm/i915: add support for perf configuration queries

2019-09-09 Thread Lionel Landwerlin
Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.
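
The register lists in this patch follow a "count first, then copy"
convention: a caller passing a register count of 0 only gets the count
written back, while a non-zero count must be large enough for every
(offset, value) pair. The sketch below models that convention on plain
memory, merging the patch's two helpers into one function. The sk_ names
are hypothetical, there is no access_ok()/__put_user() handling, and the
register offsets in the usage note are arbitrary example values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's (register offset, value) pair. */
struct sk_oa_reg {
	uint32_t addr;
	uint32_t value;
};

/*
 * Model of the copy step: with *user_n_regs == 0 only the count is
 * reported; otherwise each register becomes two consecutive u32s.
 */
static int sk_copy_regs_or_number(const struct sk_oa_reg *kernel_regs,
				  uint32_t kernel_n_regs,
				  uint32_t *user_buf,
				  uint32_t *user_n_regs)
{
	uint32_t r;

	if (*user_n_regs == 0) {
		*user_n_regs = kernel_n_regs;	/* first pass: size query */
		return 0;
	}

	if (*user_n_regs < kernel_n_regs)
		return -EINVAL;			/* caller's buffer is too small */

	assert(user_buf != NULL);
	*user_n_regs = kernel_n_regs;
	for (r = 0; r < kernel_n_regs; r++) {
		user_buf[2 * r] = kernel_regs[r].addr;
		user_buf[2 * r + 1] = kernel_regs[r].value;
	}

	return 0;
}
```

A caller would typically invoke this twice: once with a count of 0 to
learn how many registers there are, then again with a buffer of
2 * sizeof(u32) * count bytes to receive them.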

v2: Fix sparse warnings (Lionel)
Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
Fix unlocking when not locked issue (Lionel)
Add debug messages (Lionel)

v4: Fix missing unlock (Dan)

v5: Drop lock when copying config content to userspace (Chris)

v6: Drop lock when copying config list to userspace (Chris)
Fix deadlock when calling i915_perf_get_oa_config() under
perf.metrics_lock (Lionel)
Add i915_oa_config_get() (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/i915_perf.c  |   3 +
 drivers/gpu/drm/i915/i915_query.c | 282 ++
 include/uapi/drm/i915_drm.h   |  65 ++-
 4 files changed, 353 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2c6f37219dff..eab42269fc5b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1368,6 +1368,12 @@ struct drm_i915_private {
 */
struct idr metrics_idr;
 
+   /*
+* Number of dynamic configurations, you need to hold
+* dev_priv->perf.metrics_lock to access it.
+*/
+   u32 n_metrics;
+
/*
 * Lock associated with anything below within this structure
 * except exclusive_stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0ffcb8d16154..cf392e4d6870 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3917,6 +3917,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, 
void *data,
goto sysfs_err;
}
 
+   dev_priv->perf.n_metrics++;
+
mutex_unlock(&dev_priv->perf.metrics_lock);
 
DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
@@ -3977,6 +3979,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, 
void *data,
   &oa_config->sysfs_metric);
 
idr_remove(&dev_priv->perf.metrics_idr, *arg);
+   dev_priv->perf.n_metrics--;
 
mutex_unlock(&dev_priv->perf.metrics_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index abac5042da2b..e1f0c184a209 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -7,6 +7,7 @@
 #include 
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 #include "i915_query.h"
 #include 
 
@@ -140,10 +141,291 @@ query_engine_info(struct drm_i915_private *i915,
return len;
 }
 
+static int can_copy_perf_config_registers_or_number(u32 user_n_regs,
+   u64 user_regs_ptr,
+   u32 kernel_n_regs)
+{
+   /*
+* We'll just put the number of registers, and won't copy the
+* register.
+*/
+   if (user_n_regs == 0)
+   return 0;
+
+   if (user_n_regs < kernel_n_regs)
+   return -EINVAL;
+
+   if (!access_ok(u64_to_user_ptr(user_regs_ptr),
+  2 * sizeof(u32) * kernel_n_regs))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int copy_perf_config_registers_or_number(const struct i915_oa_reg 
*kernel_regs,
+   u32 kernel_n_regs,
+   u64 user_regs_ptr,
+   u32 *user_n_regs)
+{
+   u32 r;
+
+   if (*user_n_regs == 0) {
+   *user_n_regs = kernel_n_regs;
+   return 0;
+   }
+
+   *user_n_regs = kernel_n_regs;
+
+   for (r = 0; r < kernel_n_regs; r++) {
+   u32 __user *user_reg_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2);
+   u32 __user *user_val_ptr =
+   u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 +
+   sizeof(u32));
+   int ret;
+
+   ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr),
+user_reg_ptr);
+   if (ret)
+   return -EFAULT;
+
+   ret = __put_user(kernel_regs[r].value, user_val_ptr);
+   if (ret)
+   return -EFAULT;
+   }
+
+   return 0;
+}
+
+static int query_perf_config_data(struct drm_i915_private *i915,
+ struct drm_i915_query_item *query_item,
+ bool use_uuid)

[Intel-gfx] [PATCH v16 07/13] drm/i915/perf: allow for CS OA configs to be created lazily

2019-09-09 Thread Lionel Landwerlin
Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer is
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configurations without having to open/close the i915/perf
stream.
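
To give a feel for how large such a configuration buffer is, the sketch
below follows the packet layout used by write_cs_mi_lri() in this patch:
registers are grouped into MI_LOAD_REGISTER_IMM packets of at most
MI_LOAD_REGISTER_IMM_MAX_REGS (126) registers, each packet being one
header dword followed by an (offset, value) pair per register. The sk_
helper names are hypothetical, and the sketch only counts dwords; it
does not emit any commands.

```c
#include <assert.h>
#include <stdint.h>

#define SK_LRI_MAX_REGS 126u /* MI_LOAD_REGISTER_IMM_MAX_REGS in this patch */

/*
 * Dwords needed to program n_regs registers: one header per packet plus
 * two dwords (offset, value) per register.
 */
static uint32_t sk_num_lri_dwords(uint32_t n_regs)
{
	uint32_t n_packets = (n_regs + SK_LRI_MAX_REGS - 1) / SK_LRI_MAX_REGS;

	return n_packets + 2 * n_regs;
}

/* Number of registers carried by the packet starting at register index i. */
static uint32_t sk_lri_packet_len(uint32_t n_regs, uint32_t i)
{
	uint32_t left;

	/* Packets only start on multiples of the per-packet maximum. */
	assert(i < n_regs && i % SK_LRI_MAX_REGS == 0);

	left = n_regs - i;
	return left < SK_LRI_MAX_REGS ? left : SK_LRI_MAX_REGS;
}
```

For example, 127 registers need two packets (126 + 1 registers) and
2 + 2 * 127 = 256 dwords in total.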

v2: No need for locking on object OA config object creation (Chris)
Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
(Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h  |   4 +-
 drivers/gpu/drm/i915/i915_perf.c | 270 ---
 drivers/gpu/drm/i915/i915_perf.h |  26 ++
 drivers/gpu/drm/i915/i915_perf_types.h   |  15 +-
 5 files changed, 273 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index fbad403ab7ac..b6373fbc927d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -135,6 +135,7 @@
 /* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */
 #define   MI_LRI_CS_MMIO   (1<<19)
 #define   MI_LRI_FORCE_POSTED  (1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT (1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f4145ae6ab6e..7eb31923cde9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,8 +1363,8 @@ struct drm_i915_private {
struct mutex metrics_lock;
 
/*
-* List of dynamic configurations, you need to hold
-* dev_priv->perf.metrics_lock to access it.
+* List of dynamic configurations (struct i915_oa_config), you
+* need to hold dev_priv->perf.metrics_lock to access it.
 */
struct idr metrics_idr;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 40a1ec2bc96b..93a424c4a577 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -367,11 +367,19 @@ struct perf_open_properties {
struct intel_engine_cs *engine;
 };
 
+struct i915_oa_config_bo {
+   struct list_head link;
+
+   struct i915_oa_config *oa_config;
+   struct drm_i915_gem_object *bo;
+};
+
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-  struct i915_oa_config *oa_config)
+void i915_oa_config_release(struct kref *ref)
 {
+   struct i915_oa_config *oa_config = container_of(ref, 
typeof(*oa_config), ref);
+
if (!PTR_ERR(oa_config->flex_regs))
kfree(oa_config->flex_regs);
if (!PTR_ERR(oa_config->b_counter_regs))
@@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private 
*dev_priv,
kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
- struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 
n_regs)
 {
-   if (!atomic_dec_and_test(&oa_config->ref_count))
-   return;
+   u32 i;
+
+   for (i = 0; i < n_regs; i++) {
+   if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+   u32 n_lri = min(n_regs - i,
+   (u32)MI_LOAD_REGISTER_IMM_MAX_REGS);
 
-   free_oa_config(dev_priv, oa_config);
+   *cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+   }
+   *cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+   *cs++ = reg_data[i].value;
+   }
+
+   return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-int metrics_set,
-struct i915_oa_config **out_config)
+static struct i915_oa_config_bo *alloc_oa_config_buffer(struct 
drm_i915_private *i915,
+   struct i915_oa_config 
*oa_config)
 {
-   int ret;
+   struct i915_oa_config_bo *oa_bo;
+   size_t config_length = 0;
+   u32 *cs;
+   int err;
+
+   oa_bo = kzalloc(sizeof(*oa_bo), 

[Intel-gfx] [PATCH v16 10/13] drm/i915/perf: execute OA configuration from command stream

2019-09-09 Thread Lionel Landwerlin
We haven't run into issues with programming the global OA/NOA
register configuration from the CPU so far, but HW engineers actually
recommend doing this from the command streamer. On TGL in particular,
one of the clock domains in which some of that programming goes might
not be powered when we poke things from the CPU.

Since we have a command buffer prepared for the execbuffer side of
things, we can reuse that approach here too.

This also allows us to significantly reduce the amount of time we hold
the main lock.

v2: Drop the global lock as much as possible

v3: Take global lock to pin global

v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel)

v5: Move locking to the stream (Lionel)

v6: Move active reconfiguration request into i915_perf_stream (Lionel)

v7: Pin VMA outside request creation (Chris)
Lock VMA before move to active (Chris)

v8: Fix double free on stream->initial_oa_config_bo (Lionel)
Don't allow interruption when waiting on active config request
(Lionel)

v9: Don't ignore return value from i915_active_request_retire (Lionel)

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_perf.c   | 174 -
 drivers/gpu/drm/i915/i915_perf_types.h |  15 ++-
 2 files changed, 128 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d01494180465..929ab54ee371 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1557,18 +1557,23 @@ free_oa_configs(struct i915_perf_stream *stream)
 static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 {
struct drm_i915_private *dev_priv = stream->dev_priv;
+   int err;
 
BUG_ON(stream != dev_priv->perf.exclusive_stream);
 
-   /*
-* Unset exclusive_stream first, it will be checked while disabling
-* the metric set on gen8+.
-*/
mutex_lock(_priv->drm.struct_mutex);
-   dev_priv->perf.exclusive_stream = NULL;
+   mutex_lock(>config_mutex);
dev_priv->perf.ops.disable_metric_set(stream);
+   err = i915_active_request_retire(>active_config_rq, 0,
+>config_mutex);
+   mutex_unlock(>config_mutex);
+   dev_priv->perf.exclusive_stream = NULL;
mutex_unlock(_priv->drm.struct_mutex);
 
+   if (err)
+   DRM_ERROR("Failed to disable perf stream\n");
+
+
free_oa_buffer(stream);
free_noa_wait(stream);
 
@@ -1794,6 +1799,10 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
return PTR_ERR(bo);
}
 
+   ret = i915_mutex_lock_interruptible(>drm);
+   if (ret)
+   goto err_unref;
+
/*
 * We pin in GGTT because we jump into this buffer now because
 * multiple OA config BOs will have a jump to this address and it
@@ -1801,10 +1810,13 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
 */
vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0);
if (IS_ERR(vma)) {
+   mutex_unlock(>drm.struct_mutex);
ret = PTR_ERR(vma);
goto err_unref;
}
 
+   mutex_unlock(>drm.struct_mutex);
+
batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
if (IS_ERR(batch)) {
ret = PTR_ERR(batch);
@@ -1938,7 +1950,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
return 0;
 
 err_unpin:
-   __i915_vma_unpin(vma);
+   mutex_lock(>drm.struct_mutex);
+   i915_vma_unpin_and_release(, 0);
+   mutex_unlock(>drm.struct_mutex);
 
 err_unref:
i915_gem_object_put(bo);
@@ -1946,50 +1960,73 @@ static int alloc_noa_wait(struct i915_perf_stream 
*stream)
return ret;
 }
 
-static void config_oa_regs(struct drm_i915_private *dev_priv,
-  const struct i915_oa_reg *regs,
-  u32 n_regs)
+static int emit_oa_config(struct drm_i915_private *i915,
+ struct i915_perf_stream *stream)
 {
-   u32 i;
+   struct i915_request *rq;
+   struct i915_vma *vma;
+   u32 *cs;
+   int err;
 
-   for (i = 0; i < n_regs; i++) {
-   const struct i915_oa_reg *reg = regs + i;
+   lockdep_assert_held(>config_mutex);
+
+   vma = i915_vma_instance(stream->initial_oa_config_bo,
+   >engine->gt->ggtt->vm, NULL);
+   if (unlikely(IS_ERR(vma)))
+   return PTR_ERR(vma);
+
+   err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
+   if (err)
+   goto err_vma_unpin;
 
-   I915_WRITE(reg->addr, reg->value);
+   rq = i915_request_create(stream->engine->kernel_context);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto err_add_request;
}
-}
 
-static void delay_after_mux(void)
-{
-   /*
-* It apparently takes a fairly long time for a new MUX
-   

[Intel-gfx] [PATCH 1/2] drm/i915: Show the logical context ring state on dumping

2019-09-09 Thread Chris Wilson
Include the active context register state when dumping the engine.

Suggested-by: Mika Kuoppala 
Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a8014c59b388..3c176b0f4b45 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1404,6 +1404,11 @@ void intel_engine_dump(struct intel_engine_cs *engine,
   rq->timeline->hwsp_offset);
 
print_request_ring(m, rq);
+
+   if (rq->hw_context->lrc_reg_state) {
+   drm_printf(m, "Logical Ring Context:\n");
+   hexdump(m, rq->hw_context->lrc_reg_state, PAGE_SIZE);
+   }
}
spin_unlock_irqrestore(>active.lock, flags);
 
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH v16 01/13] drm/i915: introduce a mechanism to extend execbuf2

2019-09-09 Thread Lionel Landwerlin
We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.
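
To make the idea of chained extensions more concrete, the following is a
minimal userspace sketch of walking a linked list of extension nodes and
dispatching each one through a handler table. It is not the driver's code:
sketch_ext, sketch_ext_fn, sketch_walk_extensions and the depth bound of 512
are assumptions for this example, and a plain pointer stands in for the user
pointer the kernel would have to copy from.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chained extension header. */
struct sketch_ext {
	struct sketch_ext *next; /* next node, NULL terminates the chain */
	uint32_t name;           /* index into the handler table */
	uint32_t flags;          /* must be zero in this sketch */
};

typedef int (*sketch_ext_fn)(struct sketch_ext *ext, void *data);

static int sketch_walk_extensions(struct sketch_ext *ext,
				  const sketch_ext_fn *tbl, size_t count,
				  void *data)
{
	unsigned int budget = 512; /* arbitrary bound against cyclic chains */

	while (ext) {
		int err;

		if (!budget--)
			return -E2BIG;
		if (ext->flags)
			return -EINVAL;
		/* Unknown names are rejected rather than silently skipped. */
		if (ext->name >= count || !tbl[ext->name])
			return -EINVAL;

		err = tbl[ext->name](ext, data);
		if (err)
			return err;

		ext = ext->next;
	}

	return 0;
}

/* Example handler: counts how many nodes were visited. */
static int sketch_count_handler(struct sketch_ext *ext, void *data)
{
	(void)ext;
	++*(int *)data;
	return 0;
}
```

With an empty handler table, as in this first patch, every non-empty chain is
rejected, so the mechanism can land before any extension is defined.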

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson  (v1)
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 ++-
 include/uapi/drm/i915_drm.h   | 26 +++--
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 27dbcb508055..4f5fd946ab28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 
 enum {
FORCE_CPU_RELOC = 1,
@@ -272,6 +273,10 @@ struct i915_execbuffer {
 */
int lut_size;
struct hlist_head *buckets; /** ht for relocation handles */
+
+   struct {
+   u64 flags; /** Available extensions parameters */
+   } extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct 
drm_i915_gem_execbuffer2 *exec)
return false;
 
/* Kernel clipping was a DRI1 misfeature */
-   if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+   if (!(exec->flags & (I915_EXEC_FENCE_ARRAY |
+I915_EXEC_USE_EXTENSIONS))) {
if (exec->num_cliprects || exec->cliprects_ptr)
return false;
}
@@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb,
}
 }
 
+static const i915_user_extension_fn execbuf_extensions[] = {
+};
+
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+ struct i915_execbuffer *eb)
+{
+   eb->extensions.flags = 0;
+
+   if (!(args->flags & I915_EXEC_USE_EXTENSIONS))
+   return 0;
+
+   /* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot
+* have another flag also using it at the same time.
+*/
+   if (eb->args->flags & I915_EXEC_FENCE_ARRAY)
+   return -EINVAL;
+
+   if (args->num_cliprects != 0)
+   return -EINVAL;
+
+   return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr),
+   execbuf_extensions,
+   ARRAY_SIZE(execbuf_extensions),
+   eb);
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
   struct drm_file *file,
@@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (args->flags & I915_EXEC_IS_PINNED)
eb.batch_flags |= I915_DISPATCH_PINNED;
 
+   err = parse_execbuf2_extensions(args, );
+   if (err)
+   return err;
+
if (args->flags & I915_EXEC_FENCE_IN) {
in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 469dc512cca3..0a99c26730e1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence {
__u32 flags;
 };
 
+enum drm_i915_gem_execbuffer_ext {
+   DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */
+};
+
 struct drm_i915_gem_execbuffer2 {
/**
 * List of gem_exec_object2 structs
@@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 {
__u32 num_cliprects;
/**
 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-* is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
-* struct drm_i915_gem_exec_fence *fences.
+* & I915_EXEC_USE_EXTENSIONS are not set.
+*
+* If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+* of struct drm_i915_gem_exec_fence and num_cliprects is the length
+* of the array.
+*
+* If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
+* single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is
+* 0.
 */
__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK  (0x3f)
@@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT (1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_USE_EXTENSIONS implies that
+ * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to an linked
+ * list of i915_user_extension. Each i915_user_extension node is the base of a
+ * larger structure. The list of supported structures are listed in the
+ * drm_i915_gem_execbuffer_ext enum.
+ */
+#define 

[Intel-gfx] [PATCH v16 06/13] drm/i915/perf: move perf types to their own header

2019-09-09 Thread Lionel Landwerlin
Following a pattern used throughout the driver.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h| 300 +--
 drivers/gpu/drm/i915/i915_perf.h   |   2 +
 drivers/gpu/drm/i915/i915_perf_types.h | 327 +
 3 files changed, 330 insertions(+), 299 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 274a1193d4f0..f4145ae6ab6e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -92,6 +92,7 @@
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_gtt.h"
 #include "i915_gpu_error.h"
+#include "i915_perf_types.h"
 #include "i915_request.h"
 #include "i915_scheduler.h"
 #include "gt/intel_timeline.h"
@@ -979,305 +980,6 @@ struct intel_wm_config {
bool sprites_scaled;
 };
 
-struct i915_oa_format {
-   u32 format;
-   int size;
-};
-
-struct i915_oa_reg {
-   i915_reg_t addr;
-   u32 value;
-};
-
-struct i915_oa_config {
-   char uuid[UUID_STRING_LEN + 1];
-   int id;
-
-   const struct i915_oa_reg *mux_regs;
-   u32 mux_regs_len;
-   const struct i915_oa_reg *b_counter_regs;
-   u32 b_counter_regs_len;
-   const struct i915_oa_reg *flex_regs;
-   u32 flex_regs_len;
-
-   struct attribute_group sysfs_metric;
-   struct attribute *attrs[2];
-   struct device_attribute sysfs_metric_id;
-
-   atomic_t ref_count;
-};
-
-struct i915_perf_stream;
-
-/**
- * struct i915_perf_stream_ops - the OPs to support a specific stream type
- */
-struct i915_perf_stream_ops {
-   /**
-* @enable: Enables the collection of HW samples, either in response to
-* `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
-* without `I915_PERF_FLAG_DISABLED`.
-*/
-   void (*enable)(struct i915_perf_stream *stream);
-
-   /**
-* @disable: Disables the collection of HW samples, either in response
-* to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
-* the stream.
-*/
-   void (*disable)(struct i915_perf_stream *stream);
-
-   /**
-* @poll_wait: Call poll_wait, passing a wait queue that will be woken
-* once there is something ready to read() for the stream
-*/
-   void (*poll_wait)(struct i915_perf_stream *stream,
- struct file *file,
- poll_table *wait);
-
-   /**
-* @wait_unlocked: For handling a blocking read, wait until there is
-* something to ready to read() for the stream. E.g. wait on the same
-* wait queue that would be passed to poll_wait().
-*/
-   int (*wait_unlocked)(struct i915_perf_stream *stream);
-
-   /**
-* @read: Copy buffered metrics as records to userspace
-* **buf**: the userspace, destination buffer
-* **count**: the number of bytes to copy, requested by userspace
-* **offset**: zero at the start of the read, updated as the read
-* proceeds, it represents how many bytes have been copied so far and
-* the buffer offset for copying the next record.
-*
-* Copy as many buffered i915 perf samples and records for this stream
-* to userspace as will fit in the given buffer.
-*
-* Only write complete records; returning -%ENOSPC if there isn't room
-* for a complete record.
-*
-* Return any error condition that results in a short read such as
-* -%ENOSPC or -%EFAULT, even though these may be squashed before
-* returning to userspace.
-*/
-   int (*read)(struct i915_perf_stream *stream,
-   char __user *buf,
-   size_t count,
-   size_t *offset);
-
-   /**
-* @destroy: Cleanup any stream specific resources.
-*
-* The stream will always be disabled before this is called.
-*/
-   void (*destroy)(struct i915_perf_stream *stream);
-};
-
-/**
- * struct i915_perf_stream - state for a single open stream FD
- */
-struct i915_perf_stream {
-   /**
-* @dev_priv: i915 drm device
-*/
-   struct drm_i915_private *dev_priv;
-
-   /**
-* @wakeref: As we keep the device awake while the perf stream is
-* active, we track our runtime pm reference for later release.
-*/
-   intel_wakeref_t wakeref;
-
-   /**
-* @engine: Engine associated with this performance stream.
-*/
-   struct intel_engine_cs *engine;
-
-   /**
-* @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
-* properties given when opening a stream, representing the contents
-* of a single sample as read() by userspace.
-*/
-   u32 sample_flags;
-
-   /**
-* @sample_size: Considering the configured contents of 

[Intel-gfx] [PATCH v16 12/13] drm/i915/perf: allow holding preemption on filtered ctx

2019-09-09 Thread Lionel Landwerlin
We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of a command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.
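
The simplified scheme described above amounts to subtracting one counter
snapshot from another. A minimal sketch follows, assuming a made-up snapshot
layout of four 32-bit counters (sketch_snapshot and SKETCH_N_COUNTERS are
hypothetical; real OA report formats differ). Unsigned subtraction gives the
right delta even if a counter wrapped once between the two snapshots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_N_COUNTERS 4 /* hypothetical; real report formats differ */

struct sketch_snapshot {
	uint32_t counters[SKETCH_N_COUNTERS];
};

/* Compute per-counter deltas between two snapshots of the same query. */
static void sketch_snapshot_delta(const struct sketch_snapshot *begin,
				  const struct sketch_snapshot *end,
				  uint64_t *deltas)
{
	size_t i;

	assert(begin && end && deltas);

	for (i = 0; i < SKETCH_N_COUNTERS; i++) {
		/* Modulo-2^32 subtraction absorbs a single wrap-around. */
		deltas[i] = (uint32_t)(end->counters[i] - begin->counters[i]);
	}
}
```

This only holds if nothing unrelated ran between the two snapshots, which is
what holding preemption on the filtered context is meant to guarantee.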

Disabling preemption only applies to the single context for which we
want to query performance counters, and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.
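
The rules in the paragraph above can be sketched as a small validation
function. This is a simplified model, not the driver's logic: the names
sketch_open_props and sketch_check_open are hypothetical, the paranoid setting
and the capability check are reduced to two booleans, and the -EACCES return
value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of the open properties relevant to this check. */
struct sketch_open_props {
	bool single_context;
	bool hold_preemption;
};

static int sketch_check_open(const struct sketch_open_props *props,
			     bool paranoid, bool is_admin)
{
	assert(props);

	/* Holding preemption only makes sense for one filtered context. */
	if (props->hold_preemption && !props->single_context)
		return -EINVAL;

	/* Privileged unless the paranoid setting has been disabled. */
	if (props->hold_preemption && paranoid && !is_admin)
		return -EACCES;

	return 0;
}
```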

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

v4: Just modify the port priority reporting function

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +
 drivers/gpu/drm/i915/i915_perf.c  | 30 +--
 drivers/gpu/drm/i915/i915_perf_types.h|  8 +
 include/uapi/drm/i915_drm.h   | 11 +++
 4 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ccb5ab542427..230af0f0761a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb)
if (err)
goto out;
 
+   /*
+* If the perf stream was opened with hold preemption, flag the
+* request properly so that the priority of the request is bumped once
+* it reaches the execlist ports.
+*/
+   if (eb->i915->perf.exclusive_stream->hold_preemption)
+   eb->request->flags |= I915_REQUEST_NOPREEMPT;
+
/*
 * If the config hasn't changed, skip reconfiguring the HW (this is
 * subject to a delay we want to avoid has much as possible).
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 929ab54ee371..0ffcb8d16154 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -343,6 +343,8 @@ static const struct i915_oa_format 
gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * struct perf_open_properties - for validated properties given to open a 
stream
  * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
  * @single_context: Whether a single or all gpu contexts should be monitored
+ * @hold_preemption: Whether the preemption is disabled for the filtered
+ *   context
  * @ctx_handle: A gem ctx handle for use with @single_context
  * @metrics_set: An ID for an OA unit metric set advertised via sysfs
  * @oa_format: An OA unit HW report format
@@ -357,6 +359,7 @@ struct perf_open_properties {
u32 sample_flags;
 
u64 single_context:1;
+   u64 hold_preemption:1;
u64 ctx_handle;
 
/* OA sampling state */
@@ -2631,6 +2634,8 @@ static int i915_oa_stream_init(struct i915_perf_stream 
*stream,
if (WARN_ON(stream->oa_buffer.format_size == 0))
return -EINVAL;
 
+   stream->hold_preemption = props->hold_preemption;
+
stream->oa_buffer.format =
dev_priv->perf.oa_formats[props->oa_format].format;
 
@@ -3190,6 +3195,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
}
}
 
+   if (props->hold_preemption) {
+   if (!props->single_context) {
+   DRM_DEBUG("preemption disable with no context\n");
+   ret = -EINVAL;
+   goto err;
+   }
+   privileged_op = true;
+   }
+
/*
 * On Haswell the OA unit supports clock gating off for a specific
 * context and in this mode there's no visibility of metrics for the
@@ -3204,7 +3218,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 * enable the OA unit by default.
 */
-   if (IS_HASWELL(dev_priv) && specific_ctx)
+   if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption)
privileged_op = false;
 
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
@@ -3214,7 +3228,7 @@ 

[Intel-gfx] [PATCH v16 11/13] drm/i915: add a new perf configuration execbuf parameter

2019-09-09 Thread Lionel Landwerlin
We want the ability to dispatch a set of command buffers to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ
execbuf3?) to indicate which OA configuration should be used for a batch
buffer. This requires the process making the execbuf with this flag to
also own the perf fd at the time of execbuf.
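
A rough sketch of the checks this implies: the stream behind the supplied perf
fd must be the one currently owning the OA unit, it must target the same
engine as the batch, and the hardware is only reprogrammed when the requested
configuration differs from the active one. The types and names below
(sketch_stream, sketch_apply_config, integer engine and config ids) are
simplifications assumed for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified view of a perf stream. */
struct sketch_stream {
	int engine_id;
	int active_config;
	unsigned int n_reconfigs; /* how many times the HW was reprogrammed */
};

static int sketch_apply_config(struct sketch_stream *exclusive,
			       const struct sketch_stream *from_fd,
			       int engine_id, int config)
{
	/* The fd must refer to the stream that owns the OA unit. */
	if (!exclusive || from_fd != exclusive)
		return -EINVAL;

	/* The batch must run on the engine the stream was opened for. */
	if (exclusive->engine_id != engine_id)
		return -EINVAL;

	/* Unchanged configuration: skip the costly reprogramming. */
	if (exclusive->active_config == config)
		return 0;

	exclusive->active_config = config;
	exclusive->n_reconfigs++;
	return 0;
}
```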

v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris)
Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
Move OA config vma to active before writing the ringbuffer (Chris)

v4: Reuse i915_user_extension_fn
Serialize requests with OA config updates

v5: Check that the chained extension is only present once (Chris)
Unpin oa_vma in main path (Chris)

v6: Use BIT_ULL (Chris)

v7: Hold drm.struct_mutex when serializing the request with OA config (Chris)

v8: Remove active request from engine (Lionel)

v9: Move fetching OA configuration past engine pinning (Lionel)
Lock VMA before moving to active (Chris)

v10: Fix leak on perf_fd (Lionel)

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 147 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   4 +
 include/uapi/drm/i915_drm.h   |  39 +
 3 files changed, 188 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e488f22f53a4..ccb5ab542427 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
@@ -284,7 +285,12 @@ struct i915_execbuffer {
struct {
u64 flags; /** Available extensions parameters */
struct drm_i915_gem_execbuffer_ext_timeline_fences 
timeline_fences;
+   struct drm_i915_gem_execbuffer_ext_perf perf_config;
} extensions;
+
+   struct file *perf_file;
+   struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is 
not needed. */
+   struct i915_vma *oa_vma;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, 
struct i915_vma *vma)
return err;
 }
 
+
+static int
+eb_get_oa_config(struct i915_execbuffer *eb)
+{
+   struct drm_i915_gem_object *oa_bo;
+   int err = 0;
+
+   eb->perf_file = NULL;
+   eb->oa_config = NULL;
+   eb->oa_vma = NULL;
+
+   if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) 
== 0)
+   return 0;
+
+   eb->perf_file = fget(eb->extensions.perf_config.perf_fd);
+   if (!eb->perf_file)
+   return -EINVAL;
+
+   err = i915_mutex_lock_interruptible(>i915->drm);
+   if (err)
+   return err;
+
+   if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream)
+   err = -EINVAL;
+
+   mutex_unlock(>i915->drm.struct_mutex);
+
+   if (err)
+   return err;
+
+   if (eb->i915->perf.exclusive_stream->engine != eb->engine)
+   return -EINVAL;
+
+   err = i915_perf_get_oa_config_and_bo(
+   eb->i915->perf.exclusive_stream,
+   eb->extensions.perf_config.oa_config,
+   >oa_config, _bo);
+   if (err)
+   return err;
+
+   eb->oa_vma = i915_vma_instance(oa_bo,
+  >engine->gt->ggtt->vm, NULL);
+   i915_gem_object_put(oa_bo);
+   if (IS_ERR(eb->oa_vma)) {
+   err = PTR_ERR(eb->oa_vma);
+   eb->oa_vma = NULL;
+   return err;
+   }
+
+   return 0;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 struct i915_vma *vma,
 unsigned int len)
@@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file 
*file)
spin_unlock(_priv->mm.lock);
 }
 
+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+   struct i915_perf_stream *perf_stream;
+   int err;
+
+   if (!eb->oa_config)
+   return 0;
+
+   perf_stream = eb->perf_file->private_data;
+
+   err = mutex_lock_interruptible(_stream->config_mutex);
+   if (err)
+   return err;
+
+   err = i915_active_request_set(_stream->active_config_rq,
+ eb->request);
+   if (err)
+   goto out;
+
+   /*
+* If the config hasn't changed, skip reconfiguring the HW (this is
+* subject to a delay we want to avoid has much as possible).
+*/
+   if (eb->oa_config == perf_stream->oa_config)
+   goto out;
+
+   i915_vma_lock(eb->oa_vma);
+   

[Intel-gfx] [PATCH v16 00/13] drm/i915: Vulkan performance query support

2019-09-09 Thread Lionel Landwerlin
Hi all,

This is just a few compilation fixes only seen on CI.

Cheers,

Lionel Landwerlin (13):
  drm/i915: introduce a mechanism to extend execbuf2
  drm/i915: add syncobj timeline support
  drm/i915/perf: drop list of streams
  drm/i915/perf: store the associated engine of a stream
  drm/i915/perf: introduce a versioning of the i915-perf uapi
  drm/i915/perf: move perf types to their own header
  drm/i915/perf: allow for CS OA configs to be created lazily
  drm/i915/perf: implement active wait for noa configurations
  drm/i915: add wait flags to i915_active_request_retire
  drm/i915/perf: execute OA configuration from command stream
  drm/i915: add a new perf configuration execbuf parameter
  drm/i915/perf: allow holding preemption on filtered ctx
  drm/i915: add support for perf configuration queries

 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 501 ++--
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  25 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   5 +
 drivers/gpu/drm/i915/i915_active.c|   4 +-
 drivers/gpu/drm/i915/i915_active.h|   5 +-
 drivers/gpu/drm/i915/i915_debugfs.c   |  30 +
 drivers/gpu/drm/i915/i915_drv.c   |   3 +-
 drivers/gpu/drm/i915/i915_drv.h   | 313 +---
 drivers/gpu/drm/i915/i915_getparam.c  |   9 +
 drivers/gpu/drm/i915/i915_perf.c  | 719 +++---
 drivers/gpu/drm/i915/i915_perf.h  |  29 +
 drivers/gpu/drm/i915/i915_perf_types.h| 367 +
 drivers/gpu/drm/i915/i915_query.c | 282 +++
 drivers/gpu/drm/i915/i915_reg.h   |   4 +-
 include/uapi/drm/i915_drm.h   | 196 -
 15 files changed, 2012 insertions(+), 480 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h

--
2.23.0

[Intel-gfx] [PATCH v16 02/13] drm/i915: add syncobj timeline support

2019-09-09 Thread Lionel Landwerlin
Introduce new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.
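
Since the handles and the timeline points arrive as two parallel user arrays
sharing one count, that count has to be checked against multiplication overflow
before any size is computed from it. A minimal userspace sketch of such a
check follows; sketch_count_fits and its parameters are hypothetical names,
and the element sizes are passed in rather than taken from the real
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if count elements of either size can be addressed without
 * the byte length overflowing size_t.
 */
static bool sketch_count_fits(uint64_t count, size_t user_elem,
			      size_t kernel_elem)
{
	size_t biggest = user_elem > kernel_elem ? user_elem : kernel_elem;

	assert(biggest > 0);

	/* Dividing the limit avoids performing the risky multiplication. */
	return count <= SIZE_MAX / biggest;
}
```

Comparing against SIZE_MAX divided by the largest element size rejects an
oversized count before it can wrap into a small allocation.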

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)

Signed-off-by: Lionel Landwerlin 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 307 ++
 drivers/gpu/drm/i915/i915_drv.c   |   3 +-
 drivers/gpu/drm/i915/i915_getparam.c  |   1 +
 include/uapi/drm/i915_drm.h   |  39 +++
 4 files changed, 293 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4f5fd946ab28..e488f22f53a4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -214,6 +214,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
+struct i915_eb_fences {
+   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+   struct dma_fence *dma_fence;
+   u64 value;
+   struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -276,6 +283,7 @@ struct i915_execbuffer {
 
struct {
u64 flags; /** Available extensions parameters */
+   struct drm_i915_gem_execbuffer_ext_timeline_fences 
timeline_fences;
} extensions;
 };
 
@@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb,
 }
 
 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_eb_fences *fences, unsigned int n)
 {
-   while (n--)
-   drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+   while (n--) {
+   drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+   dma_fence_put(fences[n].dma_fence);
+   kfree(fences[n].chain_fence);
+   }
kvfree(fences);
 }
 
-static struct drm_syncobj **
-get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-   struct drm_file *file)
+static struct i915_eb_fences *
+get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences)
+{
+   struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences =
+   >extensions.timeline_fences;
+   struct drm_i915_gem_exec_fence __user *user_fences;
+   struct i915_eb_fences *fences;
+   u64 __user *user_values;
+   u64 num_fences, num_user_fences = timeline_fences->fence_count;
+   unsigned long n;
+   int err;
+
+   /* Check multiplication overflow for access_ok() and kvmalloc_array() */
+   BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+   if (num_user_fences > min_t(unsigned long,
+   ULONG_MAX / sizeof(*user_fences),
+   SIZE_MAX / sizeof(*fences)))
+   return ERR_PTR(-EINVAL);
+
+   user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+   if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences)))
+   return ERR_PTR(-EFAULT);
+
+   user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+   if (!access_ok(user_values, num_user_fences * sizeof(*user_values)))
+   return ERR_PTR(-EFAULT);
+
+   fences = kvmalloc_array(num_user_fences, sizeof(*fences),
+   __GFP_NOWARN | GFP_KERNEL);
+   if (!fences)
+   return ERR_PTR(-ENOMEM);
+
+   BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+   for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) {
+   struct drm_i915_gem_exec_fence user_fence;
+   struct drm_syncobj *syncobj;
+   struct dma_fence *fence = NULL;
+   u64 point;
+
+   if (__copy_from_user(_fence, user_fences++, 
sizeof(user_fence))) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+   err = -EINVAL;
+   goto err;
+   }
+
+   if (__get_user(point, user_values++)) {
+   err = -EFAULT;
+   goto err;
+   }
+
+   syncobj 

[Intel-gfx] [PATCH v16 03/13] drm/i915/perf: drop list of streams

2019-09-09 Thread Lionel Landwerlin
At some point in time there was the idea that we could have multiple
streams from the same piece of HW, but that never materialized. Given
the hard time we already have making everything work with the
submission side, there is no real point in keeping this list of 1
element around.

Signed-off-by: Lionel Landwerlin 
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_perf.c | 16 +---
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 */
struct drm_i915_private *dev_priv;
 
-   /**
-* @link: Links the stream into ``_i915_private->streams``
-*/
-   struct list_head link;
-
/**
 * @wakeref: As we keep the device awake while the perf stream is
 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 * except exclusive_stream.
 */
struct mutex lock;
-   struct list_head streams;
 
/*
 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream 
*stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream 
*stream)
 */
memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-   /*
-* Maybe make ->pollin per-stream state if we support multiple
-* concurrent streams in the future.
-*/
stream->pollin = false;
 }
 
@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct 
i915_perf_stream *stream)
if (stream->ops->destroy)
stream->ops->destroy(stream);
 
-   list_del(>link);
-
if (stream->ctx)
i915_gem_context_put(stream->ctx);
 
@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
goto err_flags;
}
 
-   list_add(>link, _priv->perf.streams);
-
if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
f_flags |= O_CLOEXEC;
if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
stream_fd = anon_inode_getfd("[i915_perf]", , stream, f_flags);
if (stream_fd < 0) {
ret = stream_fd;
-   goto err_open;
+   goto err_flags;
}
 
if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private 
*dev_priv,
 
return stream_fd;
 
-err_open:
-   list_del(>link);
 err_flags:
if (stream->ops->destroy)
stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
}
 
if (dev_priv->perf.ops.enable_metric_set) {
-   INIT_LIST_HEAD(_priv->perf.streams);
mutex_init(_priv->perf.lock);
 
oa_sample_rate_hard_limit = 1000 *
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915/tgl: Implement Wa_1409142259

2019-09-09 Thread Daniele Ceraolo Spurio



On 9/6/19 6:10 PM, Matt Roper wrote:

On Fri, Sep 06, 2019 at 03:46:42PM -0700, Daniele Ceraolo Spurio wrote:



On 9/6/19 3:41 PM, Radhakrishna Sripada wrote:

Disable CPS aware color pipe by setting chicken bit.

BSpec: 52890
HSDES: 1409142259

Cc: Stuart Summers 
Cc: Matt Roper 
Signed-off-by: Radhakrishna Sripada 
---
   drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +
   drivers/gpu/drm/i915/i915_reg.h | 1 +
   2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 243d3f77be13..14e3f9677b06 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -894,6 +894,11 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
   static void
   tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
   {
+   wa_init_mcr(i915, wal);


this is not part of the WA you're trying to apply, right?


+
+   /* Wa_1409142259 */
+   WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3,
+ GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);


AFAICS the register is part of the render context, so shouldn't we set this
as part of the ctx_workarounds? that's what we do for another WA on the same
register on ICL.


How do you usually determine if a register is part of the context or
not?  This one doesn't have the "This Register is saved and restored as
part of Context" notation that other context registers have, so is there
somewhere else we're supposed to find that information?



Most of the context registers are not tagged that way. The golden 
reference for what's in the context is the context image page (Bspec 
46255 for TGL).
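
For readers following the quoted hunk: WA_SET_BIT_MASKED targets a "masked" register, where the upper 16 bits of the written value select which of the lower 16 bits the hardware updates. Below is a minimal, self-contained sketch of that convention. The helper names (masked_bit_enable, masked_bit_disable, apply_masked_write) are made up for illustration and are not the i915 macros; only the bit value (1 << 9) comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit value taken from the quoted i915_reg.h hunk. */
#define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE (1u << 9)

/*
 * Hypothetical helper: build the 32-bit value for a masked write that
 * sets `bits`. Upper 16 bits = write-enable mask, lower 16 bits = data.
 */
static uint32_t masked_bit_enable(uint32_t bits)
{
	assert((bits & 0xffff0000u) == 0); /* only the low 16 bits are data */
	return (bits << 16) | bits;
}

/* Hypothetical helper: masked write that clears `bits`. */
static uint32_t masked_bit_disable(uint32_t bits)
{
	assert((bits & 0xffff0000u) == 0);
	return bits << 16;
}

/*
 * Toy model of the hardware side: only bits selected by the mask are
 * changed, every other bit keeps its previous value.
 */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (reg & ~mask & 0xffffu) | (value & mask);
}
```

In this model, writing masked_bit_enable(GEN12_DISABLE_CPS_AWARE_COLOR_PIPE) to a register that already has bit 11 set leaves bit 11 alone and only turns on bit 9, which is why no read-modify-write is needed for such registers.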


Daniele



Matt



Daniele


   }
   static void
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 006cffd56be2..53e07882efb7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7668,6 +7668,7 @@ enum {
   #define GEN11_COMMON_SLICE_CHICKEN3  _MMIO(0x7304)
 #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC (1 << 11)
+  #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE   (1 << 9)
   #define HIZ_CHICKEN  _MMIO(0x7018)
   # define CHV_HZ_8X8_MODE_IN_1X   (1 << 15)



[Intel-gfx] [PATCH i-g-t] tests/kms_rotation_crc: Switch to one-shot CRC collection

2019-09-09 Thread Matt Roper
kms_rotation_crc manually starts and stops CRC collection and reads
single CRC values when it needs them.  Depending on how long the other
test setup and execution operations take, the CRC buffer (128 entries)
can fill up with CRC values that the test never reads or uses.  Our CI system
has stumbled over several cases where the buffer fills up and overflows
due to this.  Let's switch this test over to the
igt_pipe_crc_collect_crc API which will handle the start+stop of CRC
collection when a single CRC is needed so that we won't collect a bunch
of unwanted CRC values and run the risk of overflow.
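
As a rough illustration of the failure mode described above, here is a toy model in plain C. It does not use the IGT API; struct crc_source and the simulate_* helpers are hypothetical names, and the only number taken from this message is the 128-entry buffer size. The model assumes one CRC is produced per frame while collection is running.

```c
#include <assert.h>
#include <stdbool.h>

#define CRC_BUFFER_ENTRIES 128 /* buffer size quoted in the commit message */

/* Hypothetical stand-in for the kernel-side CRC buffer. */
struct crc_source {
	bool running;        /* is collection currently enabled? */
	unsigned int queued; /* entries produced but not yet read */
	bool overflowed;     /* set once an entry had to be dropped */
};

/* One frame elapses: a running source queues one CRC, or overflows. */
static void crc_source_frame(struct crc_source *src)
{
	if (!src->running)
		return;
	if (src->queued == CRC_BUFFER_ENTRIES) {
		src->overflowed = true;
		return;
	}
	src->queued++;
}

static void crc_start(struct crc_source *src)
{
	src->running = true;
	src->queued = 0;
}

static void crc_stop(struct crc_source *src)
{
	src->running = false;
	src->queued = 0;
}

/* Old pattern: collection stays enabled while the test does its setup. */
static bool simulate_continuous(unsigned int setup_frames)
{
	struct crc_source src = { 0 };
	unsigned int i;

	crc_start(&src);
	for (i = 0; i < setup_frames; i++)
		crc_source_frame(&src);
	crc_stop(&src);
	return src.overflowed;
}

/* One-shot pattern: start, take a single CRC, stop; setup runs stopped. */
static bool simulate_one_shot(unsigned int setup_frames)
{
	struct crc_source src = { 0 };
	unsigned int i;

	for (i = 0; i < setup_frames; i++)
		crc_source_frame(&src); /* nothing accumulates while stopped */
	crc_start(&src);
	crc_source_frame(&src);
	assert(src.queued == 1); /* exactly the one CRC we asked for */
	crc_stop(&src);
	return src.overflowed;
}
```

In this model, a setup phase longer than 128 frames overflows the buffer in the continuous case, while the one-shot case never holds more than one entry regardless of how long setup takes.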

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105127
Signed-off-by: Matt Roper 
---
 tests/kms_rotation_crc.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/tests/kms_rotation_crc.c b/tests/kms_rotation_crc.c
index 668c1732..8f36fd2f 100644
--- a/tests/kms_rotation_crc.c
+++ b/tests/kms_rotation_crc.c
@@ -167,7 +167,7 @@ static void cleanup_crtc(data_t *data)
 }
 
 static void prepare_crtc(data_t *data, igt_output_t *output, enum pipe pipe,
-igt_plane_t *plane, bool start_crc)
+igt_plane_t *plane)
 {
igt_display_t *display = &data->display;
 
@@ -181,9 +181,6 @@ static void prepare_crtc(data_t *data, igt_output_t *output, enum pipe pipe,
 
igt_display_commit2(display, COMMIT_ATOMIC);
data->pipe_crc = igt_pipe_crc_new(data->gfx_fd, pipe, INTEL_PIPE_CRC_SOURCE_AUTO);
-
-   if (start_crc)
-   igt_pipe_crc_start(data->pipe_crc);
 }
 
 enum rectangle_type {
@@ -263,7 +260,7 @@ static void prepare_fbs(data_t *data, igt_output_t *output,
igt_plane_set_position(plane, data->pos_x, data->pos_y);
igt_display_commit2(display, COMMIT_ATOMIC);
 
-   igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &data->flip_crc);
+   igt_pipe_crc_collect_crc(data->pipe_crc, &data->flip_crc);
 
/*
  * Prepare the non-rotated flip fb.
@@ -286,7 +283,7 @@ static void prepare_fbs(data_t *data, igt_output_t *output,
igt_plane_set_position(plane, data->pos_x, data->pos_y);
igt_display_commit2(display, COMMIT_ATOMIC);
 
-   igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &data->ref_crc);
+   igt_pipe_crc_collect_crc(data->pipe_crc, &data->ref_crc);
 
/*
 * Prepare the non-rotated reference fb.
@@ -336,7 +333,7 @@ static void test_single_case(data_t *data, enum pipe pipe,
igt_assert_eq(ret, 0);
 
/* Check CRC */
-   igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &crc_output);
+   igt_pipe_crc_collect_crc(data->pipe_crc, &crc_output);
igt_assert_crc_equal(&data->ref_crc, &crc_output);
 
/*
@@ -359,7 +356,7 @@ static void test_single_case(data_t *data, enum pipe pipe,
igt_assert_eq(ret, 0);
}
kmstest_wait_for_pageflip(data->gfx_fd);
-   igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &crc_output);
+   igt_pipe_crc_collect_crc(data->pipe_crc, &crc_output);
igt_assert_crc_equal(&data->flip_crc,
 &crc_output);
}
@@ -388,7 +385,7 @@ static void test_plane_rotation(data_t *data, int plane_type, bool test_bad_form
plane = igt_output_get_plane_type(output, plane_type);
igt_require(igt_plane_has_prop(plane, IGT_PLANE_ROTATION));
 
-   prepare_crtc(data, output, pipe, plane, true);
+   prepare_crtc(data, output, pipe, plane);
 
for (i = 0; i < num_rectangle_types; i++) {
/* Unsupported on i915 */
@@ -416,7 +413,6 @@ static void test_plane_rotation(data_t *data, int plane_type, bool test_bad_form
 data->override_fmt, test_bad_format);
}
}
-   igt_pipe_crc_stop(data->pipe_crc);
}
 }
 
@@ -473,7 +469,7 @@ static bool get_multiplane_crc(data_t *data, igt_output_t *output,
ret = igt_display_try_commit2(display, COMMIT_ATOMIC);
igt_assert_eq(ret, 0);
 
-   igt_pipe_crc_get_current(data->gfx_fd, data->pipe_crc, crc_output);
+   igt_pipe_crc_collect_crc(data->pipe_crc, crc_output);
 
for (c = 0; c < numplanes && oldplanes; c++)
igt_remove_fb(data->gfx_fd, &oldplanes[c].fb);
@@ -564,7 +560,6 @@ static void test_multi_plane_rotation(data_t *data, enum pipe pipe)
 
data->pipe_crc = igt_pipe_crc_new(data->gfx_fd, pipe,
  INTEL_PIPE_CRC_SOURCE_AUTO);
-   igt_pipe_crc_start(data->pipe_crc);
 
for (i = 0; i < ARRAY_SIZE(planeconfigs); i++) {
p[0].planetype = DRM_PLANE_TYPE_PRIMARY;
@@ -620,7 +615,6 @@ static void test_multi_plane_rotation(data_t *data, enum pipe pipe)
}
}
  

Re: [Intel-gfx] [PULL] gvt-next-fixes

2019-09-09 Thread Rodrigo Vivi
Hi guys,

On Fri, Sep 06, 2019 at 01:42:55PM +0800, Zhenyu Wang wrote:
> 
> Hi,
> 
> Here's gvt-next-fixes with two recent fixes, one for recent
> guest hang regression and another for guest reset fix.
> 
> Thanks.
> --
> The following changes since commit c36beba6b296b3c05a0f29753b04775e5ae23886:
> 
>   drm/i915: Seal races between async GPU cancellation, retirement and 
> signaling (2019-05-13 13:53:35 +0300)
> 
> are available in the Git repository at:
> 
>   https://github.com/intel/gvt-linux.git tags/gvt-next-fixes-2019-09-06
> 
> for you to fetch changes up to 4a5322560aa235efa84c0aa34c00e5749a0792fd:
> 
>   drm/i915/gvt: update RING_START reg of vGPU when the context is submitted 
> to i915 (2019-09-06 13:39:09 +0800)


$ dim pull-request-next-fixes
Using drm/drm-next as the upstream
dim: 4a5322560aa2 ("drm/i915/gvt: update RING_START reg of vGPU when the 
context is submitted to i915"): Link tag missing.
dim: 0a3242bdb477 ("drm/i915/gvt: update vgpu workload head pointer 
correctly"): Link tag missing.
dim: ERROR: issues in commits detected, aborting

I wonder how I should proceed here. In the past I was always bypassing dim,
but now that drm maintainers also use dim I'm sure this will blow up
there anyways.

But gvt patches are not tracked on our CI individually hence they don't
have Links.
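
To make the failing check concrete, here is a minimal sketch of what "Link tag missing" amounts to: the commit message has no line starting with "Link:". This is only an illustration under that assumption; dim itself is a shell script and its actual matching rules may differ. The function name has_link_tag is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: return true if any line of the commit message
 * begins with "Link:". A mention of "Link:" in the middle of a line
 * does not count as a tag.
 */
static bool has_link_tag(const char *msg)
{
	static const char tag[] = "Link:";
	const char *line = msg;

	assert(msg != NULL);

	while (*line) {
		if (strncmp(line, tag, sizeof(tag) - 1) == 0)
			return true;
		line = strchr(line, '\n'); /* jump to the end of this line */
		if (!line)
			break;
		line++; /* step past the newline to the next line */
	}
	return false;
}
```

Under this reading, the two gvt commits listed above fail simply because nothing in the gvt flow appends such a line, not because of a problem in the patches themselves.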

Jani, Joonas, how are you guys handling this?

Daniel, Dave, ideas?

Thanks,
Rodrigo.

> 
> 
> gvt-next-fixes-2019-09-06
> 
> - Fix guest context head pointer update for hang (Xiaolin)
> - Fix guest context ring state for reset (Weinan)
> 
> 
> Weinan Li (1):
>   drm/i915/gvt: update RING_START reg of vGPU when the context is 
> submitted to i915
> 
> Xiaolin Zhang (1):
>   drm/i915/gvt: update vgpu workload head pointer correctly
> 
>  drivers/gpu/drm/i915/gvt/scheduler.c | 45 
> +---
>  1 file changed, 32 insertions(+), 13 deletions(-)
> 
> 
> -- 
> Open Source Technology Center, Intel ltd.
> 
> $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827



[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/ringbuffer: Flush writes before RING_TAIL update

2019-09-09 Thread Patchwork
== Series Details ==

Series: drm/i915/ringbuffer: Flush writes before RING_TAIL update
URL   : https://patchwork.freedesktop.org/series/66426/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14327


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_14327:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_sync@basic-each:
- {fi-tgl-u}: NOTRUN -> [INCOMPLETE][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-tgl-u/igt@gem_s...@basic-each.html

  
Known issues


  Here are the changes found in Patchwork_14327 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live_gem_contexts:
- fi-skl-iommu:   [PASS][2] -> [INCOMPLETE][3] ([fdo#111519])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-skl-iommu/igt@i915_selftest@live_gem_contexts.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-skl-iommu/igt@i915_selftest@live_gem_contexts.html

  
 Possible fixes 

  * igt@gem_ctx_switch@legacy-render:
- fi-icl-u2:  [INCOMPLETE][4] ([fdo#107713] / [fdo#111381]) -> 
[PASS][5]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  * igt@gem_exec_gttfill@basic:
- {fi-tgl-u}: [INCOMPLETE][6] ([fdo#111593]) -> [PASS][7]
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-tgl-u/igt@gem_exec_gttf...@basic.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-tgl-u/igt@gem_exec_gttf...@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381
  [fdo#111519]: https://bugs.freedesktop.org/show_bug.cgi?id=111519
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593


Participating hosts (51 -> 47)
--

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3 
  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y 
fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14327

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14327: cda3b809297cb7c5b44e6c9abe22cc4b7516a98d @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

cda3b809297c drm/i915/ringbuffer: Flush writes before RING_TAIL update

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/

[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [1/6] drm/i915/selftests: Take runtime wakeref for 
igt_ggtt_lowlevel
URL   : https://patchwork.freedesktop.org/series/66425/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14326


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_14326 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14326, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_14326:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_gttfill@basic:
- fi-apl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-apl-guc/igt@gem_exec_gttf...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/fi-apl-guc/igt@gem_exec_gttf...@basic.html

  
Known issues


  Here are the changes found in Patchwork_14326 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@gem_ctx_switch@legacy-render:
- fi-icl-u2:  [INCOMPLETE][3] ([fdo#107713] / [fdo#111381]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381


Participating hosts (51 -> 46)
--

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3 
  Missing(8): fi-ilk-m540 fi-tgl-u fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14326

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14326: c0df1c601b3cffed51bfebf09ddeeea08ff26fb2 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c0df1c601b3c iommu/intel: Ignore igfx_off
0462ef926b65 iommu/intel: Declare Broadwell igfx dmar support snafu
186afc9aaa54 drm/i915: Force compilation with intel-iommu for CI validation
6be90a9b332d drm/i915: Perform GGTT restore much earlier during resume
e54488ac4cd7 drm/i915/selftests: Tighten the timeout testing for partial mmaps
7386b0ebc23c drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/

[Intel-gfx] USB-C

2019-09-09 Thread nnet
Hello,

5.2.13 is still working fine (great) with a Dell U4919DW connected via USB-C 
from an X1 Carbon Gen 6.

5.3-rc8 so far is not (blank screen) and errors:

https://pastebin.com/tXFi6AfK

Seems there has been some refactoring for just this kind of connection in 5.3?

Is there perhaps an issue for this scenario, or are other components 
(ACPI / mutter) at fault?

Thanks!

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for adding gamma state checker for CHV and i965 (rev2)

2019-09-09 Thread Patchwork
== Series Details ==

Series: adding gamma state checker for CHV and i965 (rev2)
URL   : https://patchwork.freedesktop.org/series/66297/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d02e018c92fe drm/i915/display: Add gamma precision function for CHV
73751624de48 drm/i915/display: Extract i965_read_luts()
-:22: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#22: 
 -Renamed i965_read_gamma_lut_10p6() to i965_read_lut_10p6() [Ville, Uma]

total: 0 errors, 1 warnings, 0 checks, 78 lines checked
1eeae34e288a drm/i915/display: Extract chv_read_luts()
-:57: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#57: FILE: drivers/gpu/drm/i915/display/intel_color.c:1642:
+   blob_data[i].green = intel_color_lut_pack(REG_FIELD_GET(

-:59: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#59: FILE: drivers/gpu/drm/i915/display/intel_color.c:1644:
+   blob_data[i].blue = intel_color_lut_pack(REG_FIELD_GET(

-:63: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#63: FILE: drivers/gpu/drm/i915/display/intel_color.c:1648:
+   blob_data[i].red = intel_color_lut_pack(REG_FIELD_GET(

total: 0 errors, 0 warnings, 3 checks, 64 lines checked


Re: [Intel-gfx] USB-C

2019-09-09 Thread nnet
Searching on the kernel warnings and errors didn't bring it up, but in browsing 
bugzilla I stumbled on this.

Bug 111501 - [CFL] Kernel 5.3.0-rc6: i915 fails at typec_displayport 5120x1440 
https://bugs.freedesktop.org/show_bug.cgi?id=111501

It's the same monitor and connection type.

Is the related patchset intended for 5.3 then? 
https://patchwork.freedesktop.org/series/66286/

Thanks

On Mon, Sep 9, 2019, at 10:06 AM, nnet wrote:
> Hello,
> 
> 5.2.13 is working fine (great) still with a Dell U4919DW connected via 
> USB-C from a X1 Carbon Gen 6.
> 
> 5.3-rc8 so far is not (blank screen) and errors:
> 
> https://pastebin.com/tXFi6AfK
> 
> Seems there has been some refactoring for just this kind of connection in 5.3?
> 
> Is there perhaps an issue for this scenario, or are other 
> components (ACPI / mutter) at fault?
> 
> Thanks!
>

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: export color_differs

2019-09-09 Thread Patchwork
== Series Details ==

Series: series starting with [1/3] drm/i915: export color_differs
URL   : https://patchwork.freedesktop.org/series/66433/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14329


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/

Known issues


  Here are the changes found in Patchwork_14329 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s4-devices:
- fi-blb-e6850:   [PASS][1] -> [INCOMPLETE][2] ([fdo#107718])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-blb-e6850/igt@gem_exec_susp...@basic-s4-devices.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/fi-blb-e6850/igt@gem_exec_susp...@basic-s4-devices.html

  
 Possible fixes 

  * igt@gem_ctx_switch@legacy-render:
- fi-icl-u2:  [INCOMPLETE][3] ([fdo#107713] / [fdo#111381]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381


Participating hosts (51 -> 47)
--

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3 
  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y 
fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14329

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14329: 1a4449483168a24176fca0b460c1d6bb69ed1121 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

1a4449483168 drm/i915: cleanup cache-coloring
5169d6ad3a55 drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
1480a89b6d8b drm/i915: export color_differs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/

Re: [Intel-gfx] [PATCH 4/6] drm/i915: Force compilation with intel-iommu for CI validation

2019-09-09 Thread kbuild test robot
Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-selftests-Take-runtime-wakeref-for-igt_ggtt_lowlevel/20190909-201355
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-s2-201936 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All error/warnings (new ones prefixed by >>):

   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_coherency':
>> drivers/iommu/intel-iommu.c:622:2: error: implicit declaration of function 
>> 'for_each_active_iommu'; did you mean 'for_each_active_irq'? 
>> [-Werror=implicit-function-declaration]
 for_each_active_iommu(iommu, drhd) {
 ^
 for_each_active_irq
>> drivers/iommu/intel-iommu.c:622:37: error: expected ';' before '{' token
 for_each_active_iommu(iommu, drhd) {
^
   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_snooping':
   drivers/iommu/intel-iommu.c:638:37: error: expected ';' before '{' token
 for_each_active_iommu(iommu, drhd) {
^
   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_superpage':
   drivers/iommu/intel-iommu.c:663:37: error: expected ';' before '{' token
 for_each_active_iommu(iommu, drhd) {
^
   drivers/iommu/intel-iommu.c: In function 'device_to_iommu':
   drivers/iommu/intel-iommu.c:781:37: error: expected ';' before '{' token
 for_each_active_iommu(iommu, drhd) {
^
   drivers/iommu/intel-iommu.c:812:2: warning: label 'out' defined but not used 
[-Wunused-label]
 out:
 ^~~
   drivers/iommu/intel-iommu.c:756:6: warning: unused variable 'i' 
[-Wunused-variable]
 int i;
 ^
   drivers/iommu/intel-iommu.c:753:17: warning: unused variable 'tmp' 
[-Wunused-variable]
 struct device *tmp;
^~~
   drivers/iommu/intel-iommu.c: In function 'si_domain_init':
>> drivers/iommu/intel-iommu.c:2731:3: error: implicit declaration of function 
>> 'for_each_active_dev_scope'; did you mean 'for_each_active_irq'? 
>> [-Werror=implicit-function-declaration]
  for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
  ^
  for_each_active_irq
   drivers/iommu/intel-iommu.c:2732:16: error: expected ';' before '{' token
   i, dev) {
   ^
   drivers/iommu/intel-iommu.c: In function 'device_has_rmrr':
>> drivers/iommu/intel-iommu.c:2794:4: error: expected ';' before 'if'
   if (tmp == dev ||
   ^~
   drivers/iommu/intel-iommu.c: In function 'init_dmars':
>> drivers/iommu/intel-iommu.c:3157:2: error: implicit declaration of function 
>> 'for_each_drhd_unit'; did you mean 'for_each_rmrr_units'? 
>> [-Werror=implicit-function-declaration]
 for_each_drhd_unit(drhd) {
 ^~
 for_each_rmrr_units
   drivers/iommu/intel-iommu.c:3157:27: error: expected ';' before '{' token
 for_each_drhd_unit(drhd) {
  ^
   drivers/iommu/intel-iommu.c:3182:2: error: implicit declaration of function 
'for_each_iommu'; did you mean 'for_each_cpu'? 
[-Werror=implicit-function-declaration]
 for_each_iommu(iommu, drhd) {
 ^~
 for_each_cpu
   drivers/iommu/intel-iommu.c:3182:30: error: expected ';' before '{' token
 for_each_iommu(iommu, drhd) {
 ^
   drivers/iommu/intel-iommu.c:3293:30: error: expected ';' before '{' token
 for_each_iommu(iommu, drhd) {
 ^
   drivers/iommu/intel-iommu.c:3327:37: error: expected ';' before '{' token
 for_each_active_iommu(iommu, drhd) {
^
   drivers/iommu/intel-iommu.c: In function 'get_private_domain_for_dev':
   drivers/iommu/intel-iommu.c:3391:18: error: expected ';' before '{' token
   i, i_dev) {
 ^
   drivers/iommu/intel-iommu.c:3376:9: warning: unused variable 'ret' 
[-Wunused-variable]
 int i, ret;
^~~
   In file included from arch/x86/include/asm/bug.h:83:0,
from include/linux/bug.h:5,
from include/linux/jump_label.h:250,
from arch/x86/include/asm/string_64.h:6,
from arch/x86/include/asm/string.h:5,
from include/linux/string.h:20,
from include/linux/bitmap.h:9,
fr
