Re: [Intel-gfx] [PATCH v15 10/13] drm/i915/perf: execute OA configuration from command stream
Hi Lionel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190907-052009
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-f004-201936 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_perf.c: In function 'i915_oa_stream_init':
>> drivers/gpu/drm/i915/i915_perf.c:2697:3: error: ignoring return value of
>> 'i915_active_request_retire', declared with attribute warn_unused_result
>> [-Werror=unused-result]
      i915_active_request_retire(&stream->active_config_rq, 0,
      ^~~~
            &stream->config_mutex);
            ~~
   cc1: all warnings being treated as errors

vim +/i915_active_request_retire +2697 drivers/gpu/drm/i915/i915_perf.c

  2556	
  2557	/**
  2558	 * i915_oa_stream_init - validate combined props for OA stream and init
  2559	 * @stream: An i915 perf stream
  2560	 * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
  2561	 * @props: The property state that configures stream (individually validated)
  2562	 *
  2563	 * While read_properties_unlocked() validates properties in isolation it
  2564	 * doesn't ensure that the combination necessarily makes sense.
  2565	 *
  2566	 * At this point it has been determined that userspace wants a stream of
  2567	 * OA metrics, but still we need to further validate the combined
  2568	 * properties are OK.
  2569	 *
  2570	 * If the configuration makes sense then we can allocate memory for
  2571	 * a circular OA buffer and apply the requested metric set configuration.
  2572	 *
  2573	 * Returns: zero on success or a negative error code.
  2574	 */
  2575	static int i915_oa_stream_init(struct i915_perf_stream *stream,
  2576				       struct drm_i915_perf_open_param *param,
  2577				       struct perf_open_properties *props)
  2578	{
  2579		struct drm_i915_private *dev_priv = stream->dev_priv;
  2580		int format_size;
  2581		int ret;
  2582	
  2583		/* If the sysfs metrics/ directory wasn't registered for some
  2584		 * reason then don't let userspace try their luck with config
  2585		 * IDs
  2586		 */
  2587		if (!dev_priv->perf.metrics_kobj) {
  2588			DRM_DEBUG("OA metrics weren't advertised via sysfs\n");
  2589			return -EINVAL;
  2590		}
  2591	
  2592		if (!(props->sample_flags & SAMPLE_OA_REPORT)) {
  2593			DRM_DEBUG("Only OA report sampling supported\n");
  2594			return -EINVAL;
  2595		}
  2596	
  2597		if (!dev_priv->perf.ops.enable_metric_set) {
  2598			DRM_DEBUG("OA unit not supported\n");
  2599			return -ENODEV;
  2600		}
  2601	
  2602		/* To avoid the complexity of having to accurately filter
  2603		 * counter reports and marshal to the appropriate client
  2604		 * we currently only allow exclusive access
  2605		 */
  2606		if (dev_priv->perf.exclusive_stream) {
  2607			DRM_DEBUG("OA unit already in use\n");
  2608			return -EBUSY;
  2609		}
  2610	
  2611		if (!props->oa_format) {
  2612			DRM_DEBUG("OA report format not specified\n");
  2613			return -EINVAL;
  2614		}
  2615	
  2616		mutex_init(&stream->config_mutex);
  2617	
  2618		stream->sample_size = sizeof(struct drm_i915_perf_record_header);
  2619	
  2620		format_size = dev_priv->perf.oa_formats[props->oa_format].size;
  2621	
  2622		stream->engine = props->engine;
  2623	
  2624		INIT_ACTIVE_REQUEST(&stream->active_config_rq,
  2625				    &stream->config_mutex);
  2626	
  2627		stream->sample_flags |= SAMPLE_OA_REPORT;
  2628		stream->sample_size += format_size;
  2629	
  2630		stream->oa_buffer.format_size = format_size;
  2631		if (WARN_ON(stream->oa_buffer.format_size == 0))
  2632			return -EINVAL;
  2633	
  2634		stream->oa_buffer.format =
  2635			dev_priv->perf.oa_formats[props->oa_format].format;
  2636	
  2637		stream->periodic = props->oa_periodic;
  2638		if (stream->periodic)
  2639			stream->period_exponent = props->oa_period_exponent;
  2640	
  2641	
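
For context on the failure mode reported above: the build breaks because the result of a function annotated `__must_check` (GCC's `warn_unused_result`) is discarded while warnings are treated as errors. The following is a minimal userspace sketch of consuming such a return value. The names `must_check`, `fake_request`, `retire_request` and `stream_teardown` are hypothetical stand-ins and are not taken from the i915 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __must_check annotation. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical request state; not an i915 structure. */
struct fake_request {
	int busy;	/* non-zero while the request is still in flight */
};

/* Returns 0 when retired (or when there is nothing to retire), -EBUSY otherwise. */
static must_check int retire_request(struct fake_request *rq)
{
	if (rq == NULL)
		return 0;
	return rq->busy ? -EBUSY : 0;
}

/*
 * Writing "retire_request(rq);" as a bare statement triggers
 * -Wunused-result, which becomes a hard error under -Werror.
 * Capturing the value and acting on it keeps the build clean.
 */
static int stream_teardown(struct fake_request *rq)
{
	int err = retire_request(rq);

	if (err)
		return err;	/* propagate instead of silently dropping */
	return 0;
}
```

Whether the caller should propagate, log, or retry on error is a design decision for the real code path; the sketch only shows that the value has to be used somewhere.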
Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting
On 09/09/2019 10:23, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-09 10:19:08)
>>
>> On 09/09/2019 08:12, Chris Wilson wrote:
>>> And give up if we never even make it to the start.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
>>> Signed-off-by: Chris Wilson
>>> Cc: Tvrtko Ursulin
>>> ---
>>>  tests/perf_pmu.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>>> index d392a67d4..8a06e5d44 100644
>>> --- a/tests/perf_pmu.c
>>> +++ b/tests/perf_pmu.c
>>> @@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
>>>  		while (!igt_spin_has_started(spin)) {
>>>  			unsigned long t = igt_nsec_elapsed();
>>>  
>>> +			igt_assert(gem_bo_busy(fd, spin->handle));
>>>  			if ((t - timeout) > 250e6) {
>>>  				timeout = t;
>>>  				igt_warn("Spinner not running after %.2fms\n",
>>>  					 (double)t / 1e6);
>>> +				igt_assert(t < 2e9);
>>>  			}
>>>  		}
>>>  	} else {
>>> @@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
>>>  		usleep(500e3); /* Better than nothing! */
>>>  	}
>>>  
>>> +	igt_assert(gem_bo_busy(fd, spin->handle));
>>>  	return igt_nsec_elapsed();
>>>  }
>>
>> The 2s timeout for batch to start executing sounds okay.
>>
>> I'd pull up and consolidate the bo_busy checks into one at the top of
>> the function, since it is only telling us batch has been submitted. Or
>> you are thinking the second check brings value in checking batch is
>> still executing, hasn't failed or something?
>
> The thinking is to catch if we terminate the batch via hangcheck before
> writing the dword. I think there's value in knowing if we are slow vs
> dead.

Yeah as guessed then - agreed.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
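
The "slow vs dead" distinction in the exchange above can be sketched outside IGT as a polling loop that checks liveness on every iteration and gives up after a hard deadline. This is only an illustration under stated assumptions: `spin_probe`, `spin_wait`, `fake_gpu` and `run_fake` are hypothetical names, the callbacks do not match IGT's real helper signatures, and the sketch returns a status code where the patch uses `igt_assert()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical probe callbacks; IGT's real helpers have different signatures. */
struct spin_probe {
	bool (*started)(void *ctx);	/* has the spinner written its dword? */
	bool (*busy)(void *ctx);	/* is the batch still owned by the GPU? */
	uint64_t (*now_ns)(void *ctx);	/* monotonic elapsed time */
	void *ctx;
};

enum spin_wait_result {
	SPIN_STARTED,	/* spinner is running */
	SPIN_DEAD,	/* batch went idle, e.g. terminated before or just after start */
	SPIN_TIMEOUT,	/* still queued after the hard deadline */
};

/* Poll until the spinner starts, telling "slow" apart from "dead". */
static enum spin_wait_result
spin_wait(const struct spin_probe *p, uint64_t deadline_ns)
{
	while (!p->started(p->ctx)) {
		if (!p->busy(p->ctx))
			return SPIN_DEAD;
		if (p->now_ns(p->ctx) >= deadline_ns)
			return SPIN_TIMEOUT;
	}
	/* Final check: the batch must still be busy once it has started. */
	return p->busy(p->ctx) ? SPIN_STARTED : SPIN_DEAD;
}

/* Deterministic fake used to exercise spin_wait() without a GPU. */
struct fake_gpu {
	uint64_t tick_ns;	/* advanced by 1 ms on every now_ns() call */
	uint64_t start_at_ns;	/* spinner counts as started from this time */
	uint64_t die_at_ns;	/* batch counts as idle from this time */
};

static bool fake_started(void *c)
{
	struct fake_gpu *g = c;
	return g->tick_ns >= g->start_at_ns;
}

static bool fake_busy(void *c)
{
	struct fake_gpu *g = c;
	return g->tick_ns < g->die_at_ns;
}

static uint64_t fake_now(void *c)
{
	struct fake_gpu *g = c;
	g->tick_ns += 1000000ULL;
	return g->tick_ns;
}

static enum spin_wait_result
run_fake(uint64_t start_at_ns, uint64_t die_at_ns, uint64_t deadline_ns)
{
	struct fake_gpu gpu = { 0, start_at_ns, die_at_ns };
	struct spin_probe probe = { fake_started, fake_busy, fake_now, &gpu };

	return spin_wait(&probe, deadline_ns);
}
```

Keeping the liveness check separate from the deadline check is what lets a caller report a hung or cancelled batch differently from one that is merely queued behind other work.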
Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting
On 09/09/2019 08:12, Chris Wilson wrote:
> And give up if we never even make it to the start.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
> Signed-off-by: Chris Wilson
> Cc: Tvrtko Ursulin
> ---
>  tests/perf_pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index d392a67d4..8a06e5d44 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
>  		while (!igt_spin_has_started(spin)) {
>  			unsigned long t = igt_nsec_elapsed();
>  
> +			igt_assert(gem_bo_busy(fd, spin->handle));
>  			if ((t - timeout) > 250e6) {
>  				timeout = t;
>  				igt_warn("Spinner not running after %.2fms\n",
>  					 (double)t / 1e6);
> +				igt_assert(t < 2e9);
>  			}
>  		}
>  	} else {
> @@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
>  		usleep(500e3); /* Better than nothing! */
>  	}
>  
> +	igt_assert(gem_bo_busy(fd, spin->handle));
>  	return igt_nsec_elapsed();
>  }

The 2s timeout for batch to start executing sounds okay.

I'd pull up and consolidate the bo_busy checks into one at the top of
the function, since it is only telling us batch has been submitted. Or
you are thinking the second check brings value in checking batch is
still executing, hasn't failed or something?

Reviewed-by: Tvrtko Ursulin

Regards,

Tvrtko
[Intel-gfx] [CI 04/13] drm/i915/perf: store the associated engine of a stream
We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h  | 5 +++++
 drivers/gpu/drm/i915/i915_perf.c | 7 +++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 75607450ba00..274a1193d4f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1088,6 +1088,11 @@ struct i915_perf_stream {
 	 */
 	intel_wakeref_t wakeref;
 
+	/**
+	 * @engine: Engine associated with this performance stream.
+	 */
+	struct intel_engine_cs *engine;
+
 	/**
 	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 	 * properties given when opening a stream, representing the contents
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d18cd332afb7..9d5a3522aa35 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -363,6 +363,8 @@ struct perf_open_properties {
 	int oa_format;
 	bool oa_periodic;
 	int oa_period_exponent;
+
+	struct intel_engine_cs *engine;
 };
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
@@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 
 	format_size = dev_priv->perf.oa_formats[props->oa_format].size;
 
+	stream->engine = props->engine;
+
 	stream->sample_flags |= SAMPLE_OA_REPORT;
 	stream->sample_size += format_size;
 
@@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 		return -EINVAL;
 	}
 
+	/* At the moment we only support using i915-perf on the RCS. */
+	props->engine = dev_priv->engine[RCS0];
+
 	/* Considering that ID = 0 is reserved and assuming that we don't
 	 * (currently) expect any configurations to ever specify duplicate
 	 * values for a particular property ID then the last _PROP_MAX value is
-- 
2.23.0
[Intel-gfx] [CI 07/13] drm/i915/perf: allow for CS OA configs to be created lazily
Here we introduce a mechanism by which the execbuf part of the i915 driver will be able to request that a batch buffer containing the programming for a particular OA config be created. We'll execute these OA configuration buffers right before executing a set of userspace commands so that a particular user batchbuffer be executed with a given OA configuration. This mechanism essentially allows the userspace driver to go through several OA configuration without having to open/close the i915/perf stream. v2: No need for locking on object OA config object creation (Chris) Flush cpu mapping of OA config (Chris) v3: Properly deal with the perf_metric lock (Chris/Lionel) v4: Fix oa config unref/put when not found (Lionel) v5: Allocate BOs for configurations on the stream instead of globally (Lionel) v6: Fix 64bit division (Chris) v7: Store allocated config BOs into the stream (Lionel) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson (v4) --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 + drivers/gpu/drm/i915/i915_drv.h | 4 +- drivers/gpu/drm/i915/i915_perf.c | 270 --- drivers/gpu/drm/i915/i915_perf.h | 26 ++ drivers/gpu/drm/i915/i915_perf_types.h | 15 +- 5 files changed, 273 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index fbad403ab7ac..b6373fbc927d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -135,6 +135,7 @@ /* Gen11+. addr = base + (ctx_restore ? 
offset & GENMASK(12,2) : offset) */ #define MI_LRI_CS_MMIO (1<<19) #define MI_LRI_FORCE_POSTED (1<<12) +#define MI_LOAD_REGISTER_IMM_MAX_REGS (126) #define MI_STORE_REGISTER_MEMMI_INSTR(0x24, 1) #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2) #define MI_SRM_LRM_GLOBAL_GTT(1<<22) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f4145ae6ab6e..7eb31923cde9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1363,8 +1363,8 @@ struct drm_i915_private { struct mutex metrics_lock; /* -* List of dynamic configurations, you need to hold -* dev_priv->perf.metrics_lock to access it. +* List of dynamic configurations (struct i915_oa_config), you +* need to hold dev_priv->perf.metrics_lock to access it. */ struct idr metrics_idr; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 40a1ec2bc96b..c9d0de3050fb 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -367,11 +367,19 @@ struct perf_open_properties { struct intel_engine_cs *engine; }; +struct i915_oa_config_bo { + struct list_head link; + + struct i915_oa_config *oa_config; + struct drm_i915_gem_object *bo; +}; + static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer); -static void free_oa_config(struct drm_i915_private *dev_priv, - struct i915_oa_config *oa_config) +void i915_oa_config_release(struct kref *ref) { + struct i915_oa_config *oa_config = container_of(ref, typeof(*oa_config), ref); + if (!PTR_ERR(oa_config->flex_regs)) kfree(oa_config->flex_regs); if (!PTR_ERR(oa_config->b_counter_regs)) @@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private *dev_priv, kfree(oa_config); } -static void put_oa_config(struct drm_i915_private *dev_priv, - struct i915_oa_config *oa_config) +static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs) { - if (!atomic_dec_and_test(_config->ref_count)) - return; + u32 i; + + for 
(i = 0; i < n_regs; i++) { + if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) { + u32 n_lri = min(n_regs - i, + (u32) MI_LOAD_REGISTER_IMM_MAX_REGS); - free_oa_config(dev_priv, oa_config); + *cs++ = MI_LOAD_REGISTER_IMM(n_lri); + } + *cs++ = i915_mmio_reg_offset(reg_data[i].addr); + *cs++ = reg_data[i].value; + } + + return cs; } -static int get_oa_config(struct drm_i915_private *dev_priv, -int metrics_set, -struct i915_oa_config **out_config) +static struct i915_oa_config_bo* alloc_oa_config_buffer(struct drm_i915_private *i915, + struct i915_oa_config *oa_config) { - int ret; + struct i915_oa_config_bo *oa_bo; + size_t config_length = 0; + u32 *cs; + int err; + + oa_bo = kzalloc(sizeof(*oa_bo),
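
The chunking done by write_cs_mi_lri() above (a fresh MI_LOAD_REGISTER_IMM header every MI_LOAD_REGISTER_IMM_MAX_REGS registers, followed by address/value pairs) can be tried in isolation with the userspace sketch below. Assumptions are loud here: `FAKE_LRI_HEADER` is a placeholder and does not reproduce the real MI_LOAD_REGISTER_IMM() encoding, `struct reg_write` stands in for struct i915_oa_reg, and `lri_dwords()` is a helper invented for this example to size the buffer; the truncated patch does not show how the driver computes its own length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LRI_MAX_REGS 126u	/* same limit as MI_LOAD_REGISTER_IMM_MAX_REGS in the patch */

/* Placeholder header encoding; the real MI_LOAD_REGISTER_IMM() macro differs. */
#define FAKE_LRI_HEADER(n) (0x11000000u | (uint32_t)(n))

/* Stand-in for struct i915_oa_reg: one register offset and the value to load. */
struct reg_write {
	uint32_t addr;
	uint32_t value;
};

/* Dwords needed: one header per chunk plus two dwords per register. */
static size_t lri_dwords(uint32_t n_regs)
{
	uint32_t n_headers = (n_regs + LRI_MAX_REGS - 1) / LRI_MAX_REGS;

	return (size_t)n_headers + 2 * (size_t)n_regs;
}

/* Emit address/value pairs, starting a new header every LRI_MAX_REGS entries. */
static uint32_t *write_lri(uint32_t *cs, const struct reg_write *regs,
			   uint32_t n_regs)
{
	for (uint32_t i = 0; i < n_regs; i++) {
		if (i % LRI_MAX_REGS == 0) {
			uint32_t left = n_regs - i;

			*cs++ = FAKE_LRI_HEADER(left < LRI_MAX_REGS ?
						left : LRI_MAX_REGS);
		}
		*cs++ = regs[i].addr;
		*cs++ = regs[i].value;
	}
	return cs;
}
```

With 130 registers the sketch emits two headers (126 and 4 registers) and 262 dwords in total, which is the kind of arithmetic a caller would need before allocating the batch buffer.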
[Intel-gfx] [CI 02/13] drm/i915: add syncobj timeline support
Introduces a new parameters to execbuf so that we can specify syncobj handles as well as timeline points. v2: Reuse i915_user_extension_fn v3: Check that the chained extension is only present once (Chris) v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel) v5: Use BIT_ULL (Chris) v6: Fix issue with already signaled timeline points, dma_fence_chain_find_seqno() setting fence to NULL (Chris) v7: Report ENOENT with invalid syncobj handle (Lionel) v8: Check for out of order timeline point insertion (Chris) v9: After explanations on https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html drop the ordering check from v8 (Lionel) v10: Set first extension enum item to 1 (Jason) Signed-off-by: Lionel Landwerlin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 307 ++ drivers/gpu/drm/i915/i915_drv.c | 3 +- drivers/gpu/drm/i915/i915_getparam.c | 1 + include/uapi/drm/i915_drm.h | 39 +++ 4 files changed, 293 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 4f5fd946ab28..46ad8d9642d1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -214,6 +214,13 @@ enum { * the batchbuffer in trusted mode, otherwise the ioctl is rejected. 
*/ +struct i915_eb_fences { + struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */ + struct dma_fence *dma_fence; + u64 value; + struct dma_fence_chain *chain_fence; +}; + struct i915_execbuffer { struct drm_i915_private *i915; /** i915 backpointer */ struct drm_file *file; /** per-file lookup tables and limits */ @@ -276,6 +283,7 @@ struct i915_execbuffer { struct { u64 flags; /** Available extensions parameters */ + struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences; } extensions; }; @@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb, } static void -__free_fence_array(struct drm_syncobj **fences, unsigned int n) +__free_fence_array(struct i915_eb_fences *fences, unsigned int n) { - while (n--) - drm_syncobj_put(ptr_mask_bits(fences[n], 2)); + while (n--) { + drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2)); + dma_fence_put(fences[n].dma_fence); + kfree(fences[n].chain_fence); + } kvfree(fences); } -static struct drm_syncobj ** -get_fence_array(struct drm_i915_gem_execbuffer2 *args, - struct drm_file *file) +static struct i915_eb_fences * +get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences) +{ + struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences = + >extensions.timeline_fences; + struct drm_i915_gem_exec_fence __user *user_fences; + struct i915_eb_fences *fences; + u64 __user *user_values; + u64 num_fences, num_user_fences = timeline_fences->fence_count; + unsigned long n; + int err; + + /* Check multiplication overflow for access_ok() and kvmalloc_array() */ + BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long)); + if (num_user_fences > min_t(unsigned long, + ULONG_MAX / sizeof(*user_fences), + SIZE_MAX / sizeof(*fences))) + return ERR_PTR(-EINVAL); + + user_fences = u64_to_user_ptr(timeline_fences->handles_ptr); + if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences))) + return ERR_PTR(-EFAULT); + + user_values = u64_to_user_ptr(timeline_fences->values_ptr); + if 
(!access_ok(user_values, num_user_fences * sizeof(*user_values))) + return ERR_PTR(-EFAULT); + + fences = kvmalloc_array(num_user_fences, sizeof(*fences), + __GFP_NOWARN | GFP_KERNEL); + if (!fences) + return ERR_PTR(-ENOMEM); + + BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) & +~__I915_EXEC_FENCE_UNKNOWN_FLAGS); + + for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) { + struct drm_i915_gem_exec_fence user_fence; + struct drm_syncobj *syncobj; + struct dma_fence *fence = NULL; + u64 point; + + if (__copy_from_user(_fence, user_fences++, sizeof(user_fence))) { + err = -EFAULT; + goto err; + } + + if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) { + err = -EINVAL; + goto err; + } + + if (__get_user(point, user_values++)) { + err = -EFAULT; + goto err; + } + + syncobj
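
The multiplication-overflow guard in get_timeline_fence_array() above (bounding the user-supplied count before it is multiplied by an element size) follows a common pattern that can be shown in plain userspace C. This is a minimal sketch, not the driver's code: `mul_overflows` and `alloc_array_checked` are hypothetical names, and `calloc()` stands in for kvmalloc_array().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True when count * elem_size would not fit in size_t. */
static bool mul_overflows(uint64_t count, size_t elem_size)
{
	return elem_size != 0 && count > SIZE_MAX / elem_size;
}

/*
 * Allocate a zeroed array for a caller-supplied 64-bit count.
 * The check runs before the count is narrowed to size_t, so a huge
 * value cannot be truncated into a small, seemingly valid one.
 */
static void *alloc_array_checked(uint64_t count, size_t elem_size)
{
	if (mul_overflows(count, elem_size))
		return NULL;
	return calloc((size_t)count, elem_size);
}
```

Dividing the maximum by the element size, instead of multiplying and comparing afterwards, is what keeps the check itself free of overflow.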
[Intel-gfx] [CI 09/13] drm/i915: add wait flags to i915_active_request_retire
An upcoming change needs to wait on an active request without being
interrupted, so let callers pass the wait flags.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_active.c | 4 +++-
 drivers/gpu/drm/i915/i915_active.h | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 6a447f1d0110..c808c28c9464 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref)
 			break;
 		}
 
-		err = i915_active_request_retire(&it->base, BKL(ref));
+		err = i915_active_request_retire(&it->base,
+						 I915_WAIT_INTERRUPTIBLE,
+						 BKL(ref));
 		if (err)
 			break;
 	}
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index f95058f99057..35a6089b44fd 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request *active)
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
+			   unsigned int flags,
 			   struct mutex *mutex)
 {
 	struct i915_request *request;
@@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request *active,
 	if (!request)
 		return 0;
 
-	ret = i915_request_wait(request,
-				I915_WAIT_INTERRUPTIBLE,
-				MAX_SCHEDULE_TIMEOUT);
+	ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
 	if (ret < 0)
 		return ret;
 
-- 
2.23.0
[Intel-gfx] [CI 10/13] drm/i915/perf: execute OA configuration from command stream
We haven't run into issues with programming the global OA/NOA registers configuration from CPU so far, but HW engineers actually recommend doing this from the command streamer. On TGL in particular one of the clock domain in which some of that programming goes might not be powered when we poke things from the CPU. Since we have a command buffer prepared for the execbuffer side of things, we can reuse that approach here too. This also allows us to significantly reduce the amount of time we hold the main lock. v2: Drop the global lock as much as possible v3: Take global lock to pin global v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel) v5: Move locking to the stream (Lionel) v6: Move active reconfiguration request into i915_perf_stream (Lionel) v7: Pin VMA outside request creation (Chris) Lock VMA before move to active (Chris) v8: Fix double free on stream->initial_oa_config_bo (Lionel) Don't allow interruption when waiting on active config request (Lionel) Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 174 - drivers/gpu/drm/i915/i915_perf_types.h | 15 ++- 2 files changed, 128 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index f2b778d84b52..abbcf3ec654c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1558,18 +1558,23 @@ free_oa_configs(struct i915_perf_stream *stream) static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; + int err; BUG_ON(stream != dev_priv->perf.exclusive_stream); - /* -* Unset exclusive_stream first, it will be checked while disabling -* the metric set on gen8+. 
-*/ mutex_lock(_priv->drm.struct_mutex); - dev_priv->perf.exclusive_stream = NULL; + mutex_lock(>config_mutex); dev_priv->perf.ops.disable_metric_set(stream); + err = i915_active_request_retire(>active_config_rq, 0, +>config_mutex); + mutex_unlock(>config_mutex); + dev_priv->perf.exclusive_stream = NULL; mutex_unlock(_priv->drm.struct_mutex); + if (err) + DRM_ERROR("Failed to disable perf stream\n"); + + free_oa_buffer(stream); free_noa_wait(stream); @@ -1795,6 +1800,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return PTR_ERR(bo); } + ret = i915_mutex_lock_interruptible(>drm); + if (ret) + goto err_unref; + /* * We pin in GGTT because we jump into this buffer now because * multiple OA config BOs will have a jump to this address and it @@ -1802,10 +1811,13 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0); if (IS_ERR(vma)) { + mutex_unlock(>drm.struct_mutex); ret = PTR_ERR(vma); goto err_unref; } + mutex_unlock(>drm.struct_mutex); + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); if (IS_ERR(batch)) { ret = PTR_ERR(batch); @@ -1939,7 +1951,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return 0; err_unpin: - __i915_vma_unpin(vma); + mutex_lock(>drm.struct_mutex); + i915_vma_unpin_and_release(, 0); + mutex_unlock(>drm.struct_mutex); err_unref: i915_gem_object_put(bo); @@ -1947,50 +1961,73 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return ret; } -static void config_oa_regs(struct drm_i915_private *dev_priv, - const struct i915_oa_reg *regs, - u32 n_regs) +static int emit_oa_config(struct drm_i915_private *i915, + struct i915_perf_stream *stream) { - u32 i; + struct i915_request *rq; + struct i915_vma *vma; + u32 *cs; + int err; - for (i = 0; i < n_regs; i++) { - const struct i915_oa_reg *reg = regs + i; + lockdep_assert_held(>config_mutex); + + vma = i915_vma_instance(stream->initial_oa_config_bo, + >engine->gt->ggtt->vm, NULL); + if 
(unlikely(IS_ERR(vma))) + return PTR_ERR(vma); + + err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); + if (err) + goto err_vma_unpin; - I915_WRITE(reg->addr, reg->value); + rq = i915_request_create(stream->engine->kernel_context); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_add_request; } -} -static void delay_after_mux(void) -{ - /* -* It apparently takes a fairly long time for a new MUX -* configuration to be be applied after these register writes. -
[Intel-gfx] [CI 01/13] drm/i915: introduce a mechanism to extend execbuf2
We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson (v1)
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 39 ++++++++++++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                   | 26 +++++++++++++++++++++++---
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 27dbcb508055..4f5fd946ab28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 
 enum {
 	FORCE_CPU_RELOC = 1,
@@ -272,6 +273,10 @@ struct i915_execbuffer {
 	 */
 	int lut_size;
 	struct hlist_head *buckets; /** ht for relocation handles */
+
+	struct {
+		u64 flags; /** Available extensions parameters */
+	} extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+	if (!(exec->flags & (I915_EXEC_FENCE_ARRAY |
+			     I915_EXEC_USE_EXTENSIONS))) {
 		if (exec->num_cliprects || exec->cliprects_ptr)
 			return false;
 	}
@@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb,
 	}
 }
 
+static const i915_user_extension_fn execbuf_extensions[] = {
+};
+
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+			  struct i915_execbuffer *eb)
+{
+	eb->extensions.flags = 0;
+
+	if (!(args->flags & I915_EXEC_USE_EXTENSIONS))
+		return 0;
+
+	/* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot
+	 * have another flag also using it at the same time.
+	 */
+	if (eb->args->flags & I915_EXEC_FENCE_ARRAY)
+		return -EINVAL;
+
+	if (args->num_cliprects != 0)
+		return -EINVAL;
+
+	return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr),
+				    execbuf_extensions,
+				    ARRAY_SIZE(execbuf_extensions),
+				    eb);
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
@@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (args->flags & I915_EXEC_IS_PINNED)
 		eb.batch_flags |= I915_DISPATCH_PINNED;
 
+	err = parse_execbuf2_extensions(args, &eb);
+	if (err)
+		return err;
+
 	if (args->flags & I915_EXEC_FENCE_IN) {
 		in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
 		if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 469dc512cca3..0a99c26730e1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence {
 	__u32 flags;
 };
 
+enum drm_i915_gem_execbuffer_ext {
+	DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 num_cliprects;
 	/**
 	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-	 * is not set. If I915_EXEC_FENCE_ARRAY is set, then this is a
-	 * struct drm_i915_gem_exec_fence *fences.
+	 * & I915_EXEC_USE_EXTENSIONS are not set.
+	 *
+	 * If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+	 * of struct drm_i915_gem_exec_fence and num_cliprects is the length
+	 * of the array.
+	 *
+	 * If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
+	 * single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is
+	 * 0.
 	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (0x3f)
@@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT		(1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_USE_EXTENSIONS implies that
+ * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to an linked
+ * list of i915_user_extension. Each i915_user_extension node is the base of a
+ * larger structure. The list of supported structures are listed in the
+ * drm_i915_gem_execbuffer_ext enum.
+ */
+#define
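
The extension mechanism described above (cliprects_ptr pointing at a linked list of nodes, each node being the base of a larger structure and dispatched through a handler table) can be sketched in userspace as follows. This is an illustration only: `struct user_ext` is a simplified stand-in for struct i915_user_extension, `walk_extensions` is not the kernel's i915_user_extensions(), the chain-length cap is an arbitrary choice for the sketch, and `ext_set_value` with its handler is a made-up extension. The kernel additionally has to copy each node from user memory, which is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct i915_user_extension. */
struct user_ext {
	uint64_t next;	/* address of the next node, 0 terminates the chain */
	uint32_t name;	/* index into the handler table */
	uint32_t flags;	/* must be zero in this sketch */
};

typedef int (*ext_fn)(struct user_ext *ext, void *data);

/* Walk the chain, dispatching each node to the handler for its name. */
static int walk_extensions(struct user_ext *ext, const ext_fn *tbl,
			   size_t count, void *data)
{
	unsigned int budget = 512;	/* arbitrary cap against cyclic chains */

	while (ext) {
		int err;

		if (!budget--)
			return -E2BIG;
		if (ext->flags)
			return -EINVAL;
		if (ext->name >= count || !tbl[ext->name])
			return -EINVAL;

		err = tbl[ext->name](ext, data);
		if (err)
			return err;

		ext = (struct user_ext *)(uintptr_t)ext->next;
	}
	return 0;
}

/* Hypothetical extension: a larger struct whose first member is the base. */
struct ext_set_value {
	struct user_ext base;
	uint64_t value;
};

static int handle_set_value(struct user_ext *ext, void *data)
{
	/* base is the first member, so the node address is the struct address */
	const struct ext_set_value *sv = (const struct ext_set_value *)ext;

	*(uint64_t *)data += sv->value;
	return 0;
}

static const ext_fn demo_table[] = {
	NULL,			/* name 0 left unused in this demo */
	handle_set_value,	/* name 1 */
};
```

Rejecting unknown names and non-zero flags up front is what keeps the interface extensible: new extension types can be added to the table later without older kernels silently ignoring them.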
[Intel-gfx] [CI 03/13] drm/i915/perf: drop list of streams
At some point there was the idea that we could have multiple streams
from the same piece of HW, but that never materialized. Given the hard
time we already have making everything work with the submission side,
there is no real point keeping this one-element list around.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 ------
 drivers/gpu/drm/i915/i915_perf.c | 16 +---------------
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 	 */
 	struct drm_i915_private *dev_priv;
 
-	/**
-	 * @link: Links the stream into ``&drm_i915_private->streams``
-	 */
-	struct list_head link;
-
 	/**
 	 * @wakeref: As we keep the device awake while the perf stream is
 	 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 		 * except exclusive_stream.
 		 */
 		struct mutex lock;
-		struct list_head streams;
 
 		/*
 		 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-	/* Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }
 
@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-	/*
-	 * Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }
 
@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);
 
-	list_del(&stream->link);
-
 	if (stream->ctx)
 		i915_gem_context_put(stream->ctx);
 
@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		goto err_flags;
 	}
 
-	list_add(&stream->link, &dev_priv->perf.streams);
-
 	if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
 		f_flags |= O_CLOEXEC;
 	if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	stream_fd = anon_inode_getfd("[i915_perf]", &fops, stream, f_flags);
 	if (stream_fd < 0) {
 		ret = stream_fd;
-		goto err_open;
+		goto err_flags;
 	}
 
 	if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 
 	return stream_fd;
 
-err_open:
-	list_del(&stream->link);
 err_flags:
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	}
 
 	if (dev_priv->perf.ops.enable_metric_set) {
-		INIT_LIST_HEAD(&dev_priv->perf.streams);
 		mutex_init(&dev_priv->perf.lock);
 
 		oa_sample_rate_hard_limit = 1000 *
-- 
2.23.0
[Intel-gfx] [CI 05/13] drm/i915/perf: introduce a versioning of the i915-perf uapi
Reporting this version will help applications figure out what level of
support the running kernel provides.

v2: Add i915_perf_ioctl_version() (Chris)

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_getparam.c |  4
 drivers/gpu/drm/i915/i915_perf.c     | 10 ++
 drivers/gpu/drm/i915/i915_perf.h     |  1 +
 include/uapi/drm/i915_drm.h          | 20
 4 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index da6faa84e5b8..bd41cc5ce906 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,7 @@
 #include "gt/intel_engine_user.h"
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 
 int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv)
@@ -157,6 +158,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_MMAP_GTT_COHERENT:
 		value = INTEL_INFO(i915)->has_coherent_ggtt;
 		break;
+	case I915_PARAM_PERF_REVISION:
+		value = i915_perf_ioctl_version();
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9d5a3522aa35..40a1ec2bc96b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3697,3 +3697,13 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 
 	dev_priv->perf.initialized = false;
 }
+
+/**
+ * i915_perf_ioctl_version - Version of the i915-perf subsystem
+ *
+ * This version number is used by userspace to detect available features.
+ */
+int i915_perf_ioctl_version(void)
+{
+	return 1;
+}
diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
index a412b16d9ffc..95549de65212 100644
--- a/drivers/gpu/drm/i915/i915_perf.h
+++ b/drivers/gpu/drm/i915/i915_perf.h
@@ -18,6 +18,7 @@ void i915_perf_init(struct drm_i915_private *i915);
 void i915_perf_fini(struct drm_i915_private *i915);
 void i915_perf_register(struct drm_i915_private *i915);
 void i915_perf_unregister(struct drm_i915_private *i915);
+int i915_perf_ioctl_version(void);
 
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3d031e81648b..e98c9a7baa91 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -618,6 +618,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 54
 
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION 55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1903,23 +1909,31 @@ enum drm_i915_perf_property_id {
 	 * Open the stream for a specific context handle (as used with
 	 * execbuffer2). A stream opened for a specific context this way
 	 * won't typically require root privileges.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
 	/**
 	 * A value of 1 requests the inclusion of raw OA unit reports as
 	 * part of stream samples.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_SAMPLE_OA,
 
 	/**
 	 * The value specifies which set of OA unit metrics should be
 	 * be configured, defining the contents of any OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_METRICS_SET,
 
 	/**
 	 * The value specifies the size and layout of OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1929,6 +1943,8 @@ enum drm_i915_perf_property_id {
 	 * from this exponent as follows:
 	 *
 	 *   80ns * 2^(period_exponent + 1)
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1960,6 +1976,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
 
@@ -1967,6 +1985,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE	_IO('i',
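Illustration only, not part of the patch: a minimal sketch of how a userspace client might gate features on the reported revision. The callback type and every sketch_* name are hypothetical; a real client would issue DRM_IOCTL_I915_GETPARAM with I915_PARAM_PERF_REVISION instead of calling a function pointer. The sketch assumes that a kernel without this patch rejects the unknown parameter with -EINVAL (the default case in i915_getparam_ioctl() above) and treats that as revision 0.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors I915_PARAM_PERF_REVISION from the patch above. */
#define SKETCH_PARAM_PERF_REVISION 55

/*
 * Hypothetical query hook: returns 0 and fills *value on success, or a
 * negative errno. Stands in for the real getparam ioctl.
 */
typedef int (*sketch_getparam_fn)(int param, int *value);

/* Returns the perf revision, 0 if the kernel predates it, or a negative errno. */
static int sketch_perf_revision(sketch_getparam_fn getparam)
{
	int value = 0;
	int ret = getparam(SKETCH_PARAM_PERF_REVISION, &value);

	if (ret == -EINVAL)
		return 0; /* assumption: parameter unknown to this kernel */
	if (ret < 0)
		return ret;
	return value;
}

/* True when the kernel reports at least the required revision. */
static bool sketch_perf_has_revision(sketch_getparam_fn getparam, int required)
{
	return sketch_perf_revision(getparam) >= required;
}

/* Stand-in for a kernel carrying this patch: reports revision 1. */
static int sketch_patched_kernel(int param, int *value)
{
	if (param != SKETCH_PARAM_PERF_REVISION)
		return -EINVAL;
	*value = 1;
	return 0;
}

/* Stand-in for an older kernel: the parameter is unknown. */
static int sketch_unpatched_kernel(int param, int *value)
{
	(void)param;
	(void)value;
	return -EINVAL;
}
```

Comparing with >= rather than == means a client written against revision 1 keeps working when later revisions add features.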
[Intel-gfx] [CI 06/13] drm/i915/perf: move perf types to their own header
Following a pattern used throughout the driver.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h        | 300 +-
 drivers/gpu/drm/i915/i915_perf.h       |   2 +
 drivers/gpu/drm/i915/i915_perf_types.h | 328 +
 3 files changed, 331 insertions(+), 299 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 274a1193d4f0..f4145ae6ab6e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -92,6 +92,7 @@
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_gtt.h"
 #include "i915_gpu_error.h"
+#include "i915_perf_types.h"
 #include "i915_request.h"
 #include "i915_scheduler.h"
 #include "gt/intel_timeline.h"
@@ -979,305 +980,6 @@ struct intel_wm_config {
 	bool sprites_scaled;
 };
 
-struct i915_oa_format {
-	u32 format;
-	int size;
-};
-
-struct i915_oa_reg {
-	i915_reg_t addr;
-	u32 value;
-};
-
-struct i915_oa_config {
-	char uuid[UUID_STRING_LEN + 1];
-	int id;
-
-	const struct i915_oa_reg *mux_regs;
-	u32 mux_regs_len;
-	const struct i915_oa_reg *b_counter_regs;
-	u32 b_counter_regs_len;
-	const struct i915_oa_reg *flex_regs;
-	u32 flex_regs_len;
-
-	struct attribute_group sysfs_metric;
-	struct attribute *attrs[2];
-	struct device_attribute sysfs_metric_id;
-
-	atomic_t ref_count;
-};
-
-struct i915_perf_stream;
-
-/**
- * struct i915_perf_stream_ops - the OPs to support a specific stream type
- */
-struct i915_perf_stream_ops {
-	/**
-	 * @enable: Enables the collection of HW samples, either in response to
-	 * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
-	 * without `I915_PERF_FLAG_DISABLED`.
-	 */
-	void (*enable)(struct i915_perf_stream *stream);
-
-	/**
-	 * @disable: Disables the collection of HW samples, either in response
-	 * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
-	 * the stream.
-	 */
-	void (*disable)(struct i915_perf_stream *stream);
-
-	/**
-	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
-	 * once there is something ready to read() for the stream
-	 */
-	void (*poll_wait)(struct i915_perf_stream *stream,
-			  struct file *file,
-			  poll_table *wait);
-
-	/**
-	 * @wait_unlocked: For handling a blocking read, wait until there is
-	 * something to ready to read() for the stream. E.g. wait on the same
-	 * wait queue that would be passed to poll_wait().
-	 */
-	int (*wait_unlocked)(struct i915_perf_stream *stream);
-
-	/**
-	 * @read: Copy buffered metrics as records to userspace
-	 * **buf**: the userspace, destination buffer
-	 * **count**: the number of bytes to copy, requested by userspace
-	 * **offset**: zero at the start of the read, updated as the read
-	 * proceeds, it represents how many bytes have been copied so far and
-	 * the buffer offset for copying the next record.
-	 *
-	 * Copy as many buffered i915 perf samples and records for this stream
-	 * to userspace as will fit in the given buffer.
-	 *
-	 * Only write complete records; returning -%ENOSPC if there isn't room
-	 * for a complete record.
-	 *
-	 * Return any error condition that results in a short read such as
-	 * -%ENOSPC or -%EFAULT, even though these may be squashed before
-	 * returning to userspace.
-	 */
-	int (*read)(struct i915_perf_stream *stream,
-		    char __user *buf,
-		    size_t count,
-		    size_t *offset);
-
-	/**
-	 * @destroy: Cleanup any stream specific resources.
-	 *
-	 * The stream will always be disabled before this is called.
-	 */
-	void (*destroy)(struct i915_perf_stream *stream);
-};
-
-/**
- * struct i915_perf_stream - state for a single open stream FD
- */
-struct i915_perf_stream {
-	/**
-	 * @dev_priv: i915 drm device
-	 */
-	struct drm_i915_private *dev_priv;
-
-	/**
-	 * @wakeref: As we keep the device awake while the perf stream is
-	 * active, we track our runtime pm reference for later release.
-	 */
-	intel_wakeref_t wakeref;
-
-	/**
-	 * @engine: Engine associated with this performance stream.
-	 */
-	struct intel_engine_cs *engine;
-
-	/**
-	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
-	 * properties given when opening a stream, representing the contents
-	 * of a single sample as read() by userspace.
-	 */
-	u32 sample_flags;
-
-	/**
-	 * @sample_size: Considering the configured contents of
[Intel-gfx] [CI 03/13] drm/i915/perf: drop list of streams
At some point in time there was the idea that we could have multiple
streams from the same piece of HW, but that never materialized. Given
the hard time we already have making everything work with the
submission side, there is no real point in keeping this list of 1
element around.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_perf.c | 16 +---
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 	 */
 	struct drm_i915_private *dev_priv;
 
-	/**
-	 * @link: Links the stream into ``&drm_i915_private->streams``
-	 */
-	struct list_head link;
-
 	/**
 	 * @wakeref: As we keep the device awake while the perf stream is
 	 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 		 * except exclusive_stream.
 		 */
 		struct mutex lock;
-		struct list_head streams;
 
 		/*
 		 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-	/* Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }
 
@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);
 
-	/*
-	 * Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }
 
@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);
 
-	list_del(&stream->link);
-
 	if (stream->ctx)
 		i915_gem_context_put(stream->ctx);
 
@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		goto err_flags;
 	}
 
-	list_add(&stream->link, &dev_priv->perf.streams);
-
 	if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
 		f_flags |= O_CLOEXEC;
 	if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	stream_fd = anon_inode_getfd("[i915_perf]", &fops, stream, f_flags);
 	if (stream_fd < 0) {
 		ret = stream_fd;
-		goto err_open;
+		goto err_flags;
 	}
 
 	if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 
 	return stream_fd;
 
-err_open:
-	list_del(&stream->link);
 err_flags:
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	}
 
 	if (dev_priv->perf.ops.enable_metric_set) {
-		INIT_LIST_HEAD(&dev_priv->perf.streams);
 		mutex_init(&dev_priv->perf.lock);
 
 		oa_sample_rate_hard_limit = 1000 *
-- 
2.23.0

[Intel-gfx] [CI 11/13] drm/i915: add a new perf configuration execbuf parameter
We want the ability to dispatch a set of command buffers to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ execbuf3?)
to indicate which OA configuration should be used for a batch buffer.
This requires the process making the execbuf with this flag to also own
the perf fd at the time of execbuf.

v2: Add an emit_oa_config() vfunc in the intel_engine_cs (Chris)
    Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
    Move OA config vma to active before writing the ringbuffer (Chris)

v4: Reuse i915_user_extension_fn
    Serialize requests with OA config updates

v5: Check that the chained extension is only present once (Chris)
    Unpin oa_vma in main path (Chris)

v6: Use BIT_ULL (Chris)

v7: Hold drm.struct_mutex when serializing the request with OA config (Chris)

v8: Remove active request from engine (Lionel)

v9: Move fetching OA configuration past engine pinning (Lionel)
    Lock VMA before moving to active (Chris)

v10: Fix leak on perf_fd (Lionel)

Signed-off-by: Lionel Landwerlin
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 147 +-
 drivers/gpu/drm/i915/i915_getparam.c       |   4 +
 include/uapi/drm/i915_drm.h                |  39 +
 3 files changed, 188 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 46ad8d9642d1..d416b60c94bb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
@@ -284,7 +285,12 @@ struct i915_execbuffer {
 	struct {
 		u64 flags; /** Available extensions parameters */
 		struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
+		struct drm_i915_gem_execbuffer_ext_perf perf_config;
 	} extensions;
+
+	struct file *perf_file;
+	struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */
+	struct i915_vma *oa_vma;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 	return err;
 }
 
+
+static int
+eb_get_oa_config(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_object *oa_bo;
+	int err = 0;
+
+	eb->perf_file = NULL;
+	eb->oa_config = NULL;
+	eb->oa_vma = NULL;
+
+	if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) == 0)
+		return 0;
+
+	eb->perf_file = fget(eb->extensions.perf_config.perf_fd);
+	if (!eb->perf_file)
+		return -EINVAL;
+
+	err = i915_mutex_lock_interruptible(&eb->i915->drm);
+	if (err)
+		return err;
+
+	if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream)
+		err = -EINVAL;
+
+	mutex_unlock(&eb->i915->drm.struct_mutex);
+
+	if (err)
+		return err;
+
+	if (eb->i915->perf.exclusive_stream->engine != eb->engine)
+		return -EINVAL;
+
+	err = i915_perf_get_oa_config_and_bo(
+		eb->i915->perf.exclusive_stream,
+		eb->extensions.perf_config.oa_config,
+		&eb->oa_config, &oa_bo);
+	if (err)
+		return err;
+
+	eb->oa_vma = i915_vma_instance(oa_bo,
+				       &eb->engine->gt->ggtt->vm, NULL);
+	i915_gem_object_put(oa_bo);
+	if (IS_ERR(eb->oa_vma)) {
+		err = PTR_ERR(eb->oa_vma);
+		eb->oa_vma = NULL;
+		return err;
+	}
+
+	return 0;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 			     struct i915_vma *vma,
 			     unsigned int len)
@@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file *file)
 	spin_unlock(&file_priv->mm.lock);
 }
 
+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+	struct i915_perf_stream *perf_stream;
+	int err;
+
+	if (!eb->oa_config)
+		return 0;
+
+	perf_stream = eb->perf_file->private_data;
+
+	err = mutex_lock_interruptible(&perf_stream->config_mutex);
+	if (err)
+		return err;
+
+	err = i915_active_request_set(&perf_stream->active_config_rq,
+				      eb->request);
+	if (err)
+		goto out;
+
+	/*
+	 * If the config hasn't changed, skip reconfiguring the HW (this is
+	 * subject to a delay we want to avoid has much as possible).
+	 */
+	if (eb->oa_config == perf_stream->oa_config)
+		goto out;
+
+	i915_vma_lock(eb->oa_vma);
+
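Illustration only, not part of the patch: the v5 and v6 changelog entries (each chained extension may appear only once, tracked with BIT_ULL) can be sketched as below. The structure, the sketch_* names and the extension number chosen for the perf extension are assumptions made for the example. The real chain lives in user memory and is walked through the i915_user_extension helpers, which this sketch replaces with plain pointers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BIT_ULL(n) (1ULL << (n))

/* Hypothetical extension ids; the first item is 1, as in the v10 note of 02/13. */
enum {
	SKETCH_EXT_TIMELINE_FENCES = 1,
	SKETCH_EXT_PERF = 2,	/* assumed value, not taken from the patch */
	SKETCH_EXT_MAX
};

/* Simplified stand-in for a chained user extension. */
struct sketch_ext {
	uint32_t name;
	const struct sketch_ext *next;
};

/*
 * Walk the chain, recording each extension in *flags. An unknown id or an
 * id that is already recorded makes the whole chain invalid.
 */
static int sketch_parse_extensions(const struct sketch_ext *ext, uint64_t *flags)
{
	*flags = 0;

	for (; ext; ext = ext->next) {
		if (ext->name == 0 || ext->name >= SKETCH_EXT_MAX)
			return -EINVAL;
		if (*flags & SKETCH_BIT_ULL(ext->name))
			return -EINVAL; /* extension present twice */
		*flags |= SKETCH_BIT_ULL(ext->name);
	}

	return 0;
}
```

The later test in eb_get_oa_config() against BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF) reads the same kind of mask that this loop builds.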
[Intel-gfx] [CI 02/13] drm/i915: add syncobj timeline support
Introduces new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
    dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
    https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
    drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)

Signed-off-by: Lionel Landwerlin
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 307 ++
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_getparam.c       |   1 +
 include/uapi/drm/i915_drm.h                |  39 +++
 4 files changed, 293 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4f5fd946ab28..46ad8d9642d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -214,6 +214,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
+struct i915_eb_fences {
+	struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+	struct dma_fence *dma_fence;
+	u64 value;
+	struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
 	struct drm_i915_private *i915; /** i915 backpointer */
 	struct drm_file *file; /** per-file lookup tables and limits */
@@ -276,6 +283,7 @@ struct i915_execbuffer {
 
 	struct {
 		u64 flags; /** Available extensions parameters */
+		struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
 	} extensions;
 };
 
@@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb,
 }
 
 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_eb_fences *fences, unsigned int n)
 {
-	while (n--)
-		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	while (n--) {
+		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+		dma_fence_put(fences[n].dma_fence);
+		kfree(fences[n].chain_fence);
+	}
 	kvfree(fences);
 }
 
-static struct drm_syncobj **
-get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-		struct drm_file *file)
+static struct i915_eb_fences *
+get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences)
+{
+	struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences =
+		&eb->extensions.timeline_fences;
+	struct drm_i915_gem_exec_fence __user *user_fences;
+	struct i915_eb_fences *fences;
+	u64 __user *user_values;
+	u64 num_fences, num_user_fences = timeline_fences->fence_count;
+	unsigned long n;
+	int err;
+
+	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
+	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+	if (num_user_fences > min_t(unsigned long,
+				    ULONG_MAX / sizeof(*user_fences),
+				    SIZE_MAX / sizeof(*fences)))
+		return ERR_PTR(-EINVAL);
+
+	user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+	if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences)))
+		return ERR_PTR(-EFAULT);
+
+	user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+	if (!access_ok(user_values, num_user_fences * sizeof(*user_values)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(num_user_fences, sizeof(*fences),
+				__GFP_NOWARN | GFP_KERNEL);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+		     ~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+	for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) {
+		struct drm_i915_gem_exec_fence user_fence;
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence = NULL;
+		u64 point;
+
+		if (__copy_from_user(&user_fence, user_fences++, sizeof(user_fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+			err = -EINVAL;
+			goto err;
+		}
+
+		if (__get_user(point, user_values++)) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj
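Illustration only, not part of the patch: the multiplication overflow check at the top of get_timeline_fence_array() can be tried out in userspace with a sketch like the one below. The function name and the element sizes passed to it are hypothetical. The idea is the same as in the hunk above: bound the user-supplied count by the largest value whose product with either element size still fits, before any multiplication happens.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True when count * user_elem_size fits in unsigned long (the access_ok()
 * side) and count * kernel_elem_size fits in size_t (the allocation side).
 * Element sizes must be non-zero.
 */
static bool sketch_count_fits(uint64_t count, size_t user_elem_size,
			      size_t kernel_elem_size)
{
	uint64_t limit_user = ULONG_MAX / user_elem_size;
	uint64_t limit_kernel = SIZE_MAX / kernel_elem_size;
	uint64_t limit = limit_user < limit_kernel ? limit_user : limit_kernel;

	return count <= limit;
}
```

Dividing the maximum by the element size, instead of multiplying and testing for wraparound afterwards, keeps the check free of overflow itself.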
[Intel-gfx] [CI 07/13] drm/i915/perf: allow for CS OA configs to be created lazily
Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer is
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configurations without having to open/close the i915/perf
stream.

v2: No need for locking on object OA config object creation (Chris)
    Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
    (Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h              |   4 +-
 drivers/gpu/drm/i915/i915_perf.c             | 270 ---
 drivers/gpu/drm/i915/i915_perf.h             |  26 ++
 drivers/gpu/drm/i915/i915_perf_types.h       |  15 +-
 5 files changed, 273 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index fbad403ab7ac..b6373fbc927d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -135,6 +135,7 @@
 /* Gen11+. addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */
 #define MI_LRI_CS_MMIO (1<<19)
 #define MI_LRI_FORCE_POSTED (1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2)
 #define MI_SRM_LRM_GLOBAL_GTT (1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f4145ae6ab6e..7eb31923cde9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1363,8 +1363,8 @@ struct drm_i915_private {
 		struct mutex metrics_lock;
 
 		/*
-		 * List of dynamic configurations, you need to hold
-		 * dev_priv->perf.metrics_lock to access it.
+		 * List of dynamic configurations (struct i915_oa_config), you
+		 * need to hold dev_priv->perf.metrics_lock to access it.
 		 */
 		struct idr metrics_idr;
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 40a1ec2bc96b..c9d0de3050fb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -367,11 +367,19 @@ struct perf_open_properties {
 	struct intel_engine_cs *engine;
 };
 
+struct i915_oa_config_bo {
+	struct list_head link;
+
+	struct i915_oa_config *oa_config;
+	struct drm_i915_gem_object *bo;
+};
+
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-			   struct i915_oa_config *oa_config)
+void i915_oa_config_release(struct kref *ref)
 {
+	struct i915_oa_config *oa_config = container_of(ref, typeof(*oa_config), ref);
+
 	if (!PTR_ERR(oa_config->flex_regs))
 		kfree(oa_config->flex_regs);
 	if (!PTR_ERR(oa_config->b_counter_regs))
@@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
 	kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
-			  struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
 {
-	if (!atomic_dec_and_test(&oa_config->ref_count))
-		return;
+	u32 i;
+
+	for (i = 0; i < n_regs; i++) {
+		if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+			u32 n_lri = min(n_regs - i,
+					(u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
 
-	free_oa_config(dev_priv, oa_config);
+			*cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+		}
+		*cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+		*cs++ = reg_data[i].value;
+	}
+
+	return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-			 int metrics_set,
-			 struct i915_oa_config **out_config)
+static struct i915_oa_config_bo* alloc_oa_config_buffer(struct drm_i915_private *i915,
+							 struct i915_oa_config *oa_config)
 {
-	int ret;
+	struct i915_oa_config_bo *oa_bo;
+	size_t config_length = 0;
+	u32 *cs;
+	int err;
+
+	oa_bo = kzalloc(sizeof(*oa_bo),
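Illustration only, not part of the patch: a standalone sketch of the chunking done by write_cs_mi_lri(), useful for checking how many dwords a register list needs. The limit of 126 registers per command mirrors MI_LOAD_REGISTER_IMM_MAX_REGS from the hunk above. All sketch_* names are hypothetical, and the header encoding is a placeholder rather than the real MI_LOAD_REGISTER_IMM bit layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors MI_LOAD_REGISTER_IMM_MAX_REGS from the patch above. */
#define SKETCH_LRI_MAX_REGS 126u

struct sketch_reg {
	uint32_t addr;
	uint32_t value;
};

/* Placeholder header: NOT the hardware encoding, only marks a chunk start. */
static uint32_t sketch_lri_header(uint32_t n_regs)
{
	return 0x80000000u | n_regs;
}

/* One header dword per chunk of up to 126 registers, two dwords per register. */
static size_t sketch_lri_dwords(uint32_t n_regs)
{
	uint32_t n_headers = n_regs / SKETCH_LRI_MAX_REGS +
			     (n_regs % SKETCH_LRI_MAX_REGS != 0);

	return (size_t)n_headers + 2 * (size_t)n_regs;
}

/* Same loop shape as write_cs_mi_lri(): emit a header every 126 registers. */
static uint32_t *sketch_write_lri(uint32_t *cs, const struct sketch_reg *regs,
				  uint32_t n_regs)
{
	uint32_t i;

	for (i = 0; i < n_regs; i++) {
		if (i % SKETCH_LRI_MAX_REGS == 0) {
			uint32_t left = n_regs - i;
			uint32_t n = left < SKETCH_LRI_MAX_REGS ?
				     left : SKETCH_LRI_MAX_REGS;

			*cs++ = sketch_lri_header(n);
		}
		*cs++ = regs[i].addr;
		*cs++ = regs[i].value;
	}

	return cs;
}
```

With 127 registers the list splits into one chunk of 126 and one chunk of 1, so the buffer needs 2 header dwords plus 254 payload dwords.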
[Intel-gfx] [CI 04/13] drm/i915/perf: store the associated engine of a stream
We'll use this information later to verify that a client trying to
reconfigure the stream does so on the right engine.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h  | 5 +
 drivers/gpu/drm/i915/i915_perf.c | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 75607450ba00..274a1193d4f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1088,6 +1088,11 @@ struct i915_perf_stream {
 	 */
 	intel_wakeref_t wakeref;
 
+	/**
+	 * @engine: Engine associated with this performance stream.
+	 */
+	struct intel_engine_cs *engine;
+
 	/**
 	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 	 * properties given when opening a stream, representing the contents
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index d18cd332afb7..9d5a3522aa35 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -363,6 +363,8 @@ struct perf_open_properties {
 	int oa_format;
 	bool oa_periodic;
 	int oa_period_exponent;
+
+	struct intel_engine_cs *engine;
 };
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
@@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 
 	format_size = dev_priv->perf.oa_formats[props->oa_format].size;
 
+	stream->engine = props->engine;
+
 	stream->sample_flags |= SAMPLE_OA_REPORT;
 	stream->sample_size += format_size;
 
@@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 		return -EINVAL;
 	}
 
+	/* At the moment we only support using i915-perf on the RCS. */
+	props->engine = dev_priv->engine[RCS0];
+
 	/* Considering that ID = 0 is reserved and assuming that we don't
 	 * (currently) expect any configurations to ever specify duplicate
 	 * values for a particular property ID then the last _PROP_MAX value is
-- 
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
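For illustration only, the kind of check the commit message alludes to ("verify that a client trying to reconfigure the stream does so on the right engine") could look roughly like the sketch below. The types and the helper name check_stream_engine() are hypothetical stand-ins, not the i915 structures or any function from this series.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only what the check needs. */
struct engine {
	int id;
};

struct perf_stream {
	const struct engine *engine; /* engine recorded when the stream was opened */
};

/*
 * Return 0 when the request targets the engine the stream was opened on,
 * -EINVAL otherwise (including when any pointer is missing).
 */
static int check_stream_engine(const struct perf_stream *stream,
			       const struct engine *requested)
{
	if (!stream || !stream->engine || !requested)
		return -EINVAL;

	return stream->engine == requested ? 0 : -EINVAL;
}
```

Comparing pointers rather than ids assumes each engine has exactly one descriptor, which is how the sketch models it.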
[Intel-gfx] [CI 08/13] drm/i915/perf: implement active wait for noa configurations
NOA configuration take some amount of time to apply. That amount of time depends on the size of the GT. There is no documented time for this. For example, past experimentations with powergating configuration changes seem to indicate a 60~70us delay. We go with 500us as default for now which should be over the required amount of time (according to HW architects). v2: Don't forget to save/restore registers used for the wait (Chris) v3: Name used CS_GPR registers (Chris) Fix compile issue due to rebase (Lionel) v4: Fix save/restore helpers (Umesh) v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel) v6: Add missing struct declarations in i915_perf.h Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson (v4) --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 24 ++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 5 + drivers/gpu/drm/i915/i915_debugfs.c | 31 +++ drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 234 ++- drivers/gpu/drm/i915/i915_perf_types.h | 6 + drivers/gpu/drm/i915/i915_reg.h | 4 +- 7 files changed, 302 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index b6373fbc927d..e8ce44841868 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -160,6 +160,7 @@ #define MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */ #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1) #define MI_BATCH_RESOURCE_STREAMER (1<<10) +#define MI_BATCH_PREDICATE (1 << 15) /* HSW+ on RCS only*/ /* * 3D instructions used by the kernel @@ -238,6 +239,29 @@ #define PIPE_CONTROL_DEPTH_CACHE_FLUSH (1<<0) #define PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */ +#define MI_MATH(x) MI_INSTR(0x1a, (x)-1) +#define MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2)) +/* operands */ +#define MI_ALU_OP_NOOP 0 +#define MI_ALU_OP_LOAD 128 +#define MI_ALU_OP_LOADINV 1152 +#define MI_ALU_OP_LOAD0129 
+#define MI_ALU_OP_LOAD11153 +#define MI_ALU_OP_ADD 256 +#define MI_ALU_OP_SUB 257 +#define MI_ALU_OP_AND 258 +#define MI_ALU_OP_OR 259 +#define MI_ALU_OP_XOR 260 +#define MI_ALU_OP_STORE384 +#define MI_ALU_OP_STOREINV 1408 +/* sources */ +#define MI_ALU_SRC_REG(x) (x) /* 0 -> 15 */ +#define MI_ALU_SRC_SRCA32 +#define MI_ALU_SRC_SRCB33 +#define MI_ALU_SRC_ACCU49 +#define MI_ALU_SRC_ZF 50 +#define MI_ALU_SRC_CF 51 + /* * Commands used only by the command parser */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index dc295c196d11..f752b6cf9ea1 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -97,6 +97,11 @@ enum intel_gt_scratch_field { /* 8 bytes */ INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256, + /* 6 * 8 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048, + + /* 4 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096, }; #endif /* __INTEL_GT_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 708855e051b5..cc17d5c2295f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3578,6 +3578,36 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops, i915_wedged_get, i915_wedged_set, "%llu\n"); +static int +i915_perf_noa_delay_set(void *data, u64 val) +{ + struct drm_i915_private *i915 = data; + + /* This would lead to infinite waits as we're doing timestamp +* difference on the CS with only 32bits. 
+*/ + if (val > mul_u32_u32(U32_MAX, RUNTIME_INFO(i915)->cs_timestamp_frequency_khz)) + return -EINVAL; + + atomic64_set(>perf.noa_programming_delay, val); + return 0; +} + +static int +i915_perf_noa_delay_get(void *data, u64 *val) +{ + struct drm_i915_private *i915 = data; + + *val = atomic64_read(>perf.noa_programming_delay); + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops, + i915_perf_noa_delay_get, + i915_perf_noa_delay_set, + "%llu\n"); + + #define DROP_UNBOUND BIT(0) #define DROP_BOUND BIT(1) #define DROP_RETIREBIT(2) @@ -4354,6 +4384,7 @@ static const struct i915_debugfs_files { const char *name; const struct file_operations *fops; } i915_debugfs_files[] = { + {"i915_perf_noa_delay", _perf_noa_delay_fops}, {"i915_wedged", _wedged_fops}, {"i915_cache_sharing", _cache_sharing_fops}, {"i915_gem_drop_caches", _drop_caches_fops}, diff --git
[Intel-gfx] [CI 10/13] drm/i915/perf: execute OA configuration from command stream
We haven't run into issues with programming the global OA/NOA registers configuration from CPU so far, but HW engineers actually recommend doing this from the command streamer. On TGL in particular one of the clock domain in which some of that programming goes might not be powered when we poke things from the CPU. Since we have a command buffer prepared for the execbuffer side of things, we can reuse that approach here too. This also allows us to significantly reduce the amount of time we hold the main lock. v2: Drop the global lock as much as possible v3: Take global lock to pin global v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel) v5: Move locking to the stream (Lionel) v6: Move active reconfiguration request into i915_perf_stream (Lionel) v7: Pin VMA outside request creation (Chris) Lock VMA before move to active (Chris) v8: Fix double free on stream->initial_oa_config_bo (Lionel) Don't allow interruption when waiting on active config request (Lionel) Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 170 - drivers/gpu/drm/i915/i915_perf_types.h | 15 ++- 2 files changed, 124 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index f2b778d84b52..8e3532518139 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1558,18 +1558,23 @@ free_oa_configs(struct i915_perf_stream *stream) static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; + int err; BUG_ON(stream != dev_priv->perf.exclusive_stream); - /* -* Unset exclusive_stream first, it will be checked while disabling -* the metric set on gen8+. 
-*/ mutex_lock(_priv->drm.struct_mutex); - dev_priv->perf.exclusive_stream = NULL; + mutex_lock(>config_mutex); dev_priv->perf.ops.disable_metric_set(stream); + err = i915_active_request_retire(>active_config_rq, 0, +>config_mutex); + mutex_unlock(>config_mutex); + dev_priv->perf.exclusive_stream = NULL; mutex_unlock(_priv->drm.struct_mutex); + if (err) + DRM_ERROR("Failed to disable perf stream\n"); + + free_oa_buffer(stream); free_noa_wait(stream); @@ -1795,6 +1800,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return PTR_ERR(bo); } + ret = i915_mutex_lock_interruptible(>drm); + if (ret) + goto err_unref; + /* * We pin in GGTT because we jump into this buffer now because * multiple OA config BOs will have a jump to this address and it @@ -1802,10 +1811,13 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0); if (IS_ERR(vma)) { + mutex_unlock(>drm.struct_mutex); ret = PTR_ERR(vma); goto err_unref; } + mutex_unlock(>drm.struct_mutex); + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); if (IS_ERR(batch)) { ret = PTR_ERR(batch); @@ -1939,7 +1951,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return 0; err_unpin: - __i915_vma_unpin(vma); + mutex_lock(>drm.struct_mutex); + i915_vma_unpin_and_release(, 0); + mutex_unlock(>drm.struct_mutex); err_unref: i915_gem_object_put(bo); @@ -1947,50 +1961,73 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return ret; } -static void config_oa_regs(struct drm_i915_private *dev_priv, - const struct i915_oa_reg *regs, - u32 n_regs) +static int emit_oa_config(struct drm_i915_private *i915, + struct i915_perf_stream *stream) { - u32 i; + struct i915_request *rq; + struct i915_vma *vma; + u32 *cs; + int err; - for (i = 0; i < n_regs; i++) { - const struct i915_oa_reg *reg = regs + i; + lockdep_assert_held(>config_mutex); + + vma = i915_vma_instance(stream->initial_oa_config_bo, + >engine->gt->ggtt->vm, NULL); + if 
(unlikely(IS_ERR(vma))) + return PTR_ERR(vma); - I915_WRITE(reg->addr, reg->value); + err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); + if (err) + goto err_vma_unpin; + + rq = i915_request_create(stream->engine->kernel_context); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_add_request; } -} -static void delay_after_mux(void) -{ - /* -* It apparently takes a fairly long time for a new MUX -* configuration to be be applied after these register writes. -
[Intel-gfx] [CI 01/13] drm/i915: introduce a mechanism to extend execbuf2
We're planning to use this for a couple of new feature where we need to provide additional parameters to execbuf. v2: Check for invalid flags in execbuffer2 (Lionel) v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson (v1) --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 39 ++- include/uapi/drm/i915_drm.h | 26 +++-- 2 files changed, 61 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 27dbcb508055..4f5fd946ab28 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -25,6 +25,7 @@ #include "i915_gem_context.h" #include "i915_gem_ioctls.h" #include "i915_trace.h" +#include "i915_user_extensions.h" enum { FORCE_CPU_RELOC = 1, @@ -272,6 +273,10 @@ struct i915_execbuffer { */ int lut_size; struct hlist_head *buckets; /** ht for relocation handles */ + + struct { + u64 flags; /** Available extensions parameters */ + } extensions; }; #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags]) @@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec) return false; /* Kernel clipping was a DRI1 misfeature */ - if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) { + if (!(exec->flags & (I915_EXEC_FENCE_ARRAY | +I915_EXEC_USE_EXTENSIONS))) { if (exec->num_cliprects || exec->cliprects_ptr) return false; } @@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb, } } +static const i915_user_extension_fn execbuf_extensions[] = { +}; + +static int +parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args, + struct i915_execbuffer *eb) +{ + eb->extensions.flags = 0; + + if (!(args->flags & I915_EXEC_USE_EXTENSIONS)) + return 0; + + /* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot +* have another flag also using it at the same time. 
+*/ + if (eb->args->flags & I915_EXEC_FENCE_ARRAY) + return -EINVAL; + + if (args->num_cliprects != 0) + return -EINVAL; + + return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr), + execbuf_extensions, + ARRAY_SIZE(execbuf_extensions), + eb); +} + static int i915_gem_do_execbuffer(struct drm_device *dev, struct drm_file *file, @@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (args->flags & I915_EXEC_IS_PINNED) eb.batch_flags |= I915_DISPATCH_PINNED; + err = parse_execbuf2_extensions(args, ); + if (err) + return err; + if (args->flags & I915_EXEC_FENCE_IN) { in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2)); if (!in_fence) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 469dc512cca3..0a99c26730e1 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence { __u32 flags; }; +enum drm_i915_gem_execbuffer_ext { + DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */ +}; + struct drm_i915_gem_execbuffer2 { /** * List of gem_exec_object2 structs @@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 { __u32 num_cliprects; /** * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY -* is not set. If I915_EXEC_FENCE_ARRAY is set, then this is a -* struct drm_i915_gem_exec_fence *fences. +* & I915_EXEC_USE_EXTENSIONS are not set. +* +* If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array +* of struct drm_i915_gem_exec_fence and num_cliprects is the length +* of the array. +* +* If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a +* single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is +* 0. 
*/ __u64 cliprects_ptr; #define I915_EXEC_RING_MASK (0x3f) @@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 { */ #define I915_EXEC_FENCE_SUBMIT (1 << 20) -#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1)) +/* + * Setting I915_EXEC_USE_EXTENSIONS implies that + * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to an linked + * list of i915_user_extension. Each i915_user_extension node is the base of a + * larger structure. The list of supported structures are listed in the + * drm_i915_gem_execbuffer_ext enum. + */ +#define
[Intel-gfx] [CI 13/13] drm/i915: add support for perf configuration queries
Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 6 + drivers/gpu/drm/i915/i915_perf.c | 3 + drivers/gpu/drm/i915/i915_query.c | 283 ++ include/uapi/drm/i915_drm.h | 65 ++- 4 files changed, 354 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2c6f37219dff..eab42269fc5b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1368,6 +1368,12 @@ struct drm_i915_private { */ struct idr metrics_idr; + /* +* Number of dynamic configurations, you need to hold +* dev_priv->perf.metrics_lock to access it. +*/ + u32 n_metrics; + /* * Lock associated with anything below within this structure * except exclusive_stream. 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 7adc518912bb..372cdf2e7ec8 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3915,6 +3915,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, goto sysfs_err; } + dev_priv->perf.n_metrics++; + mutex_unlock(_priv->perf.metrics_lock); DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id); @@ -3975,6 +3977,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, _config->sysfs_metric); idr_remove(_priv->perf.metrics_idr, *arg); + dev_priv->perf.n_metrics--; mutex_unlock(_priv->perf.metrics_lock); diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index abac5042da2b..89b2821be4a0 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -7,6 +7,7 @@ #include #include "i915_drv.h" +#include "i915_perf.h" #include "i915_query.h" #include @@ -140,10 +141,292 @@ query_engine_info(struct drm_i915_private *i915, return len; } +static int can_copy_perf_config_registers_or_number(u32 user_n_regs, + u64 user_regs_ptr, + u32 kernel_n_regs) +{ + /* +* We'll just put the number of registers, and won't copy the +* register. 
+*/ + if (user_n_regs == 0) + return 0; + + if (user_n_regs < kernel_n_regs) + return -EINVAL; + + if (!access_ok(u64_to_user_ptr(user_regs_ptr), + 2 * sizeof(u32) * kernel_n_regs)) + return -EFAULT; + + return 0; +} + +static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs, + u32 kernel_n_regs, + u64 user_regs_ptr, + u32 *user_n_regs) +{ + u32 r; + + if (*user_n_regs == 0) { + *user_n_regs = kernel_n_regs; + return 0; + } + + *user_n_regs = kernel_n_regs; + + for (r = 0; r < kernel_n_regs; r++) { + u32 __user *user_reg_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2); + u32 __user *user_val_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 + + sizeof(u32)); + int ret; + + ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr), +user_reg_ptr); + if (ret) + return -EFAULT; + + ret = __put_user(kernel_regs[r].value, user_val_ptr); + if (ret) + return -EFAULT; + } + + return 0; +} + +static int query_perf_config_data(struct drm_i915_private *i915, + struct drm_i915_query_item *query_item, + bool use_uuid)
[Intel-gfx] [CI 09/13] drm/i915: add wait flags to i915_active_request_retire
An upcoming change needs to wait on an active request without being
interrupted, so let the caller pass the wait flags.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_active.c | 4 +++-
 drivers/gpu/drm/i915/i915_active.h | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 6a447f1d0110..c808c28c9464 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref)
 			break;
 		}
 
-		err = i915_active_request_retire(&it->base, BKL(ref));
+		err = i915_active_request_retire(&it->base,
+						 I915_WAIT_INTERRUPTIBLE,
+						 BKL(ref));
 		if (err)
 			break;
 	}
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index f95058f99057..35a6089b44fd 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request *active)
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
+			   unsigned int flags,
 			   struct mutex *mutex)
 {
 	struct i915_request *request;
@@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request *active,
 	if (!request)
 		return 0;
 
-	ret = i915_request_wait(request,
-				I915_WAIT_INTERRUPTIBLE,
-				MAX_SCHEDULE_TIMEOUT);
+	ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
 	if (ret < 0)
 		return ret;
 
-- 
2.23.0
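As a rough userspace model of what the new flags argument buys the caller: the same wait loop either bails out on a pending signal or runs to completion, depending on a caller-supplied bit. Everything here is hypothetical (WAIT_INTERRUPTIBLE, struct fake_request, wait_request()); it only mirrors the idea of I915_WAIT_INTERRUPTIBLE and uses -EINTR where the kernel would typically return -ERESTARTSYS.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag, modelled on the idea of I915_WAIT_INTERRUPTIBLE. */
#define WAIT_INTERRUPTIBLE (1u << 0)

/* Hypothetical request: completes after a fixed number of polling ticks. */
struct fake_request {
	int ticks_left;
};

/*
 * Wait for the request to complete. With WAIT_INTERRUPTIBLE set, a pending
 * signal aborts the wait; with flags == 0 the wait always runs to completion.
 */
static int wait_request(struct fake_request *rq, unsigned int flags,
			bool signal_pending)
{
	while (rq->ticks_left > 0) {
		if ((flags & WAIT_INTERRUPTIBLE) && signal_pending)
			return -EINTR; /* userspace stand-in for -ERESTARTSYS */
		rq->ticks_left--;
	}

	return 0;
}
```

A teardown path that must not leave work half retired would pass 0, while an ioctl path that can be restarted would pass WAIT_INTERRUPTIBLE.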
[Intel-gfx] [CI 12/13] drm/i915/perf: allow holding preemption on filtered ctx
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of a command buffer concept meant that a query's duration could span multiple command buffers. With that restriction gone in Vulkan, we would like to simplify measuring performance by just measuring the deltas between the counter snapshots written by 2 MI_REPORT_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, which uses 2 MI_REPORT_PERF_COUNT commands and then post-processes the stream of OA reports coming from the global OA buffer to remove any unrelated deltas in between the 2 MI_REPORT_PERF_COUNT commands. Disabling preemption only applies to the single context whose performance counters we want to query, and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting.
v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 + drivers/gpu/drm/i915/i915_perf.c | 31 +-- drivers/gpu/drm/i915/i915_perf_types.h| 8 + include/uapi/drm/i915_drm.h | 11 +++ 4 files changed, 55 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d416b60c94bb..33df58e681fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb) if (err) goto out; + /* +* If the perf stream was opened with hold preemption, flag the +* request properly so that the priority of the request is bumped once +* it reaches the execlist ports. +*/ + if (eb->i915->perf.exclusive_stream->hold_preemption) + eb->request->flags |= I915_REQUEST_NOPREEMPT; + /* * If the config hasn't changed, skip reconfiguring the HW (this is * subject to a delay we want to avoid has much as possible). 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 8e3532518139..7adc518912bb 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -343,6 +343,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = { * struct perf_open_properties - for validated properties given to open a stream * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags * @single_context: Whether a single or all gpu contexts should be monitored + * @hold_preemption: Whether the preemption is disabled for the filtered + * context * @ctx_handle: A gem ctx handle for use with @single_context * @metrics_set: An ID for an OA unit metric set advertised via sysfs * @oa_format: An OA unit HW report format @@ -357,6 +359,7 @@ struct perf_open_properties { u32 sample_flags; u64 single_context:1; + u64 hold_preemption:1; u64 ctx_handle; /* OA sampling state */ @@ -2632,6 +2635,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (WARN_ON(stream->oa_buffer.format_size == 0)) return -EINVAL; + stream->hold_preemption = props->hold_preemption; + stream->oa_buffer.format = dev_priv->perf.oa_formats[props->oa_format].format; @@ -3187,6 +3192,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, } } + if (props->hold_preemption) { + if (!props->single_context) { + DRM_DEBUG("preemption disable with no context\n"); + ret = -EINVAL; + goto err; + } + privileged_op = true; + } + /* * On Haswell the OA unit supports clock gating off for a specific * context and in this mode there's no visibility of metrics for the @@ -3201,8 +3215,9 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to * enable the OA unit by default. 
*/ - if (IS_HASWELL(dev_priv) && specific_ctx) + if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) { privileged_op = false; + } /* Similar to perf's kernel.perf_paranoid_cpu sysctl option * we check a
[Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting
And give up if we never even make it to the start.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
---
 tests/perf_pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index d392a67d4..8a06e5d44 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
 		while (!igt_spin_has_started(spin)) {
 			unsigned long t = igt_nsec_elapsed(&start);
 
+			igt_assert(gem_bo_busy(fd, spin->handle));
 			if ((t - timeout) > 250e6) {
 				timeout = t;
 				igt_warn("Spinner not running after %.2fms\n",
 					 (double)t / 1e6);
+				igt_assert(t < 2e9);
 			}
 		}
 	} else {
@@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
 		usleep(500e3); /* Better than nothing! */
 	}
 
+	igt_assert(gem_bo_busy(fd, spin->handle));
 	return igt_nsec_elapsed(&start);
 }
 
-- 
2.23.0
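The pattern in the patch above (poll for the start, check on every iteration that the workload is still alive, and give up after a hard limit) can be sketched outside IGT as follows. This is a simulation under stated assumptions: struct fake_spinner, wait_for_start() and the simulated clock are all hypothetical, and none of it uses the real igt_spin API.

```c
#include <stdbool.h>
#include <stdint.h>

enum wait_result { WAIT_STARTED, WAIT_DIED, WAIT_TIMED_OUT };

/* Hypothetical spinner model: it "starts" once simulated time reaches starts_at_ns. */
struct fake_spinner {
	bool started;
	bool busy;             /* false means the spinner was lost, e.g. after a reset */
	uint64_t starts_at_ns;
};

/*
 * Poll until the spinner reports it has started. On every iteration check
 * that it is still busy, and give up once limit_ns of simulated time passed.
 */
static enum wait_result wait_for_start(struct fake_spinner *spin,
				       uint64_t step_ns, uint64_t limit_ns)
{
	uint64_t t = 0;

	while (!spin->started) {
		if (!spin->busy)
			return WAIT_DIED;
		if (t >= limit_ns)
			return WAIT_TIMED_OUT;

		t += step_ns; /* stands in for re-reading the elapsed time */
		if (t >= spin->starts_at_ns)
			spin->started = true;
	}

	/* Final health check, mirroring the assert after the loop in the patch. */
	return spin->busy ? WAIT_STARTED : WAIT_DIED;
}
```

Returning a result code instead of asserting keeps the sketch testable; the IGT test asserts directly because a dead or stuck spinner is a test failure there.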
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
== Series Details == Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2 URL : https://patchwork.freedesktop.org/series/66418/ State : warning == Summary == $ dim checkpatch origin/drm-tip 66b565b57b3f drm/i915: introduce a mechanism to extend execbuf2 -:141: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV) #141: FILE: include/uapi/drm/i915_drm.h:1165: +#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_USE_EXTENSIONS<<1)) ^ total: 0 errors, 0 warnings, 1 checks, 113 lines checked 503c88dc3bc0 drm/i915: add syncobj timeline support -:25: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #25: https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html -:381: WARNING:TYPO_SPELLING: 'transfered' may be misspelled - perhaps 'transferred'? #381: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2616: +* The chain's ownership is transfered to the -:412: ERROR:CODE_INDENT: code indent should use tabs where possible #412: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2647: +[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,$ -:412: WARNING:LEADING_SPACE: please, no spaces at the start of a line #412: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2647: +[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,$ total: 1 errors, 3 warnings, 0 checks, 541 lines checked 66b65143aa4d drm/i915/perf: drop list of streams 8aca4673ec28 drm/i915/perf: store the associated engine of a stream 8db92539084e drm/i915/perf: introduce a versioning of the i915-perf uapi 8bb8be52ca97 drm/i915/perf: move perf types to their own header -:342: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating? 
#342: new file mode 100644 -:347: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1 #347: FILE: drivers/gpu/drm/i915/i915_perf_types.h:1: +/* -:348: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead #348: FILE: drivers/gpu/drm/i915/i915_perf_types.h:2: + * SPDX-License-Identifier: MIT total: 0 errors, 3 warnings, 0 checks, 648 lines checked da1c41cf2065 drm/i915/perf: allow for CS OA configs to be created lazily -:103: CHECK:SPACING: No space is necessary after a cast #103: FILE: drivers/gpu/drm/i915/i915_perf.c:399: + (u32) MI_LOAD_REGISTER_IMM_MAX_REGS); -:118: ERROR:POINTER_LOCATION: "foo* bar" should be "foo *bar" #118: FILE: drivers/gpu/drm/i915/i915_perf.c:410: +static struct i915_oa_config_bo* alloc_oa_config_buffer(struct drm_i915_private *i915, total: 1 errors, 0 warnings, 1 checks, 507 lines checked 4ef9530de87e drm/i915/perf: implement active wait for noa configurations -:43: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV) #43: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:242: +#define MI_MATH(x) MI_INSTR(0x1a, (x)-1) ^ -:122: CHECK:LINE_SPACING: Please don't use multiple blank lines #122: FILE: drivers/gpu/drm/i915/i915_debugfs.c:3610: + + -:181: CHECK:LINE_SPACING: Please don't use multiple blank lines #181: FILE: drivers/gpu/drm/i915/i915_perf.c:460: + -:234: CHECK:PREFER_KERNEL_TYPES: Prefer kernel type 'u32' over 'uint32_t' #234: FILE: drivers/gpu/drm/i915/i915_perf.c:1758: + uint32_t d; -:260: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #260: FILE: drivers/gpu/drm/i915/i915_perf.c:1784: + DIV64_U64_ROUND_UP( -:285: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided #285: FILE: drivers/gpu/drm/i915/i915_perf.c:1809: + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); -:293: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #293: FILE: drivers/gpu/drm/i915/i915_perf.c:1817: + cs = save_restore_register( -:297: 
CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #297: FILE: drivers/gpu/drm/i915/i915_perf.c:1821: + cs = save_restore_register( -:397: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #397: FILE: drivers/gpu/drm/i915/i915_perf.c:1921: + cs = save_restore_register( -:401: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #401: FILE: drivers/gpu/drm/i915/i915_perf.c:1925: + cs = save_restore_register( total: 0 errors, 0 warnings, 10 checks, 420 lines checked 0432b1e0d15d drm/i915: add wait flags to i915_active_request_retire 647d2458b7c3 drm/i915/perf: execute OA configuration from command stream -:66: CHECK:LINE_SPACING: Please don't use multiple blank lines #66: FILE: drivers/gpu/drm/i915/i915_perf.c:1577: + + total: 0 errors, 0 warnings, 1 checks, 311 lines checked 8a629929451c drm/i915: add a new perf configuration execbuf parameter -:27: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit
Re: [Intel-gfx] [igt-dev] [PATCH i-g-t 1/2] tools/l3_parity: Unnest exit handlers
On Sat, Sep 07, 2019 at 07:12:56PM +0100, Chris Wilson wrote: > The curse of using libigt where it is not wanted; in this case calling > drop-caches while we hold the forcewake is a recipe for a long wait. > > Signed-off-by: Chris Wilson For the series: Reviewed-by: Petri Latvala > --- > tools/intel_l3_parity.c | 50 - > 1 file changed, 29 insertions(+), 21 deletions(-) > > diff --git a/tools/intel_l3_parity.c b/tools/intel_l3_parity.c > index 06a185c91..340f94b1a 100644 > --- a/tools/intel_l3_parity.c > +++ b/tools/intel_l3_parity.c > @@ -180,6 +180,7 @@ int main(int argc, char *argv[]) > const char *path[REAL_MAX_SLICES] = {"l3_parity", "l3_parity_slice_1"}; > int row = 0, bank = 0, sbank = 0; > int fd[REAL_MAX_SLICES] = {0}, ret, i; > + int exitcode = EXIT_FAILURE; > int action = '0'; > int daemonize = 0; > int device, dir; > @@ -198,13 +199,13 @@ int main(int argc, char *argv[]) > fd[i] = openat(dir, path[i], O_RDWR); > if (fd[i] < 0) { > if (i == 0) /* at least one slice must be supported */ > - exit(77); > + goto skip; > continue; > } > > if (read(fd[i], l3logs[i], NUM_REGS * sizeof(uint32_t)) < 0) { > perror(path[i]); > - exit(77); > + goto skip; > } > assert(lseek(fd[i], 0, SEEK_SET) == 0); > } > @@ -252,45 +253,45 @@ int main(int argc, char *argv[]) > case '?': > case 'h': > usage(argv[0]); > - exit(EXIT_SUCCESS); > + goto success; > case 'H': > printf("Number of slices: %d\n", MAX_SLICES); > printf("Number of banks: %d\n", num_banks()); > printf("Subbanks per bank: %d\n", NUM_SUBBANKS); > printf("Max L3 size: %dK\n", L3_SIZE >> 10); > printf("Has error injection: %s\n", > IS_HASWELL(devid) ? 
"yes" : "no"); > - exit(EXIT_SUCCESS); > + goto success; > case 'r': > row = atoi(optarg); > if (row >= MAX_ROW) > - exit(EXIT_FAILURE); > + goto failure; > break; > case 'b': > bank = atoi(optarg); > if (bank >= num_banks() || bank >= > MAX_BANKS_PER_SLICE) > - exit(EXIT_FAILURE); > + goto failure; > break; > case 's': > sbank = atoi(optarg); > if (sbank >= NUM_SUBBANKS) > - exit(EXIT_FAILURE); > + goto failure; > break; > case 'w': > which_slice = atoi(optarg); > if (which_slice >= MAX_SLICES) > - exit(EXIT_FAILURE); > + goto failure; > break; > case 'i': > case 'u': > if (!IS_HASWELL(devid)) { > fprintf(stderr, "Error injection > supported on HSW+ only\n"); > - exit(EXIT_FAILURE); > + goto failure; > } > case 'd': > if (optarg) { > ret = sscanf(optarg, "%d,%d,%d", , > , ); > if (ret != 3) > - exit(EXIT_FAILURE); > + goto failure; > } > case 'a': > case 'l': > @@ -298,24 +299,24 @@ int main(int argc, char *argv[]) > case 'L': > if (action != '0') { > fprintf(stderr, "Only one action may be > specified\n"); > - exit(EXIT_FAILURE); > + goto failure; > } > action = c; > break; > default: > - abort(); > +
[Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66406/
State : failure

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/i915_perf.o
drivers/gpu/drm/i915/i915_perf.c: In function ‘i915_oa_stream_init’:
drivers/gpu/drm/i915/i915_perf.c:2703:3: error: ignoring return value of ‘i915_active_request_retire’, declared with attribute warn_unused_result [-Werror=unused-result]
   i915_active_request_retire(>active_config_rq, 0,
   ^~~~
         >config_mutex);
         ~~
cc1: all warnings being treated as errors
scripts/Makefile.build:280: recipe for target 'drivers/gpu/drm/i915/i915_perf.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_perf.o] Error 1
scripts/Makefile.build:497: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:497: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:497: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1083: recipe for target 'drivers' failed
make: *** [drivers] Error 2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 12/13] drm/i915/perf: allow holding preemption on filtered ctx
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of a command buffer concept meant that a query's duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify measuring performance to just computing the deltas between the counter snapshots written by 2 MI_REPORT_PERF_COUNT commands. This replaces the more complex scheme we currently have in the GL driver, which uses 2 MI_REPORT_PERF_COUNT commands and then post-processes the stream of OA reports coming from the global OA buffer to remove any unrelated deltas in between the 2 MI_REPORT_PERF_COUNT.

Disabling preemption only applies to the single context for which we want to query performance counters, and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting.
v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 + drivers/gpu/drm/i915/i915_perf.c | 31 +-- drivers/gpu/drm/i915/i915_perf_types.h| 8 + include/uapi/drm/i915_drm.h | 11 +++ 4 files changed, 55 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d416b60c94bb..33df58e681fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb) if (err) goto out; + /* +* If the perf stream was opened with hold preemption, flag the +* request properly so that the priority of the request is bumped once +* it reaches the execlist ports. +*/ + if (eb->i915->perf.exclusive_stream->hold_preemption) + eb->request->flags |= I915_REQUEST_NOPREEMPT; + /* * If the config hasn't changed, skip reconfiguring the HW (this is * subject to a delay we want to avoid has much as possible). 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index abbcf3ec654c..fd12318e7a90 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -343,6 +343,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = { * struct perf_open_properties - for validated properties given to open a stream * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags * @single_context: Whether a single or all gpu contexts should be monitored + * @hold_preemption: Whether the preemption is disabled for the filtered + * context * @ctx_handle: A gem ctx handle for use with @single_context * @metrics_set: An ID for an OA unit metric set advertised via sysfs * @oa_format: An OA unit HW report format @@ -357,6 +359,7 @@ struct perf_open_properties { u32 sample_flags; u64 single_context:1; + u64 hold_preemption:1; u64 ctx_handle; /* OA sampling state */ @@ -2632,6 +2635,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (WARN_ON(stream->oa_buffer.format_size == 0)) return -EINVAL; + stream->hold_preemption = props->hold_preemption; + stream->oa_buffer.format = dev_priv->perf.oa_formats[props->oa_format].format; @@ -3191,6 +3196,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, } } + if (props->hold_preemption) { + if (!props->single_context) { + DRM_DEBUG("preemption disable with no context\n"); + ret = -EINVAL; + goto err; + } + privileged_op = true; + } + /* * On Haswell the OA unit supports clock gating off for a specific * context and in this mode there's no visibility of metrics for the @@ -3205,8 +3219,9 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to * enable the OA unit by default. 
*/ - if (IS_HASWELL(dev_priv) && specific_ctx) + if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) { privileged_op = false; + } /* Similar to perf's kernel.perf_paranoid_cpu sysctl option * we check a
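To illustrate the simplified scheme described in the commit message above: once both counter snapshots are taken inside one command buffer with preemption held, a query result reduces to a per-counter subtraction. The following is a minimal sketch, assuming 32-bit counters laid out as a plain array. `struct snapshot`, `N_COUNTERS`, `counter_delta()` and `query_deltas()` are hypothetical names for illustration and do not model the real OA report format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot: a plain array of 32-bit counters. The real OA
 * report layout depends on the selected OA format and is not modelled. */
#define N_COUNTERS 4

struct snapshot {
	uint32_t counters[N_COUNTERS];
};

/* Delta of one counter. Unsigned subtraction is modulo 2^32, so a single
 * wraparound between the two snapshots still gives the right result. */
static uint32_t counter_delta(uint32_t begin, uint32_t end)
{
	return end - begin;
}

/* Per-counter deltas between the begin and end snapshots of one query. */
static void query_deltas(const struct snapshot *begin,
			 const struct snapshot *end,
			 uint64_t out[N_COUNTERS])
{
	for (size_t i = 0; i < N_COUNTERS; i++)
		out[i] = counter_delta(begin->counters[i], end->counters[i]);
}
```

The sketch only holds if nothing else runs between the two snapshots, which is the reason the patch asks for preemption to be held on the filtered context.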
[Intel-gfx] [CI 08/13] drm/i915/perf: implement active wait for noa configurations
NOA configuration takes some amount of time to apply. That amount of time
depends on the size of the GT. There is no documented time for this. For
example, past experiments with powergating configuration changes seem to
indicate a 60~70us delay. We go with 500us as the default for now, which
should be over the required amount of time (according to HW architects).

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
    Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson (v4)
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  24 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h     |   5 +
 drivers/gpu/drm/i915/i915_debugfs.c          |  31 +++
 drivers/gpu/drm/i915/i915_drv.h              |   2 +
 drivers/gpu/drm/i915/i915_perf.c             | 234 ++-
 drivers/gpu/drm/i915/i915_perf_types.h       |   6 +
 drivers/gpu/drm/i915/i915_reg.h              |   4 +-
 7 files changed, 302 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index b6373fbc927d..e8ce44841868 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -160,6 +160,7 @@
 #define   MI_BATCH_GTT		    (2<<6) /* aliased with (1<<7) on gen4 */
 #define MI_BATCH_BUFFER_START_GEN8	MI_INSTR(0x31, 1)
 #define   MI_BATCH_RESOURCE_STREAMER (1<<10)
+#define   MI_BATCH_PREDICATE         (1 << 15) /* HSW+ on RCS only*/
 
 /*
  * 3D instructions used by the kernel
@@ -238,6 +239,29 @@
 #define   PIPE_CONTROL_DEPTH_CACHE_FLUSH		(1<<0)
 #define   PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
 
+#define MI_MATH(x) MI_INSTR(0x1a, (x)-1)
+#define MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2))
+/* operands */
+#define   MI_ALU_OP_NOOP     0
+#define   MI_ALU_OP_LOAD     128
+#define   MI_ALU_OP_LOADINV  1152
+#define   MI_ALU_OP_LOAD0    129
+#define   MI_ALU_OP_LOAD1    1153
+#define   MI_ALU_OP_ADD      256
+#define   MI_ALU_OP_SUB      257
+#define   MI_ALU_OP_AND      258
+#define   MI_ALU_OP_OR       259
+#define   MI_ALU_OP_XOR      260
+#define   MI_ALU_OP_STORE    384
+#define   MI_ALU_OP_STOREINV 1408
+/* sources */
+#define   MI_ALU_SRC_REG(x)  (x) /* 0 -> 15 */
+#define   MI_ALU_SRC_SRCA    32
+#define   MI_ALU_SRC_SRCB    33
+#define   MI_ALU_SRC_ACCU    49
+#define   MI_ALU_SRC_ZF      50
+#define   MI_ALU_SRC_CF      51
+
 /*
  * Commands used only by the command parser
  */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index dc295c196d11..f752b6cf9ea1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -97,6 +97,11 @@ enum intel_gt_scratch_field {
 
 	/* 8 bytes */
 	INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
+	/* 6 * 8 bytes */
+	INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048,
+
+	/* 4 bytes */
+	INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096,
 };
 
 #endif /* __INTEL_GT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 708855e051b5..cc17d5c2295f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3578,6 +3578,36 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
 			i915_wedged_get, i915_wedged_set,
 			"%llu\n");
 
+static int
+i915_perf_noa_delay_set(void *data, u64 val)
+{
+	struct drm_i915_private *i915 = data;
+
+	/* This would lead to infinite waits as we're doing timestamp
+	 * difference on the CS with only 32bits.
+	 */
+	if (val > mul_u32_u32(U32_MAX, RUNTIME_INFO(i915)->cs_timestamp_frequency_khz))
+		return -EINVAL;
+
+	atomic64_set(&i915->perf.noa_programming_delay, val);
+	return 0;
+}
+
+static int
+i915_perf_noa_delay_get(void *data, u64 *val)
+{
+	struct drm_i915_private *i915 = data;
+
+	*val = atomic64_read(&i915->perf.noa_programming_delay);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops,
+			i915_perf_noa_delay_get,
+			i915_perf_noa_delay_set,
+			"%llu\n");
+
+
 #define DROP_UNBOUND	BIT(0)
 #define DROP_BOUND	BIT(1)
 #define DROP_RETIRE	BIT(2)
@@ -4354,6 +4384,7 @@ static const struct i915_debugfs_files {
 	const char *name;
 	const struct file_operations *fops;
 } i915_debugfs_files[] = {
+	{"i915_perf_noa_delay", &i915_perf_noa_delay_fops},
 	{"i915_wedged", &i915_wedged_fops},
 	{"i915_cache_sharing", &i915_cache_sharing_fops},
 	{"i915_gem_drop_caches", &i915_drop_caches_fops},
diff --git
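The comment in the debugfs setter above explains why very large delays must be rejected: the wait is a timestamp difference computed with only 32 bits, so a delay that needs more ticks than that would never complete. A minimal sketch of that reasoning follows, assuming the delay is given in nanoseconds and the timestamp frequency in kHz. The helper names and the 12000 kHz frequency in the usage note are assumptions for illustration, not the driver's exact check.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ull

/* Convert a delay in nanoseconds to timestamp ticks, rounding up.
 * freq_khz is the timestamp frequency in kHz, i.e. ticks per millisecond. */
static uint64_t delay_ns_to_ticks(uint64_t delay_ns, uint32_t freq_khz)
{
	return (delay_ns * freq_khz + NSEC_PER_MSEC - 1) / NSEC_PER_MSEC;
}

/* A wait built on a 32-bit timestamp difference can only express delays
 * whose tick count fits in 32 bits. */
static bool delay_is_representable(uint64_t delay_ns, uint32_t freq_khz)
{
	/* Treat an unknown frequency as invalid in this sketch. */
	if (freq_khz == 0)
		return false;

	/* Guard the 64-bit multiplication in delay_ns_to_ticks(). */
	if (delay_ns > (UINT64_MAX - (NSEC_PER_MSEC - 1)) / freq_khz)
		return false;

	return delay_ns_to_ticks(delay_ns, freq_khz) <= UINT32_MAX;
}
```

With an assumed 12000 kHz timestamp, the 500us default works out to 6000 ticks, far below the 32-bit limit, while a delay of 400 seconds would not fit.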
[Intel-gfx] [CI 11/13] drm/i915: add a new perf configuration execbuf parameter
We want the ability to dispatch a set of command buffers to the hardware, each with a different OA configuration. To achieve this, we reuse a couple of fields from the execbuf2 struct (I CAN HAZ execbuf3?) to indicate which OA configuration should be used for a batch buffer. This requires the process making the execbuf with this flag to also own the perf fd at the time of execbuf. v2: Add an emit_oa_config() vfunc in the intel_engine_cs (Chris) Move oa_config vma to active (Chris) v3: Don't drop the lock for engine lookup (Chris) Move OA config vma to active before writing the ringbuffer (Chris) v4: Reuse i915_user_extension_fn Serialize requests with OA config updates v5: Check that the chained extension is only present once (Chris) Unpin oa_vma in main path (Chris) v6: Use BIT_ULL (Chris) v7: Hold drm.struct_mutex when serializing the request with OA config (Chris) v8: Remove active request from engine (Lionel) v9: Move fetching OA configuration past engine pinning (Lionel) Lock VMA before moving to active (Chris) v10: Fix leak on perf_fd (Lionel) Signed-off-by: Lionel Landwerlin --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 147 +- drivers/gpu/drm/i915/i915_getparam.c | 4 + include/uapi/drm/i915_drm.h | 39 + 3 files changed, 188 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 46ad8d9642d1..d416b60c94bb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -24,6 +24,7 @@ #include "i915_gem_clflush.h" #include "i915_gem_context.h" #include "i915_gem_ioctls.h" +#include "i915_perf.h" #include "i915_trace.h" #include "i915_user_extensions.h" @@ -284,7 +285,12 @@ struct i915_execbuffer { struct { u64 flags; /** Available extensions parameters */ struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences; + struct drm_i915_gem_execbuffer_ext_perf perf_config; } extensions; + + struct file *perf_file; + 
struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */ + struct i915_vma *oa_vma; }; #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags]) @@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma) return err; } + +static int +eb_get_oa_config(struct i915_execbuffer *eb) +{ + struct drm_i915_gem_object *oa_bo; + int err = 0; + + eb->perf_file = NULL; + eb->oa_config = NULL; + eb->oa_vma = NULL; + + if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) == 0) + return 0; + + eb->perf_file = fget(eb->extensions.perf_config.perf_fd); + if (!eb->perf_file) + return -EINVAL; + + err = i915_mutex_lock_interruptible(>i915->drm); + if (err) + return err; + + if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream) + err = -EINVAL; + + mutex_unlock(>i915->drm.struct_mutex); + + if (err) + return err; + + if (eb->i915->perf.exclusive_stream->engine != eb->engine) + return -EINVAL; + + err = i915_perf_get_oa_config_and_bo( + eb->i915->perf.exclusive_stream, + eb->extensions.perf_config.oa_config, + >oa_config, _bo); + if (err) + return err; + + eb->oa_vma = i915_vma_instance(oa_bo, + >engine->gt->ggtt->vm, NULL); + i915_gem_object_put(oa_bo); + if (IS_ERR(eb->oa_vma)) { + err = PTR_ERR(eb->oa_vma); + eb->oa_vma = NULL; + return err; + } + + return 0; +} + static int __reloc_gpu_alloc(struct i915_execbuffer *eb, struct i915_vma *vma, unsigned int len) @@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file *file) spin_unlock(_priv->mm.lock); } +static int eb_oa_config(struct i915_execbuffer *eb) +{ + struct i915_perf_stream *perf_stream; + int err; + + if (!eb->oa_config) + return 0; + + perf_stream = eb->perf_file->private_data; + + err = mutex_lock_interruptible(_stream->config_mutex); + if (err) + return err; + + err = i915_active_request_set(_stream->active_config_rq, + eb->request); + if (err) + goto out; + + /* +* If the config 
hasn't changed, skip reconfiguring the HW (this is +* subject to a delay we want to avoid has much as possible). +*/ + if (eb->oa_config == perf_stream->oa_config) + goto out; + + i915_vma_lock(eb->oa_vma); +
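The v5 and v6 changelog entries above mention checking that a chained extension appears only once and using BIT_ULL for the flags. One way to picture that check is a 64-bit mask with one bit per extension, as sketched below. The extension identifiers, `struct ext_state` and both helpers are hypothetical and only mirror the idea, not the execbuf parsing code itself.

```c
#include <errno.h>
#include <stdint.h>

#define BIT_ULL(n) (1ull << (n))

/* Hypothetical extension identifiers for this sketch. */
enum {
	EXT_TIMELINE_FENCES = 0,
	EXT_PERF = 1,
	EXT_COUNT
};

struct ext_state {
	uint64_t seen;	/* one bit per extension already parsed */
};

/* Record an extension while walking the chain. Unknown identifiers and
 * repeated extensions are both rejected. */
static int ext_mark_seen(struct ext_state *st, unsigned int id)
{
	if (id >= EXT_COUNT)
		return -EINVAL;

	if (st->seen & BIT_ULL(id))
		return -EINVAL;

	st->seen |= BIT_ULL(id);
	return 0;
}

/* Later stages test the bit to decide whether the extension applies. */
static int ext_is_set(const struct ext_state *st, unsigned int id)
{
	return id < EXT_COUNT && (st->seen & BIT_ULL(id)) != 0;
}
```

The same mask serves two purposes: duplicate detection while parsing, and a cheap "was this extension supplied" test afterwards, which is how eb_get_oa_config() above decides to return early.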
[Intel-gfx] [CI 13/13] drm/i915: add support for perf configuration queries
Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 6 + drivers/gpu/drm/i915/i915_perf.c | 3 + drivers/gpu/drm/i915/i915_query.c | 283 ++ include/uapi/drm/i915_drm.h | 65 ++- 4 files changed, 354 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2c6f37219dff..eab42269fc5b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1368,6 +1368,12 @@ struct drm_i915_private { */ struct idr metrics_idr; + /* +* Number of dynamic configurations, you need to hold +* dev_priv->perf.metrics_lock to access it. +*/ + u32 n_metrics; + /* * Lock associated with anything below within this structure * except exclusive_stream. 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index fd12318e7a90..40a02838b68c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3919,6 +3919,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, goto sysfs_err; } + dev_priv->perf.n_metrics++; + mutex_unlock(_priv->perf.metrics_lock); DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id); @@ -3979,6 +3981,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, _config->sysfs_metric); idr_remove(_priv->perf.metrics_idr, *arg); + dev_priv->perf.n_metrics--; mutex_unlock(_priv->perf.metrics_lock); diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index abac5042da2b..89b2821be4a0 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -7,6 +7,7 @@ #include #include "i915_drv.h" +#include "i915_perf.h" #include "i915_query.h" #include @@ -140,10 +141,292 @@ query_engine_info(struct drm_i915_private *i915, return len; } +static int can_copy_perf_config_registers_or_number(u32 user_n_regs, + u64 user_regs_ptr, + u32 kernel_n_regs) +{ + /* +* We'll just put the number of registers, and won't copy the +* register. 
+*/ + if (user_n_regs == 0) + return 0; + + if (user_n_regs < kernel_n_regs) + return -EINVAL; + + if (!access_ok(u64_to_user_ptr(user_regs_ptr), + 2 * sizeof(u32) * kernel_n_regs)) + return -EFAULT; + + return 0; +} + +static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs, + u32 kernel_n_regs, + u64 user_regs_ptr, + u32 *user_n_regs) +{ + u32 r; + + if (*user_n_regs == 0) { + *user_n_regs = kernel_n_regs; + return 0; + } + + *user_n_regs = kernel_n_regs; + + for (r = 0; r < kernel_n_regs; r++) { + u32 __user *user_reg_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2); + u32 __user *user_val_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 + + sizeof(u32)); + int ret; + + ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr), +user_reg_ptr); + if (ret) + return -EFAULT; + + ret = __put_user(kernel_regs[r].value, user_val_ptr); + if (ret) + return -EFAULT; + } + + return 0; +} + +static int query_perf_config_data(struct drm_i915_private *i915, + struct drm_i915_query_item *query_item, + bool use_uuid)
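The two helpers above implement a common two-pass query convention: a caller first passes a count of zero to learn how many registers exist, then calls again with a buffer of at least that size. A minimal sketch of the convention on plain memory follows, without any user-pointer handling. `struct reg_pair` and `copy_regs_or_count()` are hypothetical names, and the sketch folds the size check and the copy into one function for brevity.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct reg_pair {
	uint32_t addr;
	uint32_t value;
};

/* If *dst_n is 0, only report how many entries exist. Otherwise the
 * destination must hold at least src_n entries; copy them and report the
 * number actually written. */
static int copy_regs_or_count(const struct reg_pair *src, uint32_t src_n,
			      struct reg_pair *dst, uint32_t *dst_n)
{
	uint32_t i;

	if (*dst_n == 0) {
		*dst_n = src_n;
		return 0;
	}

	if (*dst_n < src_n)
		return -EINVAL;

	for (i = 0; i < src_n; i++)
		dst[i] = src[i];

	*dst_n = src_n;
	return 0;
}
```

A caller would typically invoke it once with a zero count, allocate an array of the reported size, and invoke it a second time to fetch the contents.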
Re: [Intel-gfx] [PATCH i-g-t] i915/perf_pmu: Check on the health of the spinner while waiting
Quoting Tvrtko Ursulin (2019-09-09 10:19:08)
> 
> On 09/09/2019 08:12, Chris Wilson wrote:
> > And give up if we never even make it to the start.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111592
> > Signed-off-by: Chris Wilson
> > Cc: Tvrtko Ursulin
> > ---
> >   tests/perf_pmu.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> > index d392a67d4..8a06e5d44 100644
> > --- a/tests/perf_pmu.c
> > +++ b/tests/perf_pmu.c
> > @@ -191,10 +191,12 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
> >   		while (!igt_spin_has_started(spin)) {
> >   			unsigned long t = igt_nsec_elapsed();
> >   
> > +			igt_assert(gem_bo_busy(fd, spin->handle));
> >   			if ((t - timeout) > 250e6) {
> >   				timeout = t;
> >   				igt_warn("Spinner not running after %.2fms\n",
> >   					 (double)t / 1e6);
> > +				igt_assert(t < 2e9);
> >   			}
> >   		}
> >   	} else {
> > @@ -202,6 +204,7 @@ static unsigned long __spin_wait(int fd, igt_spin_t *spin)
> >   		usleep(500e3); /* Better than nothing! */
> >   	}
> >   
> > +	igt_assert(gem_bo_busy(fd, spin->handle));
> >   	return igt_nsec_elapsed();
> >   }
> >   
> > 
> 
> The 2s timeout for batch to start executing sounds okay.
> 
> I'd pull up and consolidate the bo_busy checks into one at the top of
> the function, since it is only telling us batch has been submitted. Or
> you are thinking the second check brings value in checking batch is
> still executing, hasn't failed or something?

The thinking is to catch if we terminate the batch via hangcheck before
writing the dword. I think there's value in knowing if we are slow vs
dead.
-Chris
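The "slow vs dead" distinction discussed above can be sketched as a polling loop with three outcomes: the spinner started, it was terminated before starting, or it never started within the deadline. The sketch below runs against a simulated spinner so it can be exercised without a GPU. `struct fake_spinner` and every helper are hypothetical stand-ins and are not the igt API.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ull

enum wait_result { WAIT_STARTED, WAIT_DEAD, WAIT_TIMEOUT };

/* Simulated spinner: time advances by step_ns on every clock read. */
struct fake_spinner {
	uint64_t now_ns;
	uint64_t step_ns;
	uint64_t starts_at_ns;	/* UINT64_MAX: never starts */
	uint64_t dies_at_ns;	/* UINT64_MAX: never terminated */
};

static uint64_t elapsed_ns(struct fake_spinner *s)
{
	s->now_ns += s->step_ns;
	return s->now_ns;
}

static bool has_started(const struct fake_spinner *s)
{
	return s->now_ns >= s->starts_at_ns;
}

static bool is_busy(const struct fake_spinner *s)
{
	return s->now_ns < s->dies_at_ns;
}

static enum wait_result wait_for_start(struct fake_spinner *s)
{
	uint64_t last_check = 0;

	while (!has_started(s)) {
		uint64_t t = elapsed_ns(s);

		/* No longer busy and never started: dead, not slow. */
		if (!is_busy(s))
			return WAIT_DEAD;

		/* Every 250ms, re-evaluate against the 2s deadline. */
		if (t - last_check > 250 * NSEC_PER_MSEC) {
			last_check = t;
			if (t >= 2000 * NSEC_PER_MSEC)
				return WAIT_TIMEOUT;
		}
	}

	return WAIT_STARTED;
}
```

As in the quoted patch, the deadline is only evaluated at the 250ms checkpoints, so the loop gives up slightly after 2s rather than exactly at it.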
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
== Series Details ==

Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
URL   : https://patchwork.freedesktop.org/series/66418/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915: introduce a mechanism to extend execbuf2
Okay!

Commit: drm/i915: add syncobj timeline support
Okay!

Commit: drm/i915/perf: drop list of streams
+drivers/gpu/drm/i915/i915_perf.c:1436:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1492:15: warning: memset with byte count of 16777216
-O:drivers/gpu/drm/i915/i915_perf.c:1436:15: warning: memset with byte count of 16777216
-O:drivers/gpu/drm/i915/i915_perf.c:1495:15: warning: memset with byte count of 16777216

Commit: drm/i915/perf: store the associated engine of a stream
Okay!

Commit: drm/i915/perf: introduce a versioning of the i915-perf uapi
Okay!

Commit: drm/i915/perf: move perf types to their own header
Okay!

Commit: drm/i915/perf: allow for CS OA configs to be created lazily
Okay!

Commit: drm/i915/perf: implement active wait for noa configurations
Okay!

Commit: drm/i915: add wait flags to i915_active_request_retire
Okay!

Commit: drm/i915/perf: execute OA configuration from command stream
Okay!

Commit: drm/i915: add a new perf configuration execbuf parameter
Okay!

Commit: drm/i915/perf: allow holding preemption on filtered ctx
Okay!

Commit: drm/i915: add support for perf configuration queries
Okay!
[Intel-gfx] [PATCH 1/6] drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel
Being a "low-level" test, we opt to bypass the normal bind/unbind hooks for the lower level insert_entries/clear_range. For ggtt, the bind/unbind hooks provide the runtime wakeref and so we must also handle this in exercising the low level hooks. <4> [538.151672] RPM raw-wakeref not held <4> [538.151825] WARNING: CPU: 0 PID: 11 at ./drivers/gpu/drm/i915/intel_runtime_pm.h:107 fwtable_read32+0x1be/0x300 [i915] <4> [538.151830] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp btusb btrtl btbcm x86_pkg_temp_thermal coretemp btintel crct10dif_pclmul bluetooth crc32_pclmul snd_intel_nhlt snd_hda_codec ecdh_generic ghash_clmulni_intel ecc snd_hwdep snd_hda_core lpc_ich r8169 realtek snd_pcm mei_me mei prime_numbers pinctrl_broxton pinctrl_intel [last unloaded: i915] <4> [538.151861] CPU: 0 PID: 11 Comm: migration/0 Tainted: G U 5.3.0-rc7-CI-Trybot_4938+ #1 <4> [538.151864] Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018 <4> [538.151960] RIP: 0010:fwtable_read32+0x1be/0x300 [i915] <4> [538.151965] Code: e8 e7 f9 5f e0 e9 0b ff ff ff 80 3d d5 8d 26 00 00 0f 85 81 fe ff ff 48 c7 c7 ef 01 bd a0 c6 05 c1 8d 26 00 01 e8 b2 e4 6a e0 <0f> 0b e9 67 fe ff ff 80 3d ad 8d 26 00 00 0f 85 65 fe ff ff 48 c7 <4> [538.151969] RSP: 0018:c907be10 EFLAGS: 00010086 <4> [538.151972] RAX: RBX: 88826be10d50 RCX: 0002 <4> [538.151975] RDX: 8002 RSI: RDI: <4> [538.151978] RBP: R08: R09: <4> [538.151981] R10: R11: c907bcb0 R12: 00101008 <4> [538.151984] R13: R14: c936f638 R15: 0002 <4> [538.151987] FS: () GS:888277a0() knlGS: <4> [538.151990] CS: 0010 DS: ES: CR0: 80050033 <4> [538.151993] CR2: 7fd48e7052f8 CR3: 0521 CR4: 003406f0 <4> [538.151995] Call Trace: <4> [538.152106] bxt_vtd_ggtt_clear_range__cb+0x38/0x40 [i915] Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git 
a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index 31a51ca1ddcb..598c18d10640 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -293,18 +293,20 @@ static int lowlevel_hole(struct drm_i915_private *i915, mock_vma.node.size = BIT_ULL(size); mock_vma.node.start = addr; - wakeref = intel_runtime_pm_get(>runtime_pm); - vm->insert_entries(vm, _vma, I915_CACHE_NONE, 0); - intel_runtime_pm_put(>runtime_pm, wakeref); + with_intel_runtime_pm(>runtime_pm, wakeref) + vm->insert_entries(vm, _vma, + I915_CACHE_NONE, 0); } count = n; i915_random_reorder(order, count, ); for (n = 0; n < count; n++) { u64 addr = hole_start + order[n] * BIT_ULL(size); + intel_wakeref_t wakeref; GEM_BUG_ON(addr + BIT_ULL(size) > vm->total); - vm->clear_range(vm, addr, BIT_ULL(size)); + with_intel_runtime_pm(>runtime_pm, wakeref) + vm->clear_range(vm, addr, BIT_ULL(size)); } i915_gem_object_unpin_pages(obj); -- 2.23.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
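The patch above wraps each low-level call in with_intel_runtime_pm() so the wakeref is taken and released around exactly one statement. A scoped macro of that kind can be built from a for loop that runs its body once. The sketch below shows the trick with a hypothetical `with_power()` macro and a dummy `struct power`; whether the kernel macro is implemented the same way is an assumption here.

```c
#include <stdint.h>

/* Hypothetical reference-counted power domain standing in for runtime PM. */
struct power {
	int refcount;
	int ops_while_awake;
};

typedef uintptr_t wakeref_t;

static wakeref_t power_get(struct power *p)
{
	p->refcount++;
	return 1;	/* non-zero cookie: reference held */
}

static void power_put(struct power *p, wakeref_t wf)
{
	(void)wf;
	p->refcount--;
}

/* Run the statement that follows exactly once with the reference held:
 * acquire in the init clause, release in the step clause, then clear the
 * cookie so the loop condition fails. */
#define with_power(p, wf) \
	for ((wf) = power_get(p); (wf); power_put((p), (wf)), (wf) = 0)

/* Stand-in for a low-level hook that needs the device awake. */
static void clear_range(struct power *p)
{
	if (p->refcount > 0)
		p->ops_while_awake++;
}

static void clear_many(struct power *p, int count)
{
	for (int n = 0; n < count; n++) {
		wakeref_t wf;

		with_power(p, wf)
			clear_range(p);
	}
}
```

One caveat of this construction: leaving the body with `break`, `return` or `goto` skips the step clause, so the reference would leak. It suits short bodies such as the single calls in the selftest.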
[Intel-gfx] Enable iommu on gfx by default
Other than Broadwell being fubar (and Ironlake + g4x being special in their own way), there appears to be little fallout from enabling iommu. (The biggest open question is over performance: TLB misses are much more expensive, and that impacts media/CL/GL throughput.)

Enabling iommu/dmar makes our CI much more powerful: instead of a random GPU write causing memory corruption which may or may not impact the system, we get a DMAR fault. So once and for all we will be able to ascertain whether those sporadic memory corruptions are truly our fault.
-Chris
[Intel-gfx] [PATCH 2/6] drm/i915/selftests: Tighten the timeout testing for partial mmaps
Currently, if there is time remaining before the start of the loop, we do one full iteration over many possible different chunks within the object. A full loop may take 50+s (depending on speed of indirect GTT mmapings) and we try separately with LINEAR, X and Y -- at which point igt times out. If we check more frequently, we will interrupt the loop upon our timeout -- it is hard to argue that significantly reduces the test coverage despite the dramatic contraction in runtime. In practical terms, the coverage we should prioritise is using different fence setups, forcing verification of the tile row computations over the current preference of checking extracting chunks. Though the exhaustive search is great given an infinite timeout, to improve our current coverage, we also add a randomised smoketest of partial mmaps. Signed-off-by: Chris Wilson --- .../drm/i915/gem/selftests/i915_gem_mman.c| 253 +++--- 1 file changed, 222 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 1d27babff0ce..685726c85991 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -10,6 +10,7 @@ #include "gt/intel_gt_pm.h" #include "huge_gem_object.h" #include "i915_selftest.h" +#include "selftests/i915_random.h" #include "selftests/igt_flush_test.h" struct tile { @@ -75,6 +76,96 @@ static u64 tiled_offset(const struct tile *tile, u64 v) } static int check_partial_mapping(struct drm_i915_gem_object *obj, +const struct tile *tile, +struct rnd_state *prng) +{ + const unsigned long npages = obj->base.size / PAGE_SIZE; + struct i915_ggtt_view view; + struct i915_vma *vma; + unsigned long page; + u32 __iomem *io; + struct page *p; + unsigned int n; + u64 offset; + u32 *cpu; + int err; + + err = i915_gem_object_set_tiling(obj, tile->tiling, tile->stride); + if (err) { + pr_err("Failed to set tiling mode=%u, stride=%u, err=%d\n", + 
tile->tiling, tile->stride, err); + return err; + } + + GEM_BUG_ON(i915_gem_object_get_tiling(obj) != tile->tiling); + GEM_BUG_ON(i915_gem_object_get_stride(obj) != tile->stride); + + i915_gem_object_lock(obj); + err = i915_gem_object_set_to_gtt_domain(obj, true); + i915_gem_object_unlock(obj); + if (err) { + pr_err("Failed to flush to GTT write domain; err=%d\n", err); + return err; + } + + page = i915_prandom_u32_max_state(npages, prng); + view = compute_partial_view(obj, page, MIN_CHUNK_PAGES); + + vma = i915_gem_object_ggtt_pin(obj, , 0, 0, PIN_MAPPABLE); + if (IS_ERR(vma)) { + pr_err("Failed to pin partial view: offset=%lu; err=%d\n", + page, (int)PTR_ERR(vma)); + return PTR_ERR(vma); + } + + n = page - view.partial.offset; + GEM_BUG_ON(n >= view.partial.size); + + io = i915_vma_pin_iomap(vma); + i915_vma_unpin(vma); + if (IS_ERR(io)) { + pr_err("Failed to iomap partial view: offset=%lu; err=%d\n", + page, (int)PTR_ERR(io)); + err = PTR_ERR(io); + goto out; + } + + iowrite32(page, io + n * PAGE_SIZE / sizeof(*io)); + i915_vma_unpin_iomap(vma); + + offset = tiled_offset(tile, page << PAGE_SHIFT); + if (offset >= obj->base.size) + goto out; + + intel_gt_flush_ggtt_writes(_i915(obj->base.dev)->gt); + + p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT); + cpu = kmap(p) + offset_in_page(offset); + drm_clflush_virt_range(cpu, sizeof(*cpu)); + if (*cpu != (u32)page) { + pr_err("Partial view for %lu [%u] (offset=%llu, size=%u [%llu, row size %u], fence=%d, tiling=%d, stride=%d) misalignment, expected write to page (%llu + %u [0x%llx]) of 0x%x, found 0x%x\n", + page, n, + view.partial.offset, + view.partial.size, + vma->size >> PAGE_SHIFT, + tile->tiling ? tile_row_pages(obj) : 0, + vma->fence ? 
vma->fence->id : -1, tile->tiling, tile->stride, + offset >> PAGE_SHIFT, + (unsigned int)offset_in_page(offset), + offset, + (u32)page, *cpu); + err = -EINVAL; + } + *cpu = 0; + drm_clflush_virt_range(cpu, sizeof(*cpu)); + kunmap(p); + +out: + i915_vma_destroy(vma); + return err; +} + +static int check_partial_mappings(struct drm_i915_gem_object *obj, const struct tile *tile, unsigned
[Intel-gfx] [PATCH 4/6] drm/i915: Force compilation with intel-iommu for CI validation
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/Kconfig.debug | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 00786a142ff0..ebcb6dbc2393 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -20,6 +20,8 @@ config DRM_I915_WERROR
 config DRM_I915_DEBUG
 	bool "Enable additional driver debugging"
 	depends on DRM_I915
+	select PCI_MSI
+	select INTEL_IOMMU
 	select DEBUG_FS
 	select PREEMPT_COUNT
 	select REFCOUNT_FULL
-- 
2.23.0
[Intel-gfx] [PATCH 5/6] iommu/intel: Declare Broadwell igfx dmar support snafu
Despite the widespread and complete failure of Broadwell integrated graphics when DMAR is enabled, known over the years, we have never been able to root cause the issue. Instead, we let the failure undermine our confidence in the iommu system itself when we should be pushing for it to be always enabled. Quirk away Broadwell and remove the rotten apple. References: https://bugs.freedesktop.org/show_bug.cgi?id=89360 Signed-off-by: Chris Wilson Cc: Lu Baolu Cc: Martin Peres Cc: Joerg Roedel --- drivers/iommu/intel-iommu.c | 44 + 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index c4e0e4a9ee9e..34f6a3d93ae2 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5690,20 +5690,46 @@ const struct iommu_ops intel_iommu_ops = { .pgsize_bitmap = INTEL_IOMMU_PGSIZES, }; -static void quirk_iommu_g4x_gfx(struct pci_dev *dev) +static void quirk_iommu_igfx(struct pci_dev *dev) { - /* G4x/GM45 integrated gfx dmar support is totally busted. */ pci_info(dev, "Disabling IOMMU for graphics on this chipset\n"); dmar_map_gfx = 0; } -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e10, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e20, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e30, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_g4x_gfx); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_g4x_gfx); +/* G4x/GM45 integrated gfx dmar support is totally busted. 
*/ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e10, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e20, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e30, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_igfx); + +/* Broadwell igfx malfunctions with dmar */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1606, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160B, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160E, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1602, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160A, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160D, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1616, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161B, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161E, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1612, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161A, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x161D, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1626, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162B, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162E, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1622, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162A, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x162D, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1636, quirk_iommu_igfx); 
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163B, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163E, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1632, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163A, quirk_iommu_igfx); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163D, quirk_iommu_igfx); static void quirk_iommu_rwbf(struct pci_dev *dev) { -- 2.23.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
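The quirk table above boils down to a list of (vendor, device, hook) entries that are matched against each enumerated PCI device. The following is a minimal userspace sketch of that matching, not the kernel's implementation: `struct fixup`, `run_fixups()` and `map_gfx` are hypothetical stand-ins for the machinery behind DECLARE_PCI_FIXUP_HEADER and for dmar_map_gfx, and only a few of the device IDs from the patch are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL 0x8086u

/* Stand-in for dmar_map_gfx: 1 means graphics devices go through the IOMMU. */
static int map_gfx = 1;

/* Stand-in for quirk_iommu_igfx(): disable IOMMU mapping for graphics. */
static void quirk_igfx(void)
{
	map_gfx = 0;
}

/* Hypothetical table entry; the kernel builds an equivalent list per macro use. */
struct fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(void);
};

static const struct fixup fixups[] = {
	{ VENDOR_INTEL, 0x2a40, quirk_igfx },	/* G4x/GM45 */
	{ VENDOR_INTEL, 0x1606, quirk_igfx },	/* Broadwell */
	{ VENDOR_INTEL, 0x1616, quirk_igfx },	/* Broadwell */
};

/* Run every hook whose IDs match the device; return the number of hooks run. */
static int run_fixups(uint16_t vendor, uint16_t device)
{
	size_t i;
	int hits = 0;

	for (i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
		if (fixups[i].vendor != vendor || fixups[i].device != device)
			continue;
		assert(fixups[i].hook != NULL);
		fixups[i].hook();
		hits++;
	}
	return hits;
}
```

Calling run_fixups(0x8086, 0x1616) clears map_gfx, while an ID that is not in the table leaves it untouched.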
[Intel-gfx] [PATCH 3/6] drm/i915: Perform GGTT restore much earlier during resume
As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTEs before then, they may read and even *write* random locations
in memory.

Detected by DMAR faults during resume.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Cc: Martin Peres
Cc: Joonas Lahtinen
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c    | 3 ---
 drivers/gpu/drm/i915/i915_drv.c           | 5 +++++
 drivers/gpu/drm/i915/selftests/i915_gem.c | 6 ++++++
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index b3993d24b83d..9b1129aaacfe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -242,9 +242,6 @@ void i915_gem_resume(struct drm_i915_private *i915)
 	mutex_lock(&i915->drm.struct_mutex);
 	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
 
-	i915_gem_restore_gtt_mappings(i915);
-	i915_gem_restore_fences(i915);
-
 	if (i915_gem_init_hw(i915))
 		goto err_wedged;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7b2c81a8bbaa..1af4eba968c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1877,6 +1877,11 @@ static int i915_drm_resume(struct drm_device *dev)
 	if (ret)
 		DRM_ERROR("failed to re-enable GGTT\n");
 
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	i915_gem_restore_gtt_mappings(dev_priv);
+	i915_gem_restore_fences(dev_priv);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+
 	intel_csr_ucode_resume(dev_priv);
 
 	i915_restore_state(dev_priv);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index bb6dd54a6ff3..37593831b539 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -118,6 +118,12 @@ static void pm_resume(struct drm_i915_private *i915)
 	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
 		intel_gt_sanitize(&i915->gt, false);
 		i915_gem_sanitize(i915);
+
+		mutex_lock(&i915->drm.struct_mutex);
+		i915_gem_restore_gtt_mappings(i915);
+		i915_gem_restore_fences(i915);
+		mutex_unlock(&i915->drm.struct_mutex);
+
 		i915_gem_resume(i915);
 	}
 }
-- 
2.23.0
[Intel-gfx] [PATCH 6/6] iommu/intel: Ignore igfx_off
---
 drivers/iommu/intel-iommu.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 34f6a3d93ae2..c98cdfd91691 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -439,8 +439,6 @@ static int __init intel_iommu_setup(char *str)
 			no_platform_optin = 1;
 			pr_info("IOMMU disabled\n");
 		} else if (!strncmp(str, "igfx_off", 8)) {
-			dmar_map_gfx = 0;
-			pr_info("Disable GFX device mapping\n");
 		} else if (!strncmp(str, "forcedac", 8)) {
 			pr_info("Forcing DAC for PCI devices\n");
 			dmar_forcedac = 1;
-- 
2.23.0
[Intel-gfx] [PATCH] drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI
To maintain an interactive frame update rate, we want to keep our
presentation worker from being blocked by normal workloads.

Signed-off-by: Chris Wilson
Cc: Ville Syrjälä
Cc: Heinrich Fink
---
 drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 4ee750fa3ef0..cb55ab834a07 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -16148,7 +16148,8 @@ int intel_modeset_init(struct drm_device *dev)
 	struct intel_crtc *crtc;
 	int ret;
 
-	dev_priv->modeset_wq = alloc_ordered_workqueue("i915_modeset", 0);
+	dev_priv->modeset_wq =
+		alloc_ordered_workqueue("i915_modeset", WQ_HIGHPRI);
 
 	drm_mode_config_init(dev);
 
-- 
2.23.0
[Intel-gfx] [v2][PATCH 2/3] drm/i915/display: Extract i965_read_luts()
For i965, add hw read out to create hw blob of gamma lut values. Review comments from old series: https://patchwork.freedesktop.org/series/58039/ v4: -No need to initialize *blob [Jani] -Removed right shifts [Jani] -Dropped dev local var [Jani] v5: -Returned blob instead of assigning it internally within the function [Ville] -Renamed i965_get_color_config() to i965_read_lut() [Ville] -Renamed i965_get_gamma_config_10p6() to i965_read_gamma_lut_10p6() [Ville] v9: -Typo and 80 character limit [Uma] -Made read func para as const [Ville, Uma] -Renamed i965_read_gamma_lut_10p6() to i965_read_lut_10p6() [Ville, Uma] v10: -Swapped ldw and udw while creating hw blob [Jani] -Added last index rgb lut value from PIPEGCMAX to h/w blob [Jani] Signed-off-by: Swati Sharma --- drivers/gpu/drm/i915/display/intel_color.c | 50 ++ drivers/gpu/drm/i915/i915_reg.h| 4 +++ 2 files changed, 54 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c index 4d9a568..765f858 100644 --- a/drivers/gpu/drm/i915/display/intel_color.c +++ b/drivers/gpu/drm/i915/display/intel_color.c @@ -1570,6 +1570,55 @@ static void i9xx_read_luts(struct intel_crtc_state *crtc_state) } static struct drm_property_blob * +i965_read_lut_10p6(const struct intel_crtc_state *crtc_state) +{ + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + u32 lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size; + enum pipe pipe = crtc->pipe; + struct drm_property_blob *blob; + struct drm_color_lut *blob_data; + u32 i, val1, val2; + + blob = drm_property_create_blob(_priv->drm, + sizeof(struct drm_color_lut) * lut_size, + NULL); + if (IS_ERR(blob)) + return NULL; + + blob_data = blob->data; + + for (i = 0; i < lut_size - 1; i++) { + val1 = I915_READ(PALETTE(pipe, 2 * i + 0)); + val2 = I915_READ(PALETTE(pipe, 2 * i + 1)); + + blob_data[i].red = REG_FIELD_GET(PALETTE_RED_MASK, val2) << 8 | + 
REG_FIELD_GET(PALETTE_RED_MASK, val1); + blob_data[i].green = REG_FIELD_GET(PALETTE_GREEN_MASK, val2) << 8 | + REG_FIELD_GET(PALETTE_GREEN_MASK, val1); + blob_data[i].blue = REG_FIELD_GET(PALETTE_BLUE_MASK, val2) << 8 | + REG_FIELD_GET(PALETTE_BLUE_MASK, val1); + } + + blob_data[i].red = REG_FIELD_GET(PIPEGCMAX_RGB_MASK, +I915_READ(PIPEGCMAX(pipe, 0))); + blob_data[i].green = REG_FIELD_GET(PIPEGCMAX_RGB_MASK, + I915_READ(PIPEGCMAX(pipe, 1))); + blob_data[i].blue = REG_FIELD_GET(PIPEGCMAX_RGB_MASK, + I915_READ(PIPEGCMAX(pipe, 2))); + + return blob; +} + +static void i965_read_luts(struct intel_crtc_state *crtc_state) +{ + if (crtc_state->gamma_mode == GAMMA_MODE_MODE_8BIT) + crtc_state->base.gamma_lut = i9xx_read_lut_8(crtc_state); + else + crtc_state->base.gamma_lut = i965_read_lut_10p6(crtc_state); +} + +static struct drm_property_blob * ilk_read_lut_10(const struct intel_crtc_state *crtc_state) { struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); @@ -1672,6 +1721,7 @@ void intel_color_init(struct intel_crtc *crtc) dev_priv->display.color_check = i9xx_color_check; dev_priv->display.color_commit = i9xx_color_commit; dev_priv->display.load_luts = i965_load_luts; + dev_priv->display.read_luts = i965_read_luts; } else { dev_priv->display.color_check = i9xx_color_check; dev_priv->display.color_commit = i9xx_color_commit; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 45ed96d..5ac8a4d 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3558,6 +3558,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define _PALETTE_A 0xa000 #define _PALETTE_B 0xa800 #define _CHV_PALETTE_C 0xc000 +#define PALETTE_RED_MASKREG_GENMASK(23, 16) +#define PALETTE_GREEN_MASK REG_GENMASK(15, 8) +#define PALETTE_BLUE_MASK REG_GENMASK(7, 0) #define PALETTE(pipe, i) _MMIO(DISPLAY_MMIO_BASE(dev_priv) + \ _PICK((pipe), _PALETTE_A, \ _PALETTE_B, _CHV_PALETTE_C) + \ @@ -5760,6 +5763,7 @@ enum { #define 
_PIPEAGCMAX
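In the 10.6 readout above, each LUT entry is spread over two palette registers, and every colour channel is rebuilt by placing the 8-bit field from the second (upper) dword above the 8-bit field from the first (lower) dword. A minimal standalone sketch of that combination follows. The shift values mirror the PALETTE_*_MASK definitions in the patch (red 23:16, green 15:8, blue 7:0); `combine_10p6()` is a hypothetical helper name, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the PALETTE_*_MASK macros in the patch. */
#define PALETTE_RED_SHIFT   16
#define PALETTE_GREEN_SHIFT 8
#define PALETTE_BLUE_SHIFT  0

/*
 * Build one 16-bit channel: the 8-bit field of the upper dword becomes
 * the high byte, the 8-bit field of the lower dword the low byte.
 */
static uint16_t combine_10p6(uint32_t ldw, uint32_t udw, unsigned int shift)
{
	assert(shift <= 16);
	return (uint16_t)((((udw >> shift) & 0xffu) << 8) |
			  ((ldw >> shift) & 0xffu));
}
```

With ldw = 0x00345678 and udw = 0x00129abc, the red channel comes out as 0x1234, green as 0x9a56 and blue as 0xbc78.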
[Intel-gfx] [v2][PATCH 1/3] drm/i915/display: Add gamma precision function for CHV
intel_color_get_gamma_bit_precision() is extended for Cherryview by
adding chv_gamma_precision(); i965 continues to use the existing
i9xx_gamma_precision().

Signed-off-by: Swati Sharma
Reviewed-by: Jani Nikula
---
 drivers/gpu/drm/i915/display/intel_color.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
index 6d641e1..4d9a568 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1400,6 +1400,14 @@ static int ilk_gamma_precision(const struct intel_crtc_state *crtc_state)
 	}
 }
 
+static int chv_gamma_precision(const struct intel_crtc_state *crtc_state)
+{
+	if (crtc_state->cgm_mode & CGM_PIPE_MODE_GAMMA)
+		return 10;
+	else
+		return i9xx_gamma_precision(crtc_state);
+}
+
 static int glk_gamma_precision(const struct intel_crtc_state *crtc_state)
 {
 	switch (crtc_state->gamma_mode) {
@@ -1421,12 +1429,17 @@ int intel_color_get_gamma_bit_precision(const struct intel_crtc_state *crtc_stat
 	if (!crtc_state->gamma_enable)
 		return 0;
 
-	if (HAS_GMCH(dev_priv) && !IS_CHERRYVIEW(dev_priv))
-		return i9xx_gamma_precision(crtc_state);
-	else if (IS_CANNONLAKE(dev_priv) || IS_GEMINILAKE(dev_priv))
-		return glk_gamma_precision(crtc_state);
-	else if (IS_IRONLAKE(dev_priv))
-		return ilk_gamma_precision(crtc_state);
+	if (HAS_GMCH(dev_priv)) {
+		if (IS_CHERRYVIEW(dev_priv))
+			return chv_gamma_precision(crtc_state);
+		else
+			return i9xx_gamma_precision(crtc_state);
+	} else {
+		if (IS_CANNONLAKE(dev_priv) || IS_GEMINILAKE(dev_priv))
+			return glk_gamma_precision(crtc_state);
+		else if (IS_IRONLAKE(dev_priv))
+			return ilk_gamma_precision(crtc_state);
+	}
 
 	return 0;
 }
-- 
1.9.1
[Intel-gfx] [v2][PATCH 3/3] drm/i915/display: Extract chv_read_luts()
For Cherryview, add hardware readout to create a hw blob of the gamma LUT
values.

Review comments from the previous series:
https://patchwork.freedesktop.org/patch/328252

v4: -No need to initialize *blob [Jani]
    -Removed right shifts [Jani]
    -Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
     function [Ville]
    -Renamed function cherryview_get_color_config() to chv_read_luts()
    -Renamed cherryview_get_gamma_config() to chv_read_cgm_gamma_lut()
     [Ville]
v9: -80 character limit [Uma]
    -Made read func para as const [Ville, Uma]
    -Renamed chv_read_cgm_gamma_lut() to chv_read_cgm_lut() [Ville, Uma]

Signed-off-by: Swati Sharma
Reviewed-by: Jani Nikula
---
 drivers/gpu/drm/i915/display/intel_color.c | 43 ++
 drivers/gpu/drm/i915/i915_reg.h            |  3 +++
 2 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
index 765f858..318308d 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1619,6 +1619,48 @@ static void i965_read_luts(struct intel_crtc_state *crtc_state)
 }
 
 static struct drm_property_blob *
+chv_read_cgm_lut(const struct intel_crtc_state *crtc_state)
+{
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	u32 lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+	enum pipe pipe = crtc->pipe;
+	struct drm_property_blob *blob;
+	struct drm_color_lut *blob_data;
+	u32 i, val;
+
+	blob = drm_property_create_blob(&dev_priv->drm,
+					sizeof(struct drm_color_lut) * lut_size,
+					NULL);
+	if (IS_ERR(blob))
+		return NULL;
+
+	blob_data = blob->data;
+
+	for (i = 0; i < lut_size; i++) {
+		val = I915_READ(CGM_PIPE_GAMMA(pipe, i, 0));
+		blob_data[i].green = intel_color_lut_pack(REG_FIELD_GET(
+						CGM_PIPE_GAMMA_GREEN_MASK, val), 10);
+		blob_data[i].blue = intel_color_lut_pack(REG_FIELD_GET(
+						CGM_PIPE_GAMMA_BLUE_MASK, val), 10);
+
+		val = I915_READ(CGM_PIPE_GAMMA(pipe, i, 1));
+		blob_data[i].red = intel_color_lut_pack(REG_FIELD_GET(
+						CGM_PIPE_GAMMA_RED_MASK, val), 10);
+	}
+
+	return blob;
+}
+
+static void chv_read_luts(struct intel_crtc_state *crtc_state)
+{
+	if (crtc_state->gamma_mode == GAMMA_MODE_MODE_8BIT)
+		crtc_state->base.gamma_lut = i9xx_read_lut_8(crtc_state);
+	else
+		crtc_state->base.gamma_lut = chv_read_cgm_lut(crtc_state);
+}
+
+static struct drm_property_blob *
 ilk_read_lut_10(const struct intel_crtc_state *crtc_state)
 {
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
@@ -1717,6 +1759,7 @@ void intel_color_init(struct intel_crtc *crtc)
 			dev_priv->display.color_check = chv_color_check;
 			dev_priv->display.color_commit = i9xx_color_commit;
 			dev_priv->display.load_luts = chv_load_luts;
+			dev_priv->display.read_luts = chv_read_luts;
 		} else if (INTEL_GEN(dev_priv) >= 4) {
 			dev_priv->display.color_check = i9xx_color_check;
 			dev_priv->display.color_commit = i9xx_color_commit;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5ac8a4d..0241c9d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -10410,6 +10410,9 @@ enum skl_power_gate {
 #define   CGM_PIPE_MODE_GAMMA	(1 << 2)
 #define   CGM_PIPE_MODE_CSC	(1 << 1)
 #define   CGM_PIPE_MODE_DEGAMMA	(1 << 0)
+#define   CGM_PIPE_GAMMA_RED_MASK   REG_GENMASK(9, 0)
+#define   CGM_PIPE_GAMMA_GREEN_MASK   REG_GENMASK(25, 16)
+#define   CGM_PIPE_GAMMA_BLUE_MASK   REG_GENMASK(9, 0)
 
 #define _CGM_PIPE_B_CSC_COEFF01	(VLV_DISPLAY_BASE + 0x69900)
 #define _CGM_PIPE_B_CSC_COEFF23	(VLV_DISPLAY_BASE + 0x69904)
-- 
1.9.1
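The readout loop in this patch extracts three 10-bit fields per entry (green and blue from the first dword, red from the second) and widens each one to the 16-bit range used by struct drm_color_lut. The sketch below illustrates that decode in plain C. It assumes the packing helper left-aligns the value in 16 bits, which is my reading of how intel_color_lut_pack() is used here rather than a copy of it; `lut_pack()`, `struct lut_entry` and `decode_cgm_entry()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 10-bit fields, positioned as in the CGM_PIPE_GAMMA_*_MASK macros above. */
#define CGM_FIELD_MASK   0x3ffu
#define CGM_GREEN_SHIFT  16	/* green: bits 25:16 of the first dword */

/* Left-align a value of 'bit_precision' bits in a 16-bit channel (assumed behaviour). */
static uint16_t lut_pack(uint32_t val, unsigned int bit_precision)
{
	uint32_t max;

	assert(bit_precision >= 1 && bit_precision <= 16);
	max = 0xffffu >> (16 - bit_precision);
	if (val > max)
		val = max;	/* clamp out-of-range input */
	return (uint16_t)(val << (16 - bit_precision));
}

struct lut_entry {
	uint16_t red, green, blue;
};

/* ldw holds green (25:16) and blue (9:0); udw holds red (9:0). */
static struct lut_entry decode_cgm_entry(uint32_t ldw, uint32_t udw)
{
	struct lut_entry e;

	e.green = lut_pack((ldw >> CGM_GREEN_SHIFT) & CGM_FIELD_MASK, 10);
	e.blue = lut_pack(ldw & CGM_FIELD_MASK, 10);
	e.red = lut_pack(udw & CGM_FIELD_MASK, 10);
	return e;
}
```

A full-scale 10-bit value (0x3ff) therefore maps to 0xffc0, which is why a state checker comparing software and hardware LUTs has to take the bit precision into account.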
[Intel-gfx] [v2][PATCH 0/3] adding gamma state checker for CHV and i965
This patch series adds a state checker to validate gamma LUT values for
the Cherryview and i965 platforms. It extends the patch series
https://patchwork.freedesktop.org/patch/328246/?series=58039
which enabled the basic infrastructure and state checker for a few legacy
platforms.

v2: Added last index rgb lut value from PIPEGCMAX to h/w blob [Jani]

Swati Sharma (3):
  drm/i915/display: Add gamma precision function for CHV
  drm/i915/display: Extract i965_read_luts()
  drm/i915/display: Extract chv_read_luts()

 drivers/gpu/drm/i915/display/intel_color.c | 118 +++--
 drivers/gpu/drm/i915/i915_reg.h            |   7 ++
 2 files changed, 119 insertions(+), 6 deletions(-)

-- 
1.9.1
Re: [Intel-gfx] [PATCH v5 04/11] drm/i915/dsb: Indexed register write function for DSB.
On 9/7/2019 4:37 PM, Animesh Manna wrote: DSB can program large set of data through indexed register write (opcode 0x9) in one shot. DSB feature can be used for bulk register programming e.g. gamma lut programming, HDR meta data programming. v1: initial version. v2: simplified code by using ALIGN(). (Chris) v3: ascii table added as code comment. (Shashank) Cc: Shashank Sharma Cc: Imre Deak Cc: Jani Nikula Cc: Rodrigo Vivi Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 64 drivers/gpu/drm/i915/display/intel_dsb.h | 8 +++ 2 files changed, 72 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 150be81fdfb3..0f55ed683d41 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -15,6 +15,7 @@ #define DSB_OPCODE_INDEXED_WRITE 0x9 #define DSB_BYTE_EN 0xF #define DSB_BYTE_EN_SHIFT 20 +#define DSB_REG_VALUE_MASK 0xf struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc) @@ -77,6 +78,69 @@ void intel_dsb_put(struct intel_dsb *dsb) } } +void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg, +u32 val) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + u32 *buf = dsb->cmd_buf; + u32 reg_val; + + if (!buf) { + I915_WRITE(reg, val); + return; + } + + if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) { + DRM_DEBUG_KMS("DSB buffer overflow.\n"); Again, '.' in the end can be removed + return; + } + + /* +* For example the buffer will look like below for 3 dwords for auto +* increment register: +* ++ +* | size = 3 | offset &| value1 | value2 | value3 | zero | +* | | opcode ||||| +* ++ +* + + +++++ +* 0 4 812 16 20 24 +* Byte +* +* As every instruction is 8 byte aligned the index of dsb instruction +* will start always from even number while dealing with u32 array and +* zero to be added for odd number of dwords at the last. 
Let's split this comment in two parts, to make even more useful, like: "As every instruction . array". "If we are writing odd no of dwords, Zeros will be added in the end for padding." - Shashank +*/ + reg_val = buf[dsb->ins_start_offset + 1] & DSB_REG_VALUE_MASK; + if (reg_val != i915_mmio_reg_offset(reg)) { + /* Every instruction should be 8 byte aligned. */ + dsb->free_pos = ALIGN(dsb->free_pos, 2); + + dsb->ins_start_offset = dsb->free_pos; + + /* Update the size. */ + buf[dsb->free_pos++] = 1; + + /* Update the opcode and reg. */ + buf[dsb->free_pos++] = (DSB_OPCODE_INDEXED_WRITE << + DSB_OPCODE_SHIFT) | + i915_mmio_reg_offset(reg); + + /* Update the value. */ + buf[dsb->free_pos++] = val; + } else { + /* Update the new value. */ + buf[dsb->free_pos++] = val; + + /* Update the size. */ + buf[dsb->ins_start_offset]++; + } + + /* if number of data words is odd, then the last dword should be 0.*/ + if (dsb->free_pos & 0x1) + buf[dsb->free_pos] = 0; +} + void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val) { struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h index 31b87dcfe160..9b2522f20bfb 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.h +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -29,11 +29,19 @@ struct intel_dsb { * and help in calculating tail of command buffer. */ int free_pos; + + /* +* ins_start_offset will help to store start address +* of the dsb instuction of auto-increment register. +*/ + u32 ins_start_offset; }; struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc); void intel_dsb_put(struct intel_dsb *dsb); void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val); +void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg, +u32 val); #endif ___ Intel-gfx mailing list
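To make the buffer layout discussed in this review easier to follow, here is a small userspace simulation of the indexed-write bookkeeping: the first write to a register emits a size dword and an opcode/offset dword, later writes to the same register append a value and bump the size, and an odd number of dwords is padded with a zero. It is a sketch under assumptions, not the driver code: the opcode shift (24), the 20-bit register mask and the 64-dword buffer are assumed values, and `struct dsb_sim` and `dsb_sim_indexed_write()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_BUF_DWORDS     64u		/* hypothetical buffer size */
#define SIM_OPCODE_INDEXED 0x9u		/* indexed write opcode from the patch */
#define SIM_OPCODE_SHIFT   24		/* assumed opcode position */
#define SIM_REG_MASK       0xfffffu	/* assumed register offset mask */

struct dsb_sim {
	uint32_t buf[SIM_BUF_DWORDS];
	unsigned int free_pos;		/* next free dword */
	unsigned int ins_start;		/* first dword of the current instruction */
};

/* Returns 0 on success, -1 if the buffer cannot take another write. */
static int dsb_sim_indexed_write(struct dsb_sim *d, uint32_t reg, uint32_t val)
{
	/* Offset 0 would be indistinguishable from an empty buffer. */
	assert(reg != 0 && reg <= SIM_REG_MASK);

	/* Worst case: 1 alignment dword + 2 header dwords + value + padding. */
	if (d->free_pos + 5 > SIM_BUF_DWORDS)
		return -1;

	if ((d->buf[d->ins_start + 1] & SIM_REG_MASK) != reg) {
		/* New instruction: start on an even dword (8-byte aligned). */
		d->free_pos = (d->free_pos + 1u) & ~1u;
		d->ins_start = d->free_pos;
		d->buf[d->free_pos++] = 1;	/* size */
		d->buf[d->free_pos++] =
			(SIM_OPCODE_INDEXED << SIM_OPCODE_SHIFT) | reg;
		d->buf[d->free_pos++] = val;
	} else {
		/* Same register: append the value and bump the size. */
		d->buf[d->free_pos++] = val;
		d->buf[d->ins_start]++;
	}

	/* An odd number of dwords is padded with a trailing zero. */
	if (d->free_pos & 1u)
		d->buf[d->free_pos] = 0;
	return 0;
}
```

Three consecutive writes to the same register leave the buffer as size = 3, opcode and offset, value1, value2, value3, zero, which matches the example in the code comment under review. Unlike the patch, the sketch reports a full buffer to the caller through its return value.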
Re: [Intel-gfx] [PATCH v5 07/11] drm/i915/dsb: function to trigger workload execution of DSB.
On 9/7/2019 4:37 PM, Animesh Manna wrote: Batch buffer will be created through dsb-reg-write function which can have single/multiple request based on usecase and once the buffer is ready commit function will trigger the execution of the batch buffer. All the registers will be updated simultaneously. v1: Initial version. v2: Optimized code few places. (Chris) v3: USed DRM_ERROR for dsb head/tail programming failure. (Shashank) Cc: Imre Deak Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 42 drivers/gpu/drm/i915/display/intel_dsb.h | 1 + drivers/gpu/drm/i915/i915_reg.h | 2 ++ 3 files changed, 45 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 56bf41b00f62..853685751540 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -213,3 +213,45 @@ void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val) (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) | i915_mmio_reg_offset(reg); } + +void intel_dsb_commit(struct intel_dsb *dsb) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_device *dev = crtc->base.dev; + struct drm_i915_private *dev_priv = to_i915(dev); + enum pipe pipe = crtc->pipe; + u32 tail; + + if (!dsb->free_pos) I am seeing that in both success and failure cases, we are not returning anything. We have some error messages, but I would still like the caller to know if the commit was successful or not, with a return value. Do you think so Jani? 
- Shashank + return; + + if (!intel_dsb_enable_engine(dsb)) + goto reset; + + if (is_dsb_busy(dsb)) { + DRM_ERROR("HEAD_PTR write failed - dsb engine is busy.\n"); + goto reset; + } + I915_WRITE(DSB_HEAD(pipe, dsb->id), i915_ggtt_offset(dsb->vma)); + + tail = ALIGN(dsb->free_pos * 4, CACHELINE_BYTES); + if (tail > dsb->free_pos * 4) + memset(>cmd_buf[dsb->free_pos], 0, + (tail - dsb->free_pos * 4)); + + if (is_dsb_busy(dsb)) { + DRM_ERROR("TAIL_PTR write failed - dsb engine is busy.\n"); + goto reset; + } + DRM_DEBUG_KMS("DSB execution started - head 0x%x, tail 0x%x\n", + i915_ggtt_offset(dsb->vma), tail); + I915_WRITE(DSB_TAIL(pipe, dsb->id), i915_ggtt_offset(dsb->vma) + tail); + if (wait_for(!is_dsb_busy(dsb), 1)) { + DRM_ERROR("Timed out waiting for DSB workload completion.\n"); + goto reset; + } + +reset: + dsb->free_pos = 0; + intel_dsb_disable_engine(dsb); +} diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h index 9b2522f20bfb..7389c8c5b665 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.h +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -43,5 +43,6 @@ void intel_dsb_put(struct intel_dsb *dsb); void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val); void intel_dsb_indexed_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val); +void intel_dsb_commit(struct intel_dsb *dsb); #endif diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 2df01386e3de..cfb78a2f94fe 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -11680,6 +11680,8 @@ enum skl_power_gate { #define _DSBSL_INSTANCE_BASE 0x70B00 #define DSBSL_INSTANCE(pipe, id) (_DSBSL_INSTANCE_BASE + \ (pipe) * 0x1000 + (id) * 100) +#define DSB_HEAD(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x0) +#define DSB_TAIL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x4) #define DSB_CTRL(pipe, id)_MMIO(DSBSL_INSTANCE(pipe, id) + 0x8) #define DSB_ENABLE (1 << 31) #define DSB_STATUS (1 << 0) 
___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
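The tail computation in intel_dsb_commit() rounds the used part of the command buffer up to a cacheline boundary and zero-fills the gap before the tail pointer is written. A minimal standalone sketch of that arithmetic follows; the 64-byte cacheline size is an assumption, and `align_up()` and `pad_to_cacheline()` are hypothetical helpers standing in for the kernel's ALIGN() and the memset in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_CACHELINE_BYTES 64u	/* assumed cacheline size */

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Zero the dwords between free_pos and the next cacheline boundary and
 * return the tail offset in bytes. The caller must provide a buffer of at
 * least align_up(free_pos * 4, SIM_CACHELINE_BYTES) bytes.
 */
static size_t pad_to_cacheline(uint32_t *buf, size_t free_pos)
{
	size_t used = free_pos * sizeof(uint32_t);
	size_t tail = align_up(used, SIM_CACHELINE_BYTES);

	if (tail > used)
		memset(&buf[free_pos], 0, tail - used);
	return tail;
}
```

With five dwords in use (20 bytes), the tail lands at byte 64 and dwords 5 through 15 are cleared; a buffer that already ends on a cacheline boundary is left unchanged.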
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Show the logical context ring state on dumping
== Series Details == Series: series starting with [1/2] drm/i915: Show the logical context ring state on dumping URL : https://patchwork.freedesktop.org/series/66422/ State : success == Summary == CI Bug Log - changes from CI_DRM_6853 -> Patchwork_14323 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/ New tests - New tests have been introduced between CI_DRM_6853 and Patchwork_14323: ### New IGT tests (1) ### * igt@i915_selftest@live_gt_lrc: - Statuses : 45 pass(s) - Exec time: [0.38, 2.02] s Known issues Here are the changes found in Patchwork_14323 that come from known issues: ### IGT changes ### Issues hit * igt@gem_ctx_exec@basic: - fi-icl-u3: [PASS][1] -> [DMESG-WARN][2] ([fdo#107724]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@gem_ctx_e...@basic.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@gem_ctx_e...@basic.html * igt@i915_module_load@reload: - fi-icl-u3: [PASS][3] -> [DMESG-WARN][4] ([fdo#107724] / [fdo#111214]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@i915_module_l...@reload.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@i915_module_l...@reload.html * igt@kms_chamelium@dp-edid-read: - fi-cml-u2: [PASS][5] -> [FAIL][6] ([fdo#109483]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-cml-u2/igt@kms_chamel...@dp-edid-read.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-cml-u2/igt@kms_chamel...@dp-edid-read.html * igt@prime_vgem@basic-fence-flip: - fi-ilk-650: [PASS][7] -> [DMESG-WARN][8] ([fdo#106387]) +1 similar issue [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-ilk-650/igt@prime_v...@basic-fence-flip.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-ilk-650/igt@prime_v...@basic-fence-flip.html Possible fixes * igt@gem_mmap_gtt@basic-write-read: - fi-icl-u3: [DMESG-WARN][9] ([fdo#107724]) -> 
[PASS][10] +2 similar issues [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6853/fi-icl-u3/igt@gem_mmap_...@basic-write-read.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/fi-icl-u3/igt@gem_mmap_...@basic-write-read.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167 [fdo#106387]: https://bugs.freedesktop.org/show_bug.cgi?id=106387 [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724 [fdo#109483]: https://bugs.freedesktop.org/show_bug.cgi?id=109483 [fdo#111214]: https://bugs.freedesktop.org/show_bug.cgi?id=111214 [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593 Participating hosts (53 -> 47) -- Additional (1): fi-kbl-soraka Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_6853 -> Patchwork_14323 CI-20190529: 20190529 CI_DRM_6853: ad1a8a60aba111d2c186d19391d5a17bd09ab48b @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_14323: 4fe867b7f2167bd9534b401eeba31c56b2ecaeed @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 4fe867b7f216 drm/i915/selftests: Verify the LRC register layout between init and HW 3f29fa89a00e drm/i915: Show the logical context ring state on dumping == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14323/ ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] system freeze on i915 system(s) due to commit aa56a292ce623734ddd30f52d73f527d1f3529b5
‐‐‐ Original Message ‐‐‐
On Monday, September 9, 2019 10:38 AM, wrote:

> With commit aa56a292ce623734ddd30f52d73f527d1f3529b5 (even on 5.3.0-rc8) I
> can get a system freeze during chromium compilation (likely due to jumbo /
> high memory usage). Sysrq still works and CPU/fan is low, so it seems like a
> deadlock? and there's no disk reading. I can't read the dump gotten via kdump
> for some reason, else I would've shown a stacktrace by causing kernel to
> crash via sysrq+c.
>
> I can easily reproduce this freeze in a matter of seconds:
>
> please see https://bugzilla.kernel.org/show_bug.cgi?id=203317#c4
>
> Thanks.

Filed https://bugs.freedesktop.org/show_bug.cgi?id=111601
[Intel-gfx] [PATCH] drm/i915/ringbuffer: Flush writes before RING_TAIL update
Be paranoid and make sure we flush any and all writes out of the WCB
before performing the UC mmio to update the RING_TAIL. (A UC write
should itself be enough to do the flush, hence the paranoia here.) Quite
infrequently, we see problems where the GPU seems to overshoot the
RING_TAIL and so executes garbage, hence the speculation.

References: https://bugs.freedesktop.org/show_bug.cgi?id=111598
References: https://bugs.freedesktop.org/show_bug.cgi?id=111417
References: https://bugs.freedesktop.org/show_bug.cgi?id=111034
Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index bbda85dcaa42..73c3ffc80218 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -930,6 +930,7 @@ static void cancel_requests(struct intel_engine_cs *engine)
 static void i9xx_submit_request(struct i915_request *request)
 {
 	i915_request_submit(request);
+	wmb(); /* paranoid flush writes out of the WCB before mmio */
 
 	ENGINE_WRITE(request->engine, RING_TAIL,
 		     intel_ring_set_tail(request->ring, request->tail));
-- 
2.23.0
[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
== Series Details == Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2 URL : https://patchwork.freedesktop.org/series/66418/ State : failure == Summary == CI Bug Log - changes from CI_DRM_6852_full -> Patchwork_14322_full Summary --- **FAILURE** Serious unknown changes coming with Patchwork_14322_full absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_14322_full, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_14322_full: ### IGT changes ### Possible regressions * igt@kms_universal_plane@cursor-fb-leak-pipe-a: - shard-snb: [PASS][1] -> [FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-snb4/igt@kms_universal_pl...@cursor-fb-leak-pipe-a.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-snb5/igt@kms_universal_pl...@cursor-fb-leak-pipe-a.html * igt@perf@enable-disable: - shard-kbl: [PASS][3] -> [DMESG-WARN][4] +3 similar issues [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-kbl1/igt@p...@enable-disable.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-kbl1/igt@p...@enable-disable.html * igt@perf@gen8-unprivileged-single-ctx-counters: - shard-apl: [PASS][5] -> [DMESG-WARN][6] +1 similar issue [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-apl5/igt@p...@gen8-unprivileged-single-ctx-counters.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-apl1/igt@p...@gen8-unprivileged-single-ctx-counters.html - shard-glk: [PASS][7] -> [DMESG-WARN][8] +2 similar issues [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-glk2/igt@p...@gen8-unprivileged-single-ctx-counters.html [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-glk8/igt@p...@gen8-unprivileged-single-ctx-counters.html - shard-iclb: [PASS][9] -> [DMESG-WARN][10] +1 similar issue [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb8/igt@p...@gen8-unprivileged-single-ctx-counters.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb5/igt@p...@gen8-unprivileged-single-ctx-counters.html * igt@perf@mi-rpc: - shard-skl: NOTRUN -> [DMESG-WARN][11] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-skl3/igt@p...@mi-rpc.html Known issues Here are the changes found in Patchwork_14322_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_schedule@preempt-other-chain-bsd: - shard-iclb: [PASS][12] -> [SKIP][13] ([fdo#111325]) +5 similar issues [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb3/igt@gem_exec_sched...@preempt-other-chain-bsd.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb4/igt@gem_exec_sched...@preempt-other-chain-bsd.html * igt@i915_pm_rpm@system-suspend-modeset: - shard-iclb: [PASS][14] -> [INCOMPLETE][15] ([fdo#107713] / [fdo#108840]) [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb6/igt@i915_pm_...@system-suspend-modeset.html [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb2/igt@i915_pm_...@system-suspend-modeset.html * igt@i915_suspend@fence-restore-tiled2untiled: - shard-apl: [PASS][16] -> [DMESG-WARN][17] ([fdo#108566]) +6 similar issues [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-apl8/igt@i915_susp...@fence-restore-tiled2untiled.html [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-apl7/igt@i915_susp...@fence-restore-tiled2untiled.html * igt@kms_cursor_crc@pipe-c-cursor-64x21-sliding: - shard-iclb: [PASS][18] -> [INCOMPLETE][19] ([fdo#107713]) [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-iclb6/igt@kms_cursor_...@pipe-c-cursor-64x21-sliding.html [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-iclb7/igt@kms_cursor_...@pipe-c-cursor-64x21-sliding.html * igt@kms_cursor_legacy@cursor-vs-flip-toggle: - shard-hsw: [PASS][20] -> [INCOMPLETE][21] ([fdo#103540]) [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/shard-hsw8/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/shard-hsw6/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html * igt@kms_draw_crc@draw-method-rgb565-pwrite-untiled: - shard-snb: [PASS][22] -> [SKIP][23] ([fdo#109271]) +3 similar issues [22]:
Re: [Intel-gfx] [PATCH v5 03/11] drm/i915/dsb: single register write function for DSB.
On 9/7/2019 4:37 PM, Animesh Manna wrote: DSB supports single register writes through opcode 0x1. A generic API is created which accumulates all single register writes in a batch buffer; once DSB is triggered, it programs all the registers at the same time. v1: Initial version. v2: Unused macro removed and cosmetic changes done. (Shashank) Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 30 drivers/gpu/drm/i915/display/intel_dsb.h | 9 +++ 2 files changed, 39 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index cba5c8d37659..150be81fdfb3 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -9,6 +9,13 @@ #define DSB_BUF_SIZE (2 * PAGE_SIZE) +/* DSB opcodes. */ +#define DSB_OPCODE_SHIFT 24 +#define DSB_OPCODE_MMIO_WRITE 0x1 +#define DSB_OPCODE_INDEXED_WRITE 0x9 +#define DSB_BYTE_EN 0xF +#define DSB_BYTE_EN_SHIFT 20 + struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc) { @@ -46,6 +53,7 @@ intel_dsb_get(struct intel_crtc *crtc) goto err; } dsb->vma = vma; + dsb->free_pos = 0; This should be done in dsb_put(); err: intel_runtime_pm_put(&i915->runtime_pm, wakeref); @@ -68,3 +76,25 @@ void intel_dsb_put(struct intel_dsb *dsb) mutex_unlock(&i915->drm.struct_mutex); } } + +void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + u32 *buf = dsb->cmd_buf; + + if (!buf) { + I915_WRITE(reg, val); + return; + } + + if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) { + DRM_DEBUG_KMS("DSB buffer overflow.\n"); Let's remove this '.' at the end, to maintain consistency in the log.
- Shashank + return; + } + + buf[dsb->free_pos++] = val; + buf[dsb->free_pos++] = (DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) | + (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) | + i915_mmio_reg_offset(reg); +} diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h index 27eb68eb5392..31b87dcfe160 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.h +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -6,6 +6,8 @@ #ifndef _INTEL_DSB_H #define _INTEL_DSB_H +#include "i915_reg.h" + struct intel_crtc; struct i915_vma; @@ -21,10 +23,17 @@ struct intel_dsb { enum dsb_id id; u32 *cmd_buf; struct i915_vma *vma; + + /* +* free_pos will point the first free entry position +* and help in calculating tail of command buffer. +*/ + int free_pos; }; struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc); void intel_dsb_put(struct intel_dsb *dsb); +void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val); #endif ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
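The packing done by intel_dsb_reg_write() above can be sketched in plain C. This is a minimal userspace illustration, not the driver code: struct dsb_sketch, dsb_sketch_reg_write() and the capacity field are hypothetical names, and only the four DSB_* constants are taken from the patch. Note that in the patch free_pos counts u32 entries while DSB_BUF_SIZE is a byte count; the sketch assumes the intent is a bound in dwords.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as posted in the patch. */
#define DSB_OPCODE_SHIFT      24
#define DSB_OPCODE_MMIO_WRITE UINT32_C(0x1)
#define DSB_BYTE_EN           UINT32_C(0xF)
#define DSB_BYTE_EN_SHIFT     20

/* Hypothetical stand-in for struct intel_dsb: a dword buffer plus a cursor. */
struct dsb_sketch {
	uint32_t *cmd_buf;  /* command buffer, addressed in dwords */
	size_t free_pos;    /* index of the first free dword */
	size_t capacity;    /* buffer size in dwords (assumption, see above) */
};

/* Append one register write as two dwords: the value, then the header.
 * Returns 0 on success or -1 if the two dwords do not fit. */
static int dsb_sketch_reg_write(struct dsb_sketch *dsb, uint32_t reg_offset,
				uint32_t val)
{
	assert(dsb && dsb->cmd_buf);

	if (dsb->free_pos + 2 > dsb->capacity)
		return -1;

	dsb->cmd_buf[dsb->free_pos++] = val;
	dsb->cmd_buf[dsb->free_pos++] =
		(DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) |
		(DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
		reg_offset;
	return 0;
}
```

With a register offset of 0x70B08 the header dword comes out as 0x01F70B08: opcode 0x1 in bits 31:24, byte enables 0xF in bits 23:20, and the offset in the low bits.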
Re: [Intel-gfx] [PATCH 3/3] drm/i915: cleanup cache-coloring
Quoting Matthew Auld (2019-09-09 13:40:52) > Try to tidy up the cache-coloring such that we rid the code of any > mm.color_adjust assumptions, this should hopefully make it more obvious > in the code when we need to actually use the cache-level as the color, > and as a bonus should make adding a different color-scheme simpler. > > Signed-off-by: Matthew Auld > Cc: Chris Wilson > Cc: Joonas Lahtinen > Cc: Rodrigo Vivi Series is Reviewed-by: Chris Wilson -Chris
Re: [Intel-gfx] [PATCH] drm/i915/tgl: Add sysfs interface to control class-of-service
Quoting Prathap Kumar Valsan (2019-08-26 01:48:01) > To provide shared last-level-cache isolation to cpu workloads running > concurrently with gpu workloads, the gpu allocation of cache lines needs > to be restricted to certain ways. Currently GPU hardware supports four > class-of-service(CLOS) levels and there is an associated way-mask for > each CLOS. > > Hardware supports reading supported way-mask configuration for GPU using > a bios pcode interface. The supported way-masks and the one currently > active is communicated to userspace via a sysfs file--closctrl. Admin user > can then select a new mask by writing the mask value to the file. > > Note of Caution: Restricting cache ways using this mechanism presents a > larger attack surface for side-channel attacks. I wonder if this is enough to justify some further protection before enabling? > Example usage: > The active way-mask is highlighted within square brackets. > > cat /sys/class/drm/card0/closctrl > [0x] 0xff00 0xc000 0x8000 How about two files for easier scripting interface? /sys/class/drm/card0/llc_clos /sys/class/drm/card0/llc_clos_modes Regards, Joonas
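To make the scripting concern concrete, here is a rough sketch of what a consumer of the single-file format would have to do to find the active way-mask. It assumes the bracketed-entry layout from the example usage; clos_active_mask() is a hypothetical helper, not part of any posted patch, and the sample strings in the note below are made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the mask enclosed in square brackets on a closctrl-style line,
 * or -1 if no well-formed "[0x...]" entry is present. */
static long clos_active_mask(const char *line)
{
	const char *open;
	char *end;
	unsigned long mask;

	assert(line);

	open = strchr(line, '[');
	if (!open)
		return -1;

	/* Base 16 accepts an optional "0x" prefix. */
	mask = strtoul(open + 1, &end, 16);
	if (end == open + 1 || *end != ']')
		return -1;

	return (long)mask;
}
```

For a made-up line such as "0xffff [0xff00] 0xc000" this returns 0xff00. With one file for the current value and one for the supported modes, as suggested above, a script would not need to scan for brackets at all.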
Re: [Intel-gfx] [PATCH] drm/i915/execlists: Remove incorrect BUG_ON for schedule-out
Quoting Tvrtko Ursulin (2019-09-09 11:23:56) > > On 07/09/2019 11:50, Chris Wilson wrote: > > As we may unwind incomplete requests (for preemption) prior to > > processing the CSB and the schedule-out events, we may update rq->engine > > (resetting it to point back to the parent virtual engine) prior to > > calling execlists_schedule_out(), invalidating the assertion that the > > request still points to the inflight engine. (The likelihood of this is > > increased if the CSB interrupt processing is pushed to the ksoftirqd for > > being too slow and direct submission overtakes it.) > > > > Reported-by: Vinay Belgaumkar > > Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the > > irq-off spinlock") > > Signed-off-by: Chris Wilson > > Cc: Mika Kuoppala > > Cc: Tvrtko Ursulin > > Cc: Vinay Belgaumkar > > --- > > drivers/gpu/drm/i915/gt/intel_lrc.c | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > > b/drivers/gpu/drm/i915/gt/intel_lrc.c > > index 3aad35b570d4..16f226349525 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > > @@ -631,7 +631,6 @@ execlists_schedule_out(struct i915_request *rq) > > struct intel_engine_cs *cur, *old; > > > > trace_i915_request_out(rq); > > - GEM_BUG_ON(intel_context_inflight(ce) != rq->engine); > > > > old = READ_ONCE(ce->inflight); > > do > > > > So unwind from direct submission resets rq->engine and races with > process_csb from the tasklet which notices request has actually > completed? Yup. That's nice and succinct compared to my waffle. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915: Show the logical context ring state on dumping
== Series Details == Series: series starting with [1/2] drm/i915: Show the logical context ring state on dumping URL : https://patchwork.freedesktop.org/series/66422/ State : warning == Summary == $ dim checkpatch origin/drm-tip 3f29fa89a00e drm/i915: Show the logical context ring state on dumping 4fe867b7f216 drm/i915/selftests: Verify the LRC register layout between init and HW -:60: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'x' - possible side-effects? #60: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:473: +#define REG(x) (((x) >> 2) | BUILD_BUG_ON_ZERO(x >= 0x200)) -:61: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses #61: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:474: +#define REG16(x) \ + (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \ + (((x) >> 2) & 0x7f) -:61: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'x' - possible side-effects? #61: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:474: +#define REG16(x) \ + (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \ + (((x) >> 2) & 0x7f) total: 1 errors, 0 warnings, 2 checks, 1085 lines checked ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
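For readers unfamiliar with the MACRO_ARG_REUSE check, the following sketch shows the hazard it looks for. TWICE_MACRO, twice_fn and next are hypothetical names for illustration only. As far as I know, BUILD_BUG_ON_ZERO() places its argument inside a sizeof, so the reuse flagged in REG() and REG16() is not evaluated at run time; the last part of the sketch shows the same effect with a plain sizeof.

```c
#include <assert.h>

/* Hypothetical macro that evaluates its argument twice at run time. */
#define TWICE_MACRO(x) ((x) + (x))

/* Function equivalent: the argument is evaluated exactly once. */
static inline unsigned int twice_fn(unsigned int x)
{
	return x + x;
}

/* Side-effecting helper: bumps the counter and returns its new value. */
static unsigned int next(unsigned int *counter)
{
	return ++*counter;
}

static void demo_macro_arg_reuse(void)
{
	unsigned int n = 0;
	unsigned int a = TWICE_MACRO(next(&n)); /* next() runs twice */

	assert(n == 2 && a == 3);

	n = 0;
	a = twice_fn(next(&n));                 /* next() runs once */
	assert(n == 1 && a == 2);

	n = 0;
	(void)sizeof(next(&n));                 /* operand is not evaluated */
	assert(n == 0);
}
```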
Re: [Intel-gfx] [PATCH] drm/i915/execlists: Remove incorrect BUG_ON for schedule-out
On 07/09/2019 11:50, Chris Wilson wrote: As we may unwind incomplete requests (for preemption) prior to processing the CSB and the schedule-out events, we may update rq->engine (resetting it to point back to the parent virtual engine) prior to calling execlists_schedule_out(), invalidating the assertion that the request still points to the inflight engine. (The likelihood of this is increased if the CSB interrupt processing is pushed to the ksoftirqd for being too slow and direct submission overtakes it.) Reported-by: Vinay Belgaumkar Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Tvrtko Ursulin Cc: Vinay Belgaumkar --- drivers/gpu/drm/i915/gt/intel_lrc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 3aad35b570d4..16f226349525 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -631,7 +631,6 @@ execlists_schedule_out(struct i915_request *rq) struct intel_engine_cs *cur, *old; trace_i915_request_out(rq); - GEM_BUG_ON(intel_context_inflight(ce) != rq->engine); old = READ_ONCE(ce->inflight); do So unwind from direct submission resets rq->engine and races with process_csb from the tasklet which notices request has actually completed? Seems to hold true in code. Reviewed-by: Tvrtko Ursulin Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] system freeze on i915 system(s) due to commit aa56a292ce623734ddd30f52d73f527d1f3529b5
With commit aa56a292ce623734ddd30f52d73f527d1f3529b5 (even on 5.3.0-rc8) I can get a system freeze during chromium compilation (likely due to jumbo / high memory usage). Sysrq still works, CPU usage and fan speed are low, and there is no disk activity, so it looks like a deadlock. For some reason I can't read the dump captured via kdump; otherwise I would have included a stack trace obtained by crashing the kernel via sysrq+c. I can easily reproduce this freeze in a matter of seconds: please see https://bugzilla.kernel.org/show_bug.cgi?id=203317#c4 Thanks.
Re: [Intel-gfx] [PATCH v5 05/11] drm/i915/dsb: Check DSB engine status.
On 9/7/2019 4:37 PM, Animesh Manna wrote: As per bspec, check the DSB status before programming any of its registers. An inline function is added to check the DSB status. Cc: Michel Thierry Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 9 + drivers/gpu/drm/i915/i915_reg.h | 7 +++ 2 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 0f55ed683d41..2c8415518c65 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -17,6 +17,15 @@ #define DSB_BYTE_EN_SHIFT 20 #define DSB_REG_VALUE_MASK 0xf +static inline bool is_dsb_busy(struct intel_dsb *dsb) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + enum pipe pipe = crtc->pipe; + + return DSB_STATUS & I915_READ(DSB_CTRL(pipe, dsb->id)); +} + struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc) { diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 006cffd56be2..a3099f712ae6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -11676,4 +11676,11 @@ enum skl_power_gate { #define PORT_TX_DFLEXDPCSSS(fia) _MMIO_FIA((fia), 0x00894) #define DP_PHY_MODE_STATUS_NOT_SAFE(tc_port) (1 << (tc_port)) +/* This register controls the Display State Buffer (DSB) engines. */ +#define _DSBSL_INSTANCE_BASE 0x70B00 +#define DSBSL_INSTANCE(pipe, id) (_DSBSL_INSTANCE_BASE + \ + (pipe) * 0x1000 + (id) * 100) Why is pipe in ()? - Shashank +#define DSB_CTRL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x8) +#define DSB_STATUS (1 << 0) + #endif /* _I915_REG_H_ */
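On the question of why pipe is wrapped in (): the usual reason is macro hygiene, since a macro argument may be an expression rather than a single identifier. The sketch below is a userspace illustration with hypothetical macro names; the base address, the 0x1000 pipe stride and the (decimal) 100 id stride are copied verbatim from the patch as posted.

```c
#include <assert.h>

#define DSBSL_BASE 0x70B00

/* Without parentheses, operator precedence leaks in from the caller. */
#define INSTANCE_PLAIN(pipe, id) (DSBSL_BASE + pipe * 0x1000 + id * 100)

/* With parentheses, each argument is treated as one unit. */
#define INSTANCE_PAREN(pipe, id) (DSBSL_BASE + (pipe) * 0x1000 + (id) * 100)

static void demo_macro_parens(void)
{
	int first_pipe = 1;

	/* Intended: base + 2 * 0x1000. */
	assert(INSTANCE_PAREN(first_pipe + 1, 0) == 0x72B00);

	/* Expands to base + first_pipe + 1 * 0x1000, which is wrong. */
	assert(INSTANCE_PLAIN(first_pipe + 1, 0) == 0x71B01);
}
```

Callers that only ever pass a bare enum value would get the same result from both forms, so the parentheses are a defensive convention rather than a functional change.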
Re: [Intel-gfx] [PATCH v5 06/11] drm/i915/dsb: functions to enable/disable DSB engine.
On 9/7/2019 4:37 PM, Animesh Manna wrote: DSB will be used to improve performance in some special scenarios. The DSB engine is enabled on demand and disabled once its work completes. An API is added for the enable/disable operations using the DSB_CTRL register. v1: Initial version. v2: POSTING_READ added after writing control register. (Shashank) Cc: Michel Thierry Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 42 drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 43 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 2c8415518c65..56bf41b00f62 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -26,6 +26,48 @@ static inline bool is_dsb_busy(struct intel_dsb *dsb) return DSB_STATUS & I915_READ(DSB_CTRL(pipe, dsb->id)); } +static inline bool intel_dsb_enable_engine(struct intel_dsb *dsb) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + enum pipe pipe = crtc->pipe; + u32 dsb_ctrl; + + dsb_ctrl = I915_READ(DSB_CTRL(pipe, dsb->id)); + This space is not required. + if (DSB_STATUS & dsb_ctrl) { + DRM_DEBUG_KMS("DSB engine is busy.\n"); + return false; + } + + dsb_ctrl |= DSB_ENABLE; + I915_WRITE(DSB_CTRL(pipe, dsb->id), dsb_ctrl); + + POSTING_READ(DSB_CTRL(pipe, dsb->id)); + return true; +} + +static inline bool intel_dsb_disable_engine(struct intel_dsb *dsb) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + enum pipe pipe = crtc->pipe; + u32 dsb_ctrl; + + dsb_ctrl = I915_READ(DSB_CTRL(pipe, dsb->id)); + Same here.
+ if (DSB_STATUS & dsb_ctrl) { + DRM_DEBUG_KMS("DSB engine is busy.\n"); + return false; + } + + dsb_ctrl &= ~DSB_ENABLE; + I915_WRITE(DSB_CTRL(pipe, dsb->id), dsb_ctrl); + + POSTING_READ(DSB_CTRL(pipe, dsb->id)); + return true; +} + struct intel_dsb * intel_dsb_get(struct intel_crtc *crtc) { diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index a3099f712ae6..2df01386e3de 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -11681,6 +11681,7 @@ enum skl_power_gate { #define DSBSL_INSTANCE(pipe, id) (_DSBSL_INSTANCE_BASE + \ (pipe) * 0x1000 + (id) * 100) #define DSB_CTRL(pipe, id) _MMIO(DSBSL_INSTANCE(pipe, id) + 0x8) +#define DSB_ENABLE (1 << 31) #define DSB_STATUS (1 << 0) #endif /* _I915_REG_H_ */
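The enable and disable helpers above differ only in whether DSB_ENABLE is set or cleared, so the read-modify-write pattern can be sketched once. This is a userspace model under stated assumptions: the register is a plain variable instead of MMIO, dsb_sketch_set_enable() is a hypothetical name, and the bit definitions use an unsigned constant because shifting a signed 1 into bit 31 is not well defined in ISO C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in the patch, written with an unsigned type. */
#define DSB_ENABLE (UINT32_C(1) << 31)
#define DSB_STATUS (UINT32_C(1) << 0)

/* Model of the control register update: refuse while the engine reports
 * busy, otherwise set or clear the enable bit and write the value back. */
static bool dsb_sketch_set_enable(uint32_t *ctrl, bool enable)
{
	uint32_t val;

	assert(ctrl);

	val = *ctrl;                 /* stands in for I915_READ() */
	if (val & DSB_STATUS)
		return false;        /* engine busy, leave register untouched */

	if (enable)
		val |= DSB_ENABLE;
	else
		val &= ~DSB_ENABLE;

	*ctrl = val;                 /* stands in for I915_WRITE() */
	return true;
}
```

The posting read from the patch has no equivalent here, since a plain variable has no write buffering to flush.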
Re: [Intel-gfx] [PATCH v5 08/11] drm/i915/dsb: added dsb refcount to synchronize between get/put.
On 9/7/2019 4:37 PM, Animesh Manna wrote: The lifetime of the command buffer can be controlled by the DSB user through a refcount. A refcount mechanism is added to the dsb get/put calls, which create/destroy the DSB context. Cc: Jani Nikula Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/display/intel_dsb.c | 22 -- drivers/gpu/drm/i915/display/intel_dsb.h | 1 + 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 853685751540..b951a6b5264a 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -78,7 +78,12 @@ intel_dsb_get(struct intel_crtc *crtc) struct intel_dsb *dsb = &crtc->dsb; intel_wakeref_t wakeref; - if ((!HAS_DSB(i915)) || dsb->cmd_buf) + if (!HAS_DSB(i915)) + return dsb; + + atomic_inc(&dsb->refcount); + As discussed, we are not solving any problem with reference counting; rather, we are adding complexity here. It may be useful when we extend the single DSB instance to a DSB pool, but not right now. I would say we drop this patch altogether and just keep the simple implementation for now.
- Shashank + if (dsb->cmd_buf) return dsb; dsb->id = DSB1; @@ -94,6 +99,7 @@ intel_dsb_get(struct intel_crtc *crtc) if (IS_ERR(vma)) { DRM_ERROR("Vma creation failed.\n"); i915_gem_object_put(obj); + atomic_dec(&dsb->refcount); goto err; } @@ -102,6 +108,7 @@ intel_dsb_get(struct intel_crtc *crtc) DRM_ERROR("Command buffer creation failed.\n"); i915_vma_unpin_and_release(&vma, 0); dsb->cmd_buf = NULL; + atomic_dec(&dsb->refcount); goto err; } dsb->vma = vma; @@ -121,11 +128,14 @@ void intel_dsb_put(struct intel_dsb *dsb) return; if (dsb->cmd_buf) { - mutex_lock(&i915->drm.struct_mutex); - i915_gem_object_unpin_map(dsb->vma->obj); - i915_vma_unpin_and_release(&dsb->vma, 0); - dsb->cmd_buf = NULL; - mutex_unlock(&i915->drm.struct_mutex); + atomic_dec(&dsb->refcount); + if (!atomic_read(&dsb->refcount)) { + mutex_lock(&i915->drm.struct_mutex); + i915_gem_object_unpin_map(dsb->vma->obj); + i915_vma_unpin_and_release(&dsb->vma, 0); + dsb->cmd_buf = NULL; + mutex_unlock(&i915->drm.struct_mutex); + } } } diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h index 7389c8c5b665..dca4e632dd3c 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.h +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -20,6 +20,7 @@ enum dsb_id { }; struct intel_dsb { + atomic_t refcount; enum dsb_id id; u32 *cmd_buf; struct i915_vma *vma;
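Independent of whether the patch is kept, one detail in intel_dsb_put() is worth illustrating: atomic_dec() followed by a separate atomic_read() is two operations, so two concurrent callers could both, or neither, observe zero. The kernel offers atomic_dec_and_test() (and kref) for this. The sketch below shows the same idea in userspace C11; struct ref_sketch, ref_get() and ref_put() are hypothetical names, not driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object. */
struct ref_sketch {
	atomic_int refcount;
};

static void ref_get(struct ref_sketch *r)
{
	atomic_fetch_add(&r->refcount, 1);
}

/* Drop one reference. Returns true only for the caller that released the
 * last one; the decrement and the zero check are a single atomic step. */
static bool ref_put(struct ref_sketch *r)
{
	int old = atomic_fetch_sub(&r->refcount, 1);

	assert(old > 0); /* a put without a matching get is a caller bug */
	return old == 1;
}
```

Only the caller that sees true would go on to unpin and release the buffer, which avoids both a double release and a missed release.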
Re: [Intel-gfx] [PATCH v4 3/7] drm: Add DisplayPort colorspace property
On Sat, Sep 07, 2019 at 11:19:55PM +, Mun, Gwan-gyeong wrote: > On Fri, 2019-09-06 at 09:24 -0400, Ilia Mirkin wrote: > > On Fri, Sep 6, 2019 at 7:43 AM Ville Syrjälä > > wrote: > > > On Fri, Sep 06, 2019 at 11:31:55AM +, Shankar, Uma wrote: > > > > > > > > > -Original Message- > > > > > From: Ilia Mirkin > > > > > Sent: Tuesday, September 3, 2019 6:12 PM > > > > > To: Mun, Gwan-gyeong > > > > > Cc: Intel Graphics Development > > > > >; Shankar, Uma > > > > > ; dri-devel < > > > > > dri-de...@lists.freedesktop.org> > > > > > Subject: Re: [PATCH v4 3/7] drm: Add DisplayPort colorspace > > > > > property > > > > > > > > > > So how would this work with a DP++ connector? Should it list > > > > > the HDMI or DP > > > > > properties? Or do we need a custom property checker which is > > > > > aware of what is > > > > > currently plugged in to validate the values? > > > > > > > > AFAIU For DP++ cases, we detect what kind of sink its driving DP > > > > or HDMI (with a passive dongle). > > > > Based on the type of sink detected, we should expose DP or HDMI > > > > colorspaces to userspace. > > > > > > For i915 DP connector always drives DP mode, HDMI connector always > > > drives > > > HDMI mode, even when the physical connector is DP++. > > > > Right, i915 creates 2 connectors, while nouveau, radeon, and amdgpu > > create 1 connector (not sure about other drivers) for a single > > physical DP++ socket. Since we supply the list of valid values at the > > time of creating the connector, we can't know at that point whether > > in > > the future a HDMI or DP will be plugged into it. > > > > -ilia > Ilia, does it mean that the drm_connector type is > DRM_MODE_CONNECTOR_DisplayPort and protocol is DP++ mode? > > And Ville and Uma, when we are useing dp active dongle (DP to HDMI > dongle and DP branch device is HDMI) should we expose HDMI colorspace? We still set it up via DP MSA/VSC no? In that case it should follow the DP spec I think. 
LSPCON is probably different because we manually generate the AVI infoframe for it. But I'm not sure how we're going to reconcile that with the DP stuff we also set up for it. -- Ville Syrjälä Intel
[Intel-gfx] [PATCH 2/3] drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
Make it clear that the color adjust callback applies to the ggtt. Signed-off-by: Matthew Auld Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_gem_gtt.c | 10 +- drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 095f5e358a58..48688d683e95 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2547,10 +2547,10 @@ static int ggtt_set_pages(struct i915_vma *vma) return 0; } -static void i915_gtt_color_adjust(const struct drm_mm_node *node, - unsigned long color, - u64 *start, - u64 *end) +static void i915_ggtt_color_adjust(const struct drm_mm_node *node, + unsigned long color, + u64 *start, + u64 *end) { if (i915_node_color_differs(node, color)) *start += I915_GTT_PAGE_SIZE; @@ -3206,7 +3206,7 @@ static int ggtt_init_hw(struct i915_ggtt *ggtt) ggtt->vm.has_read_only = IS_VALLEYVIEW(i915); if (!HAS_LLC(i915) && !HAS_PPGTT(i915)) - ggtt->vm.mm.color_adjust = i915_gtt_color_adjust; + ggtt->vm.mm.color_adjust = i915_ggtt_color_adjust; if (!io_mapping_init_wc(&ggtt->iomap, ggtt->gmadr.start, diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c index cb30c669b1b7..fca38167bdce 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c @@ -276,7 +276,7 @@ static int igt_evict_for_cache_color(void *arg) /* Currently the use of color_adjust is limited to cache domains within * the ggtt, and so the presence of mm.color_adjust is assumed to be - * i915_gtt_color_adjust throughout our driver, so using a mock color + * i915_ggtt_color_adjust throughout our driver, so using a mock color * adjust will work just fine for our purposes.
*/ ggtt->vm.mm.color_adjust = mock_color_adjust; -- 2.20.1
[Intel-gfx] [PATCH 3/3] drm/i915: cleanup cache-coloring
Try to tidy up the cache-coloring such that we rid the code of any mm.color_adjust assumptions, this should hopefully make it more obvious in the code when we need to actually use the cache-level as the color, and as a bonus should make adding a different color-scheme simpler. Signed-off-by: Matthew Auld Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/gem/i915_gem_domain.c| 6 +++-- drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gem_evict.c | 12 +- drivers/gpu/drm/i915/i915_gem_gtt.h | 6 + drivers/gpu/drm/i915/i915_vma.c | 22 +-- drivers/gpu/drm/i915/i915_vma.h | 2 +- .../gpu/drm/i915/selftests/i915_gem_evict.c | 10 + 7 files changed, 34 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 6af740a5e3db..da3e7cf12aa1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -294,8 +294,10 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, } } - list_for_each_entry(vma, &obj->vma.list, obj_link) - vma->node.color = cache_level; + list_for_each_entry(vma, &obj->vma.list, obj_link) { + if (i915_vm_has_cache_coloring(vma->vm)) + vma->node.color = cache_level; + } i915_gem_object_set_cache_coherency(obj, cache_level); obj->cache_dirty = true; /* Always invalidate stale cachelines */ diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index db7480831e52..e289b4ffd34b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2364,7 +2364,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment, - unsigned cache_level, + unsigned long color, u64 start, u64 end, unsigned flags); int __must_check i915_gem_evict_for_node(struct i915_address_space *vm, diff --git 
a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 52c86c6e0673..e76c9da9992d 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -70,7 +70,7 @@ mark_free(struct drm_mm_scan *scan, * @vm: address space to evict from * @min_size: size of the desired free space * @alignment: alignment constraint of the desired free space - * @cache_level: cache_level for the desired space + * @color: color for the desired space * @start: start (inclusive) of the range from which to evict objects * @end: end (exclusive) of the range from which to evict objects * @flags: additional flags to control the eviction algorithm @@ -91,7 +91,7 @@ mark_free(struct drm_mm_scan *scan, int i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment, - unsigned cache_level, + unsigned long color, u64 start, u64 end, unsigned flags) { @@ -124,7 +124,7 @@ i915_gem_evict_something(struct i915_address_space *vm, if (flags & PIN_MAPPABLE) mode = DRM_MM_INSERT_LOW; drm_mm_scan_init_with_range(&scan, &vm->mm, - min_size, alignment, cache_level, + min_size, alignment, color, start, end, mode); /* @@ -266,7 +266,6 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, u64 start = target->start; u64 end = start + target->size; struct i915_vma *vma, *next; - bool check_color; int ret = 0; lockdep_assert_held(&vm->i915->drm.struct_mutex); @@ -283,8 +282,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, if (!(flags & PIN_NONBLOCK)) i915_retire_requests(vm->i915); - check_color = vm->mm.color_adjust; - if (check_color) { + if (i915_vm_has_cache_coloring(vm)) { /* Expand search to cover neighbouring guard pages (or lack!) */ if (start) start -= I915_GTT_PAGE_SIZE; @@ -310,7 +308,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, * abutt and conflict. If they are in conflict, then we evict * those as well to make room for our guard pages. 
*/ - if (check_color) { + if (i915_vm_has_cache_coloring(vm)) { if (node->start + node->size ==
[Intel-gfx] [PATCH 1/3] drm/i915: export color_differs
Export color_differs so that we can use it elsewhere. Signed-off-by: Matthew Auld Cc: Chris Wilson Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 11 --- drivers/gpu/drm/i915/i915_vma.h | 6 ++ 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 906dc6fff383..095f5e358a58 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2552,7 +2552,7 @@ static void i915_gtt_color_adjust(const struct drm_mm_node *node, u64 *start, u64 *end) { - if (node->allocated && node->color != color) + if (i915_node_color_differs(node, color)) *start += I915_GTT_PAGE_SIZE; /* Also leave a space between the unallocated reserved node after the diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index e0e677b2a3a9..a90bd2678353 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -477,11 +477,6 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma *vma) vma->flags &= ~I915_VMA_CAN_FENCE; } -static bool color_differs(struct drm_mm_node *node, unsigned long color) -{ - return node->allocated && node->color != color; -} - bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level) { struct drm_mm_node *node = &vma->node; @@ -502,11 +497,13 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level) GEM_BUG_ON(list_empty(&node->node_list)); other = list_prev_entry(node, node_list); - if (color_differs(other, cache_level) && !drm_mm_hole_follows(other)) + if (i915_node_color_differs(other, cache_level) && + !drm_mm_hole_follows(other)) return false; other = list_next_entry(node, node_list); - if (color_differs(other, cache_level) && !drm_mm_hole_follows(node)) + if (i915_node_color_differs(other, cache_level) && + !drm_mm_hole_follows(node)) return false; return true; diff --git 
a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 889fc7cb910a..5b1e0cf7669d 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -373,6 +373,12 @@ static inline bool i915_vma_is_bound(const struct i915_vma *vma, return vma->flags & where; } +static inline bool i915_node_color_differs(const struct drm_mm_node *node, + unsigned long color) +{ + return node->allocated && node->color != color; +} + /** * i915_vma_pin_iomap - calls ioremap_wc to map the GGTT VMA via the aperture * @vma: VMA to iomap -- 2.20.1
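As a rough illustration of what the exported helper is used for, the sketch below models the start-side half of the color adjust callback: when the preceding node is allocated with a different color, the usable range is pushed forward by one guard page. The types and names here (struct node_sketch, color_adjust_start) are hypothetical stand-ins for drm_mm_node and the driver callback, the 4096-byte page size is an assumption, and the end-side adjustment is left out because it is not fully visible in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed guard page size for the sketch. */
#define GTT_PAGE_SIZE_SKETCH UINT64_C(4096)

/* Hypothetical stand-in for the two drm_mm_node fields the helper reads. */
struct node_sketch {
	bool allocated;
	unsigned long color;
};

/* Same predicate as i915_node_color_differs() in the patch. */
static bool node_color_differs(const struct node_sketch *node,
			       unsigned long color)
{
	return node->allocated && node->color != color;
}

/* Leave a one-page gap after a neighbour of a different color. */
static void color_adjust_start(const struct node_sketch *prev,
			       unsigned long color, uint64_t *start)
{
	assert(prev && start);

	if (node_color_differs(prev, color))
		*start += GTT_PAGE_SIZE_SKETCH;
}
```

An unallocated neighbour, or one with the same color, leaves the start untouched, which is why the predicate checks the allocated flag first.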
Re: [Intel-gfx] [PATCH v5 02/11] drm/i915/dsb: DSB context creation.
On 9/7/2019 4:37 PM, Animesh Manna wrote: This patch adds a function which internally gets the GEM buffer for the DSB engine. The GEM buffer is from the global GTT, is mapped into the CPU domain, and contains the data + opcode to be fed to the DSB engine. v1: Initial version. v2: - removed some unwanted code. (Chris) - Used i915_gem_object_create_internal instead of _shmem. (Chris) - cmd_buf_tail removed and can be derived through vma object. (Chris) v3: vma released if i915_gem_object_pin_map() failed. (Shashank) v4: for simplification and based on current usage added single dsb object in intel_crtc. (Shashank) Cc: Imre Deak Cc: Michel Thierry Cc: Jani Nikula Cc: Rodrigo Vivi Cc: Shashank Sharma Signed-off-by: Animesh Manna --- drivers/gpu/drm/i915/Makefile | 1 + .../drm/i915/display/intel_display_types.h| 3 + drivers/gpu/drm/i915/display/intel_dsb.c | 70 +++ drivers/gpu/drm/i915/display/intel_dsb.h | 30 drivers/gpu/drm/i915/i915_drv.h | 1 + 5 files changed, 105 insertions(+) create mode 100644 drivers/gpu/drm/i915/display/intel_dsb.c create mode 100644 drivers/gpu/drm/i915/display/intel_dsb.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 658b930d34a8..6313e7b4bd78 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -172,6 +172,7 @@ i915-y += \ display/intel_display_power.o \ display/intel_dpio_phy.o \ display/intel_dpll_mgr.o \ + display/intel_dsb.o \ display/intel_fbc.o \ display/intel_fifo_underrun.o \ display/intel_frontbuffer.o \ diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index d5cc4b810d9e..49c902b00484 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -1033,6 +1033,9 @@ struct intel_crtc { /* scalers available on this crtc */ int num_scalers; + + /* per pipe DSB related info */ + struct intel_dsb dsb; }; struct intel_plane { diff --git 
a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c new file mode 100644 index ..cba5c8d37659 --- /dev/null +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -0,0 +1,70 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + * + */ + +#include "i915_drv.h" +#include "intel_display_types.h" + +#define DSB_BUF_SIZE (2 * PAGE_SIZE) + +struct intel_dsb * +intel_dsb_get(struct intel_crtc *crtc) +{ + struct drm_device *dev = crtc->base.dev; + struct drm_i915_private *i915 = to_i915(dev); + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + struct intel_dsb *dsb = &crtc->dsb; + intel_wakeref_t wakeref; + + if ((!HAS_DSB(i915)) || dsb->cmd_buf) + return dsb; + + dsb->id = DSB1; + wakeref = intel_runtime_pm_get(&i915->runtime_pm); + + obj = i915_gem_object_create_internal(i915, DSB_BUF_SIZE); + if (IS_ERR(obj)) + goto err; + + mutex_lock(&i915->drm.struct_mutex); + vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE); + mutex_unlock(&i915->drm.struct_mutex); + if (IS_ERR(vma)) { + DRM_ERROR("Vma creation failed.\n"); + i915_gem_object_put(obj); + goto err; + } + + dsb->cmd_buf = i915_gem_object_pin_map(vma->obj, I915_MAP_WC); + if (IS_ERR(dsb->cmd_buf)) { + DRM_ERROR("Command buffer creation failed.\n"); + i915_vma_unpin_and_release(&vma, 0); + dsb->cmd_buf = NULL; + goto err; + } + dsb->vma = vma; + +err: + intel_runtime_pm_put(&i915->runtime_pm, wakeref); + return dsb; +} + +void intel_dsb_put(struct intel_dsb *dsb) +{ + struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb); + struct drm_i915_private *i915 = to_i915(crtc->base.dev); + + if (!dsb) + return; + + if (dsb->cmd_buf) { + mutex_lock(&i915->drm.struct_mutex); + i915_gem_object_unpin_map(dsb->vma->obj); + i915_vma_unpin_and_release(&dsb->vma, 0); + dsb->cmd_buf = NULL; This can be done outside mutex_unlock(); - Shashank + mutex_unlock(&i915->drm.struct_mutex); + } +} diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h new file mode 
100644 index ..27eb68eb5392 --- /dev/null +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: MIT + * + * Copyright © 2019 Intel Corporation + */ + +#ifndef _INTEL_DSB_H +#define _INTEL_DSB_H + +struct intel_crtc; +struct i915_vma; + +enum dsb_id { + INVALID_DSB = -1, + DSB1, + DSB2, + DSB3, + MAX_DSB_PER_PIPE +}; + +struct intel_dsb { + enum
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI
== Series Details == Series: drm/i915/display: Mark the modesetting wq as WQ_HIGHPRI URL : https://patchwork.freedesktop.org/series/66439/ State : success == Summary == CI Bug Log - changes from CI_DRM_6854_full -> Patchwork_14330_full Summary --- **SUCCESS** No regressions found. Known issues Here are the changes found in Patchwork_14330_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_reloc@basic-cpu-active: - shard-skl: [PASS][1] -> [DMESG-WARN][2] ([fdo#106107]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl7/igt@gem_exec_re...@basic-cpu-active.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl4/igt@gem_exec_re...@basic-cpu-active.html * igt@gem_exec_schedule@preempt-queue-bsd1: - shard-iclb: [PASS][3] -> [SKIP][4] ([fdo#109276]) +11 similar issues [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb1/igt@gem_exec_sched...@preempt-queue-bsd1.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb8/igt@gem_exec_sched...@preempt-queue-bsd1.html * igt@gem_exec_schedule@preemptive-hang-bsd: - shard-iclb: [PASS][5] -> [SKIP][6] ([fdo#111325]) +3 similar issues [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb5/igt@gem_exec_sched...@preemptive-hang-bsd.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb1/igt@gem_exec_sched...@preemptive-hang-bsd.html * igt@i915_suspend@forcewake: - shard-skl: [PASS][7] -> [INCOMPLETE][8] ([fdo#104108]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl7/igt@i915_susp...@forcewake.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl1/igt@i915_susp...@forcewake.html * igt@kms_flip@flip-vs-suspend-interruptible: - shard-iclb: [PASS][9] -> [INCOMPLETE][10] ([fdo#107713] / [fdo#109507]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_f...@flip-vs-suspend-interruptible.html [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb7/igt@kms_f...@flip-vs-suspend-interruptible.html * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-render: - shard-iclb: [PASS][11] -> [FAIL][12] ([fdo#103167]) +1 similar issue [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb7/igt@kms_frontbuffer_track...@fbc-1p-primscrn-spr-indfb-draw-render.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb2/igt@kms_frontbuffer_track...@fbc-1p-primscrn-spr-indfb-draw-render.html * igt@kms_frontbuffer_tracking@psr-suspend: - shard-skl: [PASS][13] -> [INCOMPLETE][14] ([fdo#104108] / [fdo#106978]) [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-skl9/igt@kms_frontbuffer_track...@psr-suspend.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-skl6/igt@kms_frontbuffer_track...@psr-suspend.html * igt@kms_plane@pixel-format-pipe-b-planes: - shard-hsw: [PASS][15] -> [INCOMPLETE][16] ([fdo#103540]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-hsw7/igt@kms_pl...@pixel-format-pipe-b-planes.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-hsw4/igt@kms_pl...@pixel-format-pipe-b-planes.html * igt@kms_psr2_su@frontbuffer: - shard-iclb: [PASS][17] -> [SKIP][18] ([fdo#109642] / [fdo#111068]) [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_psr2...@frontbuffer.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb6/igt@kms_psr2...@frontbuffer.html * igt@kms_psr@psr2_sprite_mmap_gtt: - shard-iclb: [PASS][19] -> [SKIP][20] ([fdo#109441]) +1 similar issue [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_gtt.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-iclb5/igt@kms_psr@psr2_sprite_mmap_gtt.html * igt@kms_vblank@pipe-a-ts-continuation-suspend: - shard-apl: [PASS][21] -> [DMESG-WARN][22] ([fdo#108566]) +1 
similar issue [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-apl2/igt@kms_vbl...@pipe-a-ts-continuation-suspend.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-apl8/igt@kms_vbl...@pipe-a-ts-continuation-suspend.html * igt@kms_vblank@pipe-b-ts-continuation-suspend: - shard-kbl: [PASS][23] -> [INCOMPLETE][24] ([fdo#103665]) [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/shard-kbl2/igt@kms_vbl...@pipe-b-ts-continuation-suspend.html [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14330/shard-kbl6/igt@kms_vbl...@pipe-b-ts-continuation-suspend.html Possible fixes
Re: [Intel-gfx] [PATCH 9/9] drm/i915: Expand subslice mask
On Fri, 2019-09-06 at 19:13 +0100, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-02 14:42:44)
> >
> > On 24/07/2019 14:05, Tvrtko Ursulin wrote:
> > >
> > > On 23/07/2019 16:49, Stuart Summers wrote:
> > > > +u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
> > > > +{
> > > > +	int i, offset = slice * sseu->ss_stride;
> > > > +	u32 mask = 0;
> > > > +
> > > > +	if (slice >= sseu->max_slices) {
> > > > +		DRM_ERROR("%s: invalid slice %d, max: %d\n",
> > > > +			  __func__, slice, sseu->max_slices);
> > > > +		return 0;
> > > > +	}
> > > > +
> > > > +	if (sseu->ss_stride > sizeof(mask)) {
> > > > +		DRM_ERROR("%s: invalid subslice stride %d, max: %lu\n",
> > > > +			  __func__, sseu->ss_stride, sizeof(mask));
> > > > +		return 0;
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < sseu->ss_stride; i++)
> > > > +		mask |= (u32)sseu->subslice_mask[offset + i] <<
> > > > +			i * BITS_PER_BYTE;
> > > > +
> > > > +	return mask;
> > > > +}
> > >
> > > Why do you actually need these complications when the plan from the
> > > start was that the driver and user sseu representation structures can
> > > be different?
> > >
> > > I only gave it a quick look so I might be wrong, but why not just
> > > expand the driver representations of subslice mask up from u8?
> > > Userspace API should be able to cope with strides already.
> >
> > I never got an answer to this and the series was merged in the
> > meantime.

Thanks for the note here Tvrtko and sorry for the missed response! For
some reason I hadn't caught this comment earlier :(

> >
> > Maybe not much harm but I still don't understand why all the
> > complications seemingly just to avoid bumping the *internal* ss mask
> > up from u8. As long as the internal and abi sseu info struct are well
> > separated and access point few and well controlled (I think they are)
> > then I don't see why the internal side had to be converted to u8 and
> > strides. But maybe I am missing something.
>
> I looked at it and thought it was open-coding bitmap.h as well. I
> accepted it in good faith that it improved certain use cases and should
> even make tidying up the code without regressing those easier.

The goal here is to make sure we have an infrastructure in place that
always provides a consistent bit layout to userspace regardless of
underlying architecture endianness. Perhaps this could have been made
more clear in the commit message here.

Thanks,
Stuart

> -Chris

___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
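To make the endianness point concrete, here is a minimal userspace sketch of the byte-wise packing that the quoted intel_sseu_get_subslices() performs. It is an illustration under assumptions, not the driver code: struct sseu_model, MAX_MASK_BYTES and sseu_model_get_subslices() are hypothetical names, and the extra bounds check on the array exists only in this sketch. Byte i of a slice always lands in bits [8*i, 8*i + 7] of the result, so the layout does not depend on how the host CPU stores a u32 in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BYTE 8
#define MAX_MASK_BYTES 8 /* hypothetical storage size for this sketch */

/* Hypothetical stand-in holding only the fields the quoted helper reads. */
struct sseu_model {
	uint8_t max_slices;
	uint8_t ss_stride; /* bytes of subslice mask stored per slice */
	uint8_t subslice_mask[MAX_MASK_BYTES];
};

/*
 * Assemble the per-slice bytes into one 32-bit mask. Shifting each byte
 * into place (instead of casting the array to a u32 pointer) gives the
 * same value on little- and big-endian hosts.
 */
static uint32_t sseu_model_get_subslices(const struct sseu_model *sseu,
					 uint8_t slice)
{
	uint32_t mask = 0;
	size_t offset, i;

	assert(sseu != NULL);

	if (slice >= sseu->max_slices || sseu->ss_stride > sizeof(mask))
		return 0;

	offset = (size_t)slice * sseu->ss_stride;
	if (offset + sseu->ss_stride > MAX_MASK_BYTES)
		return 0; /* sketch-only guard for the fixed-size array */

	for (i = 0; i < sseu->ss_stride; i++)
		mask |= (uint32_t)sseu->subslice_mask[offset + i] <<
			(i * BITS_PER_BYTE);

	return mask;
}
```

With a stride of 2 and the bytes { 0x0f, 0x01 } stored for slice 0, the helper returns 0x010f on any host.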
Re: [Intel-gfx] [PATCH] drm/i915: Don't unwedge if reset is disabled
On 9/7/19 1:39 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-06 23:28:05)
> >
> > On 9/5/19 2:09 AM, Janusz Krzysztofik wrote:
> > > When trying to reset a device with reset capability disabled or not
> > > supported while rings are full of requests, it has been observed when
> > > running in execlists submission mode that command stream buffer tail
> > > tends to be incremented by apparently still running GPU regardless of
> > > all requests being already cancelled and command stream buffer
> > > pointers reset.
> > >
> > > As a result, kernel panic on NULL pointer dereference occurs when a
> > > trace_ports() helper is called with command stream buffer tail
> > > incremented but request pointers being NULL during final
> > > __intel_gt_set_wedged() operation called from intel_gt_reset().
> > >
> > > Skip actual reset procedure if reset is disabled or not supported.
> >
> > This last sentence is a bit confusing. You're not skipping the reset
> > procedure, you're skipping the attempt of unwedging and resetting again
> > after a reset & wedge already happened.
>
> Loss of email over the last week, so jumping in at the end. My gut
> response is that this is still just papering over the bug, as what you
> say above makes no sense.
> -Chris

The issue here is that if we don't reset the HW when we wedge, whatever
was running on the engines might complete at any point after that, which
generates an unexpected post-wedge CSB event that we don't handle
gracefully when we unwedge. The CSB event might arrive at any time (even
after the unwedge) or cause weird behavior on the first re-submission, so
trying to handle it is not worth the effort IMO since having reset
disabled is a debug-only use-case.

Daniele
Re: [Intel-gfx] [PATCH v5 03/11] drm/i915/dsb: single register write function for DSB.
On 9/9/2019 6:28 PM, Sharma, Shashank wrote:
>
> On 9/7/2019 4:37 PM, Animesh Manna wrote:
> > DSB support single register write through opcode 0x1. Generic
> > api created which accumulate all single register write in a batch
> > buffer and once DSB is triggered, it will program all the registers
> > at the same time.
> >
> > v1: Initial version.
> > v2: Unused macro removed and cosmetic changes done. (Shashank)
> >
> > Cc: Jani Nikula
> > Cc: Rodrigo Vivi
> > Cc: Shashank Sharma
> > Signed-off-by: Animesh Manna
> > ---
> >  drivers/gpu/drm/i915/display/intel_dsb.c | 30
> >  drivers/gpu/drm/i915/display/intel_dsb.h | 9 +++
> >  2 files changed, 39 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
> > index cba5c8d37659..150be81fdfb3 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dsb.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dsb.c
> > @@ -9,6 +9,13 @@
> >
> >  #define DSB_BUF_SIZE (2 * PAGE_SIZE)
> >
> > +/* DSB opcodes. */
> > +#define DSB_OPCODE_SHIFT 24
> > +#define DSB_OPCODE_MMIO_WRITE 0x1
> > +#define DSB_OPCODE_INDEXED_WRITE 0x9
> > +#define DSB_BYTE_EN 0xF
> > +#define DSB_BYTE_EN_SHIFT 20
> > +
> >  struct intel_dsb *
> >  intel_dsb_get(struct intel_crtc *crtc)
> >  {
> > @@ -46,6 +53,7 @@ intel_dsb_get(struct intel_crtc *crtc)
> >  		goto err;
> >  	}
> >  	dsb->vma = vma;
> > +	dsb->free_pos = 0;
>
> This should be done in dsb_put();
>
> >
> >  err:
> >  	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> > @@ -68,3 +76,25 @@ void intel_dsb_put(struct intel_dsb *dsb)
> >  		mutex_unlock(&i915->drm.struct_mutex);
> >  	}
> >  }
> > +
> > +void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val)
> > +{
> > +	struct intel_crtc *crtc = container_of(dsb, typeof(*crtc), dsb);
> > +	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> > +	u32 *buf = dsb->cmd_buf;
> > +
> > +	if (!buf) {
> > +		I915_WRITE(reg, val);
> > +		return;
> > +	}
> > +
> > +	if (WARN_ON(dsb->free_pos >= DSB_BUF_SIZE)) {
> > +		DRM_DEBUG_KMS("DSB buffer overflow.\n");
>
> Lets remove this '.' in the end, to maintain consistency in the log.

Sure.

Regards,
Animesh
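For readers following the batching idea, here is a rough userspace sketch of how single register writes could be accumulated in a command buffer. The four DSB_* constants are taken from the quoted patch. Everything else is an assumption made for illustration: the quoted hunk is cut off before the buffer is filled, so the two-dword layout (value first, then a header built from opcode, byte enables and register offset), the 20-bit offset mask, and the names dsb_model, DSB_MODEL_DWORDS and dsb_model_reg_write() are hypothetical. Note that this sketch counts free_pos in dwords and checks the capacity in dwords as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as given in the quoted patch. */
#define DSB_OPCODE_SHIFT      24
#define DSB_OPCODE_MMIO_WRITE 0x1
#define DSB_BYTE_EN           0xF
#define DSB_BYTE_EN_SHIFT     20

/* Hypothetical, tiny capacity so the overflow path is easy to exercise. */
#define DSB_MODEL_DWORDS 16

struct dsb_model {
	uint32_t cmd_buf[DSB_MODEL_DWORDS];
	size_t free_pos; /* index of the next free dword */
};

/*
 * Queue one register write. ASSUMED layout: one dword holding the value,
 * followed by one header dword. Returns 0 on success, -1 if the two
 * dwords no longer fit.
 */
static int dsb_model_reg_write(struct dsb_model *dsb, uint32_t reg_offset,
			       uint32_t val)
{
	assert(dsb != NULL);

	if (dsb->free_pos + 2 > DSB_MODEL_DWORDS)
		return -1;

	dsb->cmd_buf[dsb->free_pos++] = val;
	dsb->cmd_buf[dsb->free_pos++] =
		((uint32_t)DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) |
		((uint32_t)DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) |
		(reg_offset & 0xFFFFFu); /* assumed 20-bit offset field */

	return 0;
}
```

Queuing a write of 0xdeadbeef to offset 0x70180 in this model stores the value followed by the header 0x01f70180; nothing touches hardware until the accumulated buffer is handed over in one go.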
[Intel-gfx] [PATCH] drm/i915: include GTT page-size info in error state
It might prove useful in the future to know if the vma is utilising huge-GTT-pages. Related to this is the GTT cache, where there is some HW "quirkiness" where it must be disabled if using 2M pages, so include that for good measure. Suggested-by: Chris Wilson Signed-off-by: Matthew Auld Cc: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 1 - drivers/gpu/drm/i915/i915_gpu_error.c| 10 ++ drivers/gpu/drm/i915/i915_gpu_error.h| 2 ++ 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 13b9dc0e1a89..a558edf15ec8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -160,7 +160,6 @@ struct drm_i915_gem_object { struct sg_table *pages; void *mapping; - /* TODO: whack some of this into the error state */ struct i915_page_sizes { /** * The sg mask of the pages sg_table. i.e the mask of diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 3ccf7fd9307f..6384a06aa5bf 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -575,6 +575,9 @@ static void print_error_obj(struct drm_i915_error_state_buf *m, lower_32_bits(obj->gtt_offset)); } + if (obj->gtt_page_sizes > I915_GTT_PAGE_SIZE_4K) + err_printf(m, "gtt_page_sizes = 0x%08x\n", obj->gtt_page_sizes); + err_compression_marker(m); for (page = 0; page < obj->page_count; page++) { int i, len; @@ -735,6 +738,9 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m, if (IS_GEN(m->i915, 7)) err_printf(m, "ERR_INT: 0x%08x\n", error->err_int); + if (IS_GEN_RANGE(m->i915, 8, 11)) + err_printf(m, "GTT_CACHE_EN: 0x%08x\n", error->gtt_cache); + for (ee = error->engine; ee; ee = ee->next) error_print_engine(m, ee, error->epoch); @@ -985,6 +991,7 @@ i915_error_object_create(struct drm_i915_private *i915, dst->gtt_offset = vma->node.start; 
dst->gtt_size = vma->node.size; + dst->gtt_page_sizes = vma->page_sizes.gtt; dst->num_pages = num_pages; dst->page_count = 0; dst->unused = 0; @@ -1554,6 +1561,9 @@ static void capture_reg_state(struct i915_gpu_state *error) error->gac_eco = intel_uncore_read(uncore, GAC_ECO_BITS); } + if (IS_GEN_RANGE(i915, 8, 11)) + error->gtt_cache = intel_uncore_read(uncore, HSW_GTT_CACHE_EN); + /* 4: Everything else */ if (INTEL_GEN(i915) >= 11) { error->ier = intel_uncore_read(uncore, GEN8_DE_MISC_IER); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index df9f57766626..63cf387411e0 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -74,6 +74,7 @@ struct i915_gpu_state { u32 gam_ecochk; u32 gab_ctl; u32 gfx_mode; + u32 gtt_cache; u32 nfence; u64 fence[I915_MAX_NUM_FENCES]; @@ -127,6 +128,7 @@ struct i915_gpu_state { struct drm_i915_error_object { u64 gtt_offset; u64 gtt_size; + u32 gtt_page_sizes; int num_pages; int page_count; int unused; -- 2.20.1
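As an illustration of the new print_error_obj() line, the following userspace sketch shows when the page-size mask would appear in the dump and what the line looks like. It assumes the GTT page sizes are single-bit values (4K = bit 12, 64K = bit 16, 2M = bit 21); the macro names and format_gtt_page_sizes() are hypothetical stand-ins for the driver's definitions and err_printf().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed single-bit page-size encoding, mirroring the driver's masks. */
#define GTT_PAGE_SIZE_4K  (UINT32_C(1) << 12)
#define GTT_PAGE_SIZE_64K (UINT32_C(1) << 16)
#define GTT_PAGE_SIZE_2M  (UINT32_C(1) << 21)

/*
 * Write the dump line into buf only if something larger than 4K pages is
 * in use, matching the "> I915_GTT_PAGE_SIZE_4K" test in the patch.
 * Returns the number of characters written, or 0 if the line is skipped.
 */
static int format_gtt_page_sizes(char *buf, size_t len,
				 uint32_t gtt_page_sizes)
{
	assert(buf != NULL && len > 0);

	buf[0] = '\0';
	if (gtt_page_sizes <= GTT_PAGE_SIZE_4K)
		return 0;

	return snprintf(buf, len, "gtt_page_sizes = 0x%08x\n",
			(unsigned int)gtt_page_sizes);
}
```

An object bound with a mix of 2M and 4K pages has the mask 0x00201000 and gets a line in the error state; an object using only 4K pages prints nothing, which keeps the dump unchanged for the common case.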
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2
== Series Details == Series: series starting with [CI,01/13] drm/i915: introduce a mechanism to extend execbuf2 URL : https://patchwork.freedesktop.org/series/66418/ State : success == Summary == CI Bug Log - changes from CI_DRM_6852 -> Patchwork_14322 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/ Known issues Here are the changes found in Patchwork_14322 that come from known issues: ### IGT changes ### Issues hit * igt@gem_ctx_switch@legacy-render: - fi-bxt-dsi: [PASS][1] -> [INCOMPLETE][2] ([fdo#103927] / [fdo#111381]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-bxt-dsi/igt@gem_ctx_swi...@legacy-render.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-bxt-dsi/igt@gem_ctx_swi...@legacy-render.html * igt@i915_selftest@live_gem_contexts: - fi-skl-guc: [PASS][3] -> [INCOMPLETE][4] ([fdo#111519]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-skl-guc/igt@i915_selftest@live_gem_contexts.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-skl-guc/igt@i915_selftest@live_gem_contexts.html * igt@prime_vgem@basic-fence-flip: - fi-ilk-650: [PASS][5] -> [DMESG-WARN][6] ([fdo#106387]) +1 similar issue [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-ilk-650/igt@prime_v...@basic-fence-flip.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-ilk-650/igt@prime_v...@basic-fence-flip.html Possible fixes * igt@gem_ctx_create@basic-files: - fi-icl-u2: [INCOMPLETE][7] ([fdo#107713] / [fdo#109100]) -> [PASS][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u2/igt@gem_ctx_cre...@basic-files.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u2/igt@gem_ctx_cre...@basic-files.html * igt@gem_exec_fence@nb-await-default: - fi-icl-u3: [DMESG-WARN][9] ([fdo#107724]) -> [PASS][10] [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u3/igt@gem_exec_fe...@nb-await-default.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u3/igt@gem_exec_fe...@nb-await-default.html * igt@kms_frontbuffer_tracking@basic: - fi-icl-u3: [FAIL][11] ([fdo#103167]) -> [PASS][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6852/fi-icl-u3/igt@kms_frontbuffer_track...@basic.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14322/fi-icl-u3/igt@kms_frontbuffer_track...@basic.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167 [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927 [fdo#106387]: https://bugs.freedesktop.org/show_bug.cgi?id=106387 [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713 [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724 [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100 [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381 [fdo#111519]: https://bugs.freedesktop.org/show_bug.cgi?id=111519 Participating hosts (52 -> 46) -- Additional (2): fi-skl-6770hq fi-skl-6700k2 Missing(8): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_6852 -> Patchwork_14322 CI-20190529: 20190529 CI_DRM_6852: d45d78ff950be956657e1236785714509a7d43be @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5173: 3fb0f227d8856008f89a797879e27094745ce97e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_14322: 668a68776eddfd3b529fe98c2babe5d7ce2da381 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 668a68776edd drm/i915: add support for perf configuration queries fc2e0e5e7126 drm/i915/perf: allow holding preemption on filtered ctx 8a629929451c drm/i915: add a new perf 
configuration execbuf parameter 647d2458b7c3 drm/i915/perf: execute OA configuration from command stream 0432b1e0d15d drm/i915: add wait flags to i915_active_request_retire 4ef9530de87e drm/i915/perf: implement active wait for noa configurations da1c41cf2065 drm/i915/perf: allow for CS OA configs to be created lazily 8bb8be52ca97 drm/i915/perf: move perf types to their own header 8db92539084e drm/i915/perf: introduce a versioning of the i915-perf uapi 8aca4673ec28 drm/i915/perf: store the associated engine of a stream 66b65143aa4d drm/i915/perf: drop list of streams 503c88dc3bc0 drm/i915: add syncobj timeline support 66b565b57b3f drm/i915: introduce a mechanism to extend execbuf2 == Logs == For more details see:
[Intel-gfx] [PATCH v16 04/13] drm/i915/perf: store the associated engine of a stream
We'll use this information later to verify that a client trying to reconfigure the stream does so on the right engine. Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_drv.h | 5 + drivers/gpu/drm/i915/i915_perf.c | 7 +++ 2 files changed, 12 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 75607450ba00..274a1193d4f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1088,6 +1088,11 @@ struct i915_perf_stream { */ intel_wakeref_t wakeref; + /** +* @engine: Engine associated with this performance stream. +*/ + struct intel_engine_cs *engine; + /** * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*` * properties given when opening a stream, representing the contents diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index d18cd332afb7..9d5a3522aa35 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -363,6 +363,8 @@ struct perf_open_properties { int oa_format; bool oa_periodic; int oa_period_exponent; + + struct intel_engine_cs *engine; }; static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer); @@ -2201,6 +2203,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, format_size = dev_priv->perf.oa_formats[props->oa_format].size; + stream->engine = props->engine; + stream->sample_flags |= SAMPLE_OA_REPORT; stream->sample_size += format_size; @@ -2843,6 +2847,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv, return -EINVAL; } + /* At the moment we only support using i915-perf on the RCS. 
*/ + props->engine = dev_priv->engine[RCS0]; + /* Considering that ID = 0 is reserved and assuming that we don't * (currently) expect any configurations to ever specify duplicate * values for a particular property ID then the last _PROP_MAX value is -- 2.23.0
[Intel-gfx] [PATCH v16 09/13] drm/i915: add wait flags to i915_active_request_retire
An upcoming change needs to wait on a request without being interrupted, so let callers pass the wait flags. Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_active.c | 4 +++- drivers/gpu/drm/i915/i915_active.h | 5 ++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 6a447f1d0110..c808c28c9464 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -425,7 +425,9 @@ int i915_active_wait(struct i915_active *ref) break; } - err = i915_active_request_retire(&it->base, BKL(ref)); + err = i915_active_request_retire(&it->base, + I915_WAIT_INTERRUPTIBLE, + BKL(ref)); if (err) break; } diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index f95058f99057..35a6089b44fd 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -309,6 +309,7 @@ i915_active_request_isset(const struct i915_active_request *active) */ static inline int __must_check i915_active_request_retire(struct i915_active_request *active, + unsigned int flags, struct mutex *mutex) { struct i915_request *request; @@ -318,9 +319,7 @@ i915_active_request_retire(struct i915_active_request *active, if (!request) return 0; - ret = i915_request_wait(request, - I915_WAIT_INTERRUPTIBLE, - MAX_SCHEDULE_TIMEOUT); + ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT); if (ret < 0) return ret; -- 2.23.0
[Intel-gfx] [PATCH v16 08/13] drm/i915/perf: implement active wait for noa configurations
NOA configuration take some amount of time to apply. That amount of time depends on the size of the GT. There is no documented time for this. For example, past experimentations with powergating configuration changes seem to indicate a 60~70us delay. We go with 500us as default for now which should be over the required amount of time (according to HW architects). v2: Don't forget to save/restore registers used for the wait (Chris) v3: Name used CS_GPR registers (Chris) Fix compile issue due to rebase (Lionel) v4: Fix save/restore helpers (Umesh) v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel) v6: Add missing struct declarations in i915_perf.h Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson (v4) --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 24 ++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 5 + drivers/gpu/drm/i915/i915_debugfs.c | 30 +++ drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 233 ++- drivers/gpu/drm/i915/i915_perf_types.h | 6 + drivers/gpu/drm/i915/i915_reg.h | 4 +- 7 files changed, 300 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index b6373fbc927d..fab318c71d24 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -160,6 +160,7 @@ #define MI_BATCH_GTT (2<<6) /* aliased with (1<<7) on gen4 */ #define MI_BATCH_BUFFER_START_GEN8 MI_INSTR(0x31, 1) #define MI_BATCH_RESOURCE_STREAMER (1<<10) +#define MI_BATCH_PREDICATE (1 << 15) /* HSW+ on RCS only*/ /* * 3D instructions used by the kernel @@ -238,6 +239,29 @@ #define PIPE_CONTROL_DEPTH_CACHE_FLUSH (1<<0) #define PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */ +#define MI_MATH(x) MI_INSTR(0x1a, (x) - 1) +#define MI_ALU_OP(op, src1, src2) (((op) << 20) | ((src1) << 10) | (src2)) +/* operands */ +#define MI_ALU_OP_NOOP 0 +#define MI_ALU_OP_LOAD 128 +#define MI_ALU_OP_LOADINV 1152 +#define 
MI_ALU_OP_LOAD0 129 +#define MI_ALU_OP_LOAD1 1153 +#define MI_ALU_OP_ADD 256 +#define MI_ALU_OP_SUB 257 +#define MI_ALU_OP_AND 258 +#define MI_ALU_OP_OR 259 +#define MI_ALU_OP_XOR 260 +#define MI_ALU_OP_STORE 384 +#define MI_ALU_OP_STOREINV 1408 +/* sources */ +#define MI_ALU_SRC_REG(x) (x) /* 0 -> 15 */ +#define MI_ALU_SRC_SRCA 32 +#define MI_ALU_SRC_SRCB 33 +#define MI_ALU_SRC_ACCU 49 +#define MI_ALU_SRC_ZF 50 +#define MI_ALU_SRC_CF 51 + /* * Commands used only by the command parser */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index dc295c196d11..f752b6cf9ea1 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -97,6 +97,11 @@ enum intel_gt_scratch_field { /* 8 bytes */ INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256, + /* 6 * 8 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048, + + /* 4 bytes */ + INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096, }; #endif /* __INTEL_GT_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 708855e051b5..b00b1a6f8d68 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3578,6 +3578,35 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops, i915_wedged_get, i915_wedged_set, "%llu\n"); +static int +i915_perf_noa_delay_set(void *data, u64 val) +{ + struct drm_i915_private *i915 = data; + + /* This would lead to infinite waits as we're doing timestamp +* difference on the CS with only 32bits. 
+*/ + if (val > mul_u32_u32(U32_MAX, RUNTIME_INFO(i915)->cs_timestamp_frequency_khz)) + return -EINVAL; + + atomic64_set(&i915->perf.noa_programming_delay, val); + return 0; +} + +static int +i915_perf_noa_delay_get(void *data, u64 *val) +{ + struct drm_i915_private *i915 = data; + + *val = atomic64_read(&i915->perf.noa_programming_delay); + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_perf_noa_delay_fops, + i915_perf_noa_delay_get, + i915_perf_noa_delay_set, + "%llu\n"); + #define DROP_UNBOUND BIT(0) #define DROP_BOUND BIT(1) #define DROP_RETIRE BIT(2) @@ -4354,6 +4383,7 @@ static const struct i915_debugfs_files { const char *name; const struct file_operations *fops; } i915_debugfs_files[] = { + {"i915_perf_noa_delay", &i915_perf_noa_delay_fops}, {"i915_wedged", &i915_wedged_fops}, {"i915_cache_sharing", &i915_cache_sharing_fops}, {"i915_gem_drop_caches", &i915_drop_caches_fops}, diff --git
[Intel-gfx] [PATCH 2/2] drm/i915/selftests: Verify the LRC register layout between init and HW
Before we submit the first context to HW, we need to construct a valid image of the register state. This layout is defined by the HW and should match the layout generated by HW when it saves the context image. Asserting that this should be equivalent should help avoid any undefined behaviour and verify that we haven't missed anything important! Of course, having insisted that the initial register state within the LRC should match that returned by HW, we need to ensure that it does. Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 656 -- drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 62 +- drivers/gpu/drm/i915/gt/selftest_lrc.c| 140 drivers/gpu/drm/i915/i915_perf.c | 35 +- drivers/gpu/drm/i915/i915_perf.h | 5 +- .../drm/i915/selftests/i915_live_selftests.h | 1 + 7 files changed, 638 insertions(+), 263 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index f1c0e5d958f3..3eb3c4fab110 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1115,7 +1115,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq, offset = i915_ggtt_offset(ce->state) + LRC_STATE_PN * PAGE_SIZE + -(CTX_R_PWR_CLK_STATE + 1) * 4; +CTX_R_PWR_CLK_STATE * 4; *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; *cs++ = lower_32_bits(offset); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 0ddfbebbcbbc..e369dba3c06a 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -230,9 +230,9 @@ static int __execlists_context_alloc(struct intel_context *ce, struct intel_engine_cs *engine); static void execlists_init_reg_state(u32 *reg_state, -struct intel_context *ce, -struct intel_engine_cs *engine, -struct intel_ring *ring); +const struct intel_context *ce, +const struct intel_engine_cs 
*engine, +const struct intel_ring *ring); static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine) { @@ -464,6 +464,411 @@ lrc_descriptor(struct intel_context *ce, struct intel_engine_cs *engine) return desc; } +static u32 *set_offsets(u32 *regs, + const u8 *data, + const struct intel_engine_cs *engine) +#define NOP(x) (BIT(7) | (x)) +#define LRI(count, flags) ((flags) << 6 | (count)) +#define POSTED BIT(0) +#define REG(x) (((x) >> 2) | BUILD_BUG_ON_ZERO(x >= 0x200)) +#define REG16(x) \ + (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \ + (((x) >> 2) & 0x7f) +#define END() 0 +{ + const u32 base = engine->mmio_base; + + while (*data) { + u8 count, flags; + + if (*data & BIT(7)) { /* skip */ + regs += *data++ & ~BIT(7); + continue; + } + + count = *data & 0x3f; + flags = *data >> 6; + data++; + + *regs = MI_LOAD_REGISTER_IMM(count); + if (flags & POSTED) + *regs |= MI_LRI_FORCE_POSTED; + if (INTEL_GEN(engine->i915) >= 11) + *regs |= MI_LRI_CS_MMIO; + regs++; + + GEM_BUG_ON(!count); + do { + u32 offset = 0; + u8 v; + + do { + v = *data++; + offset <<= 7; + offset |= v & ~BIT(7); + } while (v & BIT(7)); + + *regs = base + (offset << 2); + regs += 2; + } while (--count); + } + + return regs; +} + +static const u8 gen8_xcs_offsets[] = { + NOP(1), + LRI(11, 0), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x11c), + REG(0x114), + REG(0x118), + + NOP(9), + LRI(9, 0), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + NOP(13), + LRI(2, 0), + REG16(0x200), + REG(0x028), + + END(), +}; + +static const u8 gen9_xcs_offsets[] = { + NOP(1), + LRI(14, POSTED), + REG16(0x244), +
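The compact offset tables above can be hard to read at first glance, so here is a small userspace sketch of just the register-offset encoding that the quoted set_offsets() loop decodes. It is a simplified model under assumptions: the NOP()/LRI() header bytes, the MI_LOAD_REGISTER_IMM emission and the build-time range checks are left out, and ENC_REG, ENC_REG16, decode_offset() and decode_table() are hypothetical names. Each offset is stored in dword units (offset >> 2), seven bits per byte, with bit 7 set on every byte except the last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One-byte form: dword offset fits in 7 bits (byte offsets below 0x200). */
#define ENC_REG(x)   (uint8_t)((x) >> 2)
/* Two-byte form: high bits first with the continuation bit (bit 7) set. */
#define ENC_REG16(x) (uint8_t)(((x) >> 9) | 0x80), \
		     (uint8_t)(((x) >> 2) & 0x7f)

/* Decode one variable-length offset and advance *data past it. */
static uint32_t decode_offset(const uint8_t **data)
{
	uint32_t offset = 0;
	uint8_t v;

	do {
		v = *(*data)++;
		offset <<= 7;
		offset |= v & 0x7f;
	} while (v & 0x80); /* bit 7 set: another byte follows */

	return offset << 2; /* back from dword units to a byte offset */
}

/* Decode count offsets and add the engine's MMIO base to each. */
static void decode_table(const uint8_t *data, size_t count, uint32_t base,
			 uint32_t *out)
{
	size_t i;

	assert(data != NULL && out != NULL);

	for (i = 0; i < count; i++)
		out[i] = base + decode_offset(&data);
}
```

For example, ENC_REG16(0x244) expands to the bytes 0x81, 0x11, which decode back to 0x244, while ENC_REG(0x034) is the single byte 0x0d. The appeal of the scheme is that a whole register layout fits in a short u8 array that can be shared across engines and rebased with mmio_base at decode time.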
[Intel-gfx] [PATCH v16 05/13] drm/i915/perf: introduce a versioning of the i915-perf uapi
Reporting this version will help applications figure out what level of
support the running kernel provides.

v2: Add i915_perf_ioctl_version() (Chris)

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_getparam.c |  4 ++++
 drivers/gpu/drm/i915/i915_perf.c     | 10 ++++++++++
 drivers/gpu/drm/i915/i915_perf.h     |  1 +
 include/uapi/drm/i915_drm.h          | 20 ++++++++++++++++++++
 4 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index da6faa84e5b8..bd41cc5ce906 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,7 @@
 #include "gt/intel_engine_user.h"
 
 #include "i915_drv.h"
+#include "i915_perf.h"
 
 int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv)
@@ -157,6 +158,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_MMAP_GTT_COHERENT:
 		value = INTEL_INFO(i915)->has_coherent_ggtt;
 		break;
+	case I915_PARAM_PERF_REVISION:
+		value = i915_perf_ioctl_version();
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9d5a3522aa35..40a1ec2bc96b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3697,3 +3697,13 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 
 	dev_priv->perf.initialized = false;
 }
+
+/**
+ * i915_perf_ioctl_version - Version of the i915-perf subsystem
+ *
+ * This version number is used by userspace to detect available features.
+ */
+int i915_perf_ioctl_version(void)
+{
+	return 1;
+}
diff --git a/drivers/gpu/drm/i915/i915_perf.h b/drivers/gpu/drm/i915/i915_perf.h
index a412b16d9ffc..95549de65212 100644
--- a/drivers/gpu/drm/i915/i915_perf.h
+++ b/drivers/gpu/drm/i915/i915_perf.h
@@ -18,6 +18,7 @@ void i915_perf_init(struct drm_i915_private *i915);
 void i915_perf_fini(struct drm_i915_private *i915);
 void i915_perf_register(struct drm_i915_private *i915);
 void i915_perf_unregister(struct drm_i915_private *i915);
+int i915_perf_ioctl_version(void);
 
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3d031e81648b..e98c9a7baa91 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -618,6 +618,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 54
 
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION	55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1903,23 +1909,31 @@ enum drm_i915_perf_property_id {
 	 * Open the stream for a specific context handle (as used with
 	 * execbuffer2). A stream opened for a specific context this way
 	 * won't typically require root privileges.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
 	/**
 	 * A value of 1 requests the inclusion of raw OA unit reports as
 	 * part of stream samples.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_SAMPLE_OA,
 
 	/**
 	 * The value specifies which set of OA unit metrics should be
 	 * be configured, defining the contents of any OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_METRICS_SET,
 
 	/**
 	 * The value specifies the size and layout of OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1929,6 +1943,8 @@ enum drm_i915_perf_property_id {
 	 * from this exponent as follows:
 	 *
 	 *   80ns * 2^(period_exponent + 1)
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1960,6 +1976,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
 
@@ -1967,6 +1985,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE	_IO('i',
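As a rough illustration of how userspace might consume the new parameter, the sketch below treats an unknown-parameter error as "revision 0". The parameter number 55 and the -EINVAL default case come from the patch; `getparam_fn`, `query_perf_revision` and the two fake kernels are hypothetical stand-ins for the real GETPARAM ioctl, which reports failure through errno rather than a negative return value.

```c
#include <errno.h>

#define I915_PARAM_PERF_REVISION 55 /* value taken from the patch above */

/*
 * Hypothetical stand-in for the GETPARAM ioctl: returns 0 on success or a
 * negative errno, and stores the result in *value.
 */
typedef int (*getparam_fn)(int param, int *value);

/* Returns the perf uAPI revision, 0 if the kernel predates it, <0 on error. */
static int query_perf_revision(getparam_fn getparam)
{
	int value = 0;
	int ret = getparam(I915_PARAM_PERF_REVISION, &value);

	if (ret == -EINVAL)
		return 0;	/* unknown parameter: no versioning support */
	if (ret)
		return ret;

	return value;
}

/* Fake kernel without the patch: every unknown parameter is -EINVAL. */
static int fake_old_kernel(int param, int *value)
{
	(void)param;
	(void)value;
	return -EINVAL;
}

/* Fake kernel with the patch: revision 1, as i915_perf_ioctl_version(). */
static int fake_new_kernel(int param, int *value)
{
	if (param != I915_PARAM_PERF_REVISION)
		return -EINVAL;
	*value = 1;
	return 0;
}
```

Mapping "parameter not known" to revision 0 is a design choice of this sketch, not something the patch mandates.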
[Intel-gfx] [PATCH v16 13/13] drm/i915: add support for perf configuration queries
Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 6 + drivers/gpu/drm/i915/i915_perf.c | 3 + drivers/gpu/drm/i915/i915_query.c | 282 ++ include/uapi/drm/i915_drm.h | 65 ++- 4 files changed, 353 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2c6f37219dff..eab42269fc5b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1368,6 +1368,12 @@ struct drm_i915_private { */ struct idr metrics_idr; + /* +* Number of dynamic configurations, you need to hold +* dev_priv->perf.metrics_lock to access it. +*/ + u32 n_metrics; + /* * Lock associated with anything below within this structure * except exclusive_stream. 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 0ffcb8d16154..cf392e4d6870 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3917,6 +3917,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, goto sysfs_err; } + dev_priv->perf.n_metrics++; + mutex_unlock(_priv->perf.metrics_lock); DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id); @@ -3977,6 +3979,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, _config->sysfs_metric); idr_remove(_priv->perf.metrics_idr, *arg); + dev_priv->perf.n_metrics--; mutex_unlock(_priv->perf.metrics_lock); diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index abac5042da2b..e1f0c184a209 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -7,6 +7,7 @@ #include #include "i915_drv.h" +#include "i915_perf.h" #include "i915_query.h" #include @@ -140,10 +141,291 @@ query_engine_info(struct drm_i915_private *i915, return len; } +static int can_copy_perf_config_registers_or_number(u32 user_n_regs, + u64 user_regs_ptr, + u32 kernel_n_regs) +{ + /* +* We'll just put the number of registers, and won't copy the +* register. 
+*/ + if (user_n_regs == 0) + return 0; + + if (user_n_regs < kernel_n_regs) + return -EINVAL; + + if (!access_ok(u64_to_user_ptr(user_regs_ptr), + 2 * sizeof(u32) * kernel_n_regs)) + return -EFAULT; + + return 0; +} + +static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs, + u32 kernel_n_regs, + u64 user_regs_ptr, + u32 *user_n_regs) +{ + u32 r; + + if (*user_n_regs == 0) { + *user_n_regs = kernel_n_regs; + return 0; + } + + *user_n_regs = kernel_n_regs; + + for (r = 0; r < kernel_n_regs; r++) { + u32 __user *user_reg_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2); + u32 __user *user_val_ptr = + u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 + + sizeof(u32)); + int ret; + + ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr), +user_reg_ptr); + if (ret) + return -EFAULT; + + ret = __put_user(kernel_regs[r].value, user_val_ptr); + if (ret) + return -EFAULT; + } + + return 0; +} + +static int query_perf_config_data(struct drm_i915_private *i915, + struct drm_i915_query_item *query_item, + bool use_uuid)
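The two helpers above implement a "count first, then copy" protocol. A minimal userspace model of it, with plain pointers in place of `access_ok()` and `__put_user()`, could look like the following; `struct oa_reg` and `copy_regs_or_number` are hypothetical names for this sketch, and the size check from the `can_copy_*` helper is folded into the copy routine.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct i915_oa_reg (address already an offset). */
struct oa_reg {
	uint32_t addr;
	uint32_t value;
};

/*
 * If *user_n_regs is 0, only report how many registers exist.
 * Otherwise require room for all of them and write (addr, value) pairs.
 */
static int copy_regs_or_number(const struct oa_reg *kernel_regs,
			       uint32_t kernel_n_regs,
			       uint32_t *user_regs,
			       uint32_t *user_n_regs)
{
	uint32_t r;

	if (*user_n_regs == 0) {
		*user_n_regs = kernel_n_regs;
		return 0;
	}

	if (*user_n_regs < kernel_n_regs)
		return -EINVAL;

	*user_n_regs = kernel_n_regs;

	for (r = 0; r < kernel_n_regs; r++) {
		user_regs[2 * r] = kernel_regs[r].addr;       /* register offset */
		user_regs[2 * r + 1] = kernel_regs[r].value;  /* register value */
	}

	return 0;
}
```

A caller would typically make one call with a count of 0, allocate `2 * n` dwords, then call again with the reported count.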
[Intel-gfx] [PATCH v16 07/13] drm/i915/perf: allow for CS OA configs to be created lazily
Here we introduce a mechanism by which the execbuf part of the i915 driver will be able to request that a batch buffer containing the programming for a particular OA config be created. We'll execute these OA configuration buffers right before executing a set of userspace commands so that a particular user batchbuffer be executed with a given OA configuration. This mechanism essentially allows the userspace driver to go through several OA configuration without having to open/close the i915/perf stream. v2: No need for locking on object OA config object creation (Chris) Flush cpu mapping of OA config (Chris) v3: Properly deal with the perf_metric lock (Chris/Lionel) v4: Fix oa config unref/put when not found (Lionel) v5: Allocate BOs for configurations on the stream instead of globally (Lionel) v6: Fix 64bit division (Chris) v7: Store allocated config BOs into the stream (Lionel) Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson (v4) --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 + drivers/gpu/drm/i915/i915_drv.h | 4 +- drivers/gpu/drm/i915/i915_perf.c | 270 --- drivers/gpu/drm/i915/i915_perf.h | 26 ++ drivers/gpu/drm/i915/i915_perf_types.h | 15 +- 5 files changed, 273 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index fbad403ab7ac..b6373fbc927d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -135,6 +135,7 @@ /* Gen11+. addr = base + (ctx_restore ? 
offset & GENMASK(12,2) : offset) */ #define MI_LRI_CS_MMIO (1<<19) #define MI_LRI_FORCE_POSTED (1<<12) +#define MI_LOAD_REGISTER_IMM_MAX_REGS (126) #define MI_STORE_REGISTER_MEMMI_INSTR(0x24, 1) #define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2) #define MI_SRM_LRM_GLOBAL_GTT(1<<22) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f4145ae6ab6e..7eb31923cde9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1363,8 +1363,8 @@ struct drm_i915_private { struct mutex metrics_lock; /* -* List of dynamic configurations, you need to hold -* dev_priv->perf.metrics_lock to access it. +* List of dynamic configurations (struct i915_oa_config), you +* need to hold dev_priv->perf.metrics_lock to access it. */ struct idr metrics_idr; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 40a1ec2bc96b..93a424c4a577 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -367,11 +367,19 @@ struct perf_open_properties { struct intel_engine_cs *engine; }; +struct i915_oa_config_bo { + struct list_head link; + + struct i915_oa_config *oa_config; + struct drm_i915_gem_object *bo; +}; + static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer); -static void free_oa_config(struct drm_i915_private *dev_priv, - struct i915_oa_config *oa_config) +void i915_oa_config_release(struct kref *ref) { + struct i915_oa_config *oa_config = container_of(ref, typeof(*oa_config), ref); + if (!PTR_ERR(oa_config->flex_regs)) kfree(oa_config->flex_regs); if (!PTR_ERR(oa_config->b_counter_regs)) @@ -381,40 +389,194 @@ static void free_oa_config(struct drm_i915_private *dev_priv, kfree(oa_config); } -static void put_oa_config(struct drm_i915_private *dev_priv, - struct i915_oa_config *oa_config) +static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs) { - if (!atomic_dec_and_test(_config->ref_count)) - return; + u32 i; + + for 
(i = 0; i < n_regs; i++) { + if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) { + u32 n_lri = min(n_regs - i, + (u32)MI_LOAD_REGISTER_IMM_MAX_REGS); - free_oa_config(dev_priv, oa_config); + *cs++ = MI_LOAD_REGISTER_IMM(n_lri); + } + *cs++ = i915_mmio_reg_offset(reg_data[i].addr); + *cs++ = reg_data[i].value; + } + + return cs; } -static int get_oa_config(struct drm_i915_private *dev_priv, -int metrics_set, -struct i915_oa_config **out_config) +static struct i915_oa_config_bo *alloc_oa_config_buffer(struct drm_i915_private *i915, + struct i915_oa_config *oa_config) { - int ret; + struct i915_oa_config_bo *oa_bo; + size_t config_length = 0; + u32 *cs; + int err; + + oa_bo = kzalloc(sizeof(*oa_bo),
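The chunking in write_cs_mi_lri() can be tried outside the kernel with a small sketch. The loop below mirrors the patch; the `MI_INSTR`/`MI_LOAD_REGISTER_IMM` encodings are reproduced from the i915 headers as I remember them, and `lri_dwords` is a hypothetical helper added here to show how many dwords the output needs (it is not taken from the truncated alloc_oa_config_buffer()).

```c
#include <stddef.h>
#include <stdint.h>

/* Encodings as in i915's intel_gpu_commands.h (reproduced, not included). */
#define MI_INSTR(opcode, flags)        (((uint32_t)(opcode) << 23) | (flags))
#define MI_LOAD_REGISTER_IMM(x)        MI_INSTR(0x22, 2 * (x) - 1)
#define MI_LOAD_REGISTER_IMM_MAX_REGS  126

/* Simplified stand-in for struct i915_oa_reg. */
struct oa_reg {
	uint32_t addr;
	uint32_t value;
};

/* Hypothetical helper: dwords needed for n_regs registers plus LRI headers. */
static size_t lri_dwords(uint32_t n_regs)
{
	uint32_t n_lri = (n_regs + MI_LOAD_REGISTER_IMM_MAX_REGS - 1) /
			 MI_LOAD_REGISTER_IMM_MAX_REGS;

	return (size_t)n_regs * 2 + n_lri;
}

/* Emit one LRI header per block of at most 126 (offset, value) pairs. */
static uint32_t *write_lri(uint32_t *cs, const struct oa_reg *regs,
			   uint32_t n_regs)
{
	uint32_t i;

	for (i = 0; i < n_regs; i++) {
		if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
			uint32_t left = n_regs - i;
			uint32_t n_lri = left < MI_LOAD_REGISTER_IMM_MAX_REGS ?
					 left : MI_LOAD_REGISTER_IMM_MAX_REGS;

			*cs++ = MI_LOAD_REGISTER_IMM(n_lri);
		}
		*cs++ = regs[i].addr;
		*cs++ = regs[i].value;
	}

	return cs;
}
```

For 130 registers this yields two headers (126 and 4 registers) and 262 dwords in total.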
[Intel-gfx] [PATCH v16 10/13] drm/i915/perf: execute OA configuration from command stream
We haven't run into issues with programming the global OA/NOA registers configuration from CPU so far, but HW engineers actually recommend doing this from the command streamer. On TGL in particular one of the clock domain in which some of that programming goes might not be powered when we poke things from the CPU. Since we have a command buffer prepared for the execbuffer side of things, we can reuse that approach here too. This also allows us to significantly reduce the amount of time we hold the main lock. v2: Drop the global lock as much as possible v3: Take global lock to pin global v4: Create i915 request in emit_oa_config() to avoid deadlocks (Lionel) v5: Move locking to the stream (Lionel) v6: Move active reconfiguration request into i915_perf_stream (Lionel) v7: Pin VMA outside request creation (Chris) Lock VMA before move to active (Chris) v8: Fix double free on stream->initial_oa_config_bo (Lionel) Don't allow interruption when waiting on active config request (Lionel) v9: Don't ignore return value from i915_active_request_retire (Lionel) Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 174 - drivers/gpu/drm/i915/i915_perf_types.h | 15 ++- 2 files changed, 128 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index d01494180465..929ab54ee371 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1557,18 +1557,23 @@ free_oa_configs(struct i915_perf_stream *stream) static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; + int err; BUG_ON(stream != dev_priv->perf.exclusive_stream); - /* -* Unset exclusive_stream first, it will be checked while disabling -* the metric set on gen8+. 
-*/ mutex_lock(_priv->drm.struct_mutex); - dev_priv->perf.exclusive_stream = NULL; + mutex_lock(>config_mutex); dev_priv->perf.ops.disable_metric_set(stream); + err = i915_active_request_retire(>active_config_rq, 0, +>config_mutex); + mutex_unlock(>config_mutex); + dev_priv->perf.exclusive_stream = NULL; mutex_unlock(_priv->drm.struct_mutex); + if (err) + DRM_ERROR("Failed to disable perf stream\n"); + + free_oa_buffer(stream); free_noa_wait(stream); @@ -1794,6 +1799,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return PTR_ERR(bo); } + ret = i915_mutex_lock_interruptible(>drm); + if (ret) + goto err_unref; + /* * We pin in GGTT because we jump into this buffer now because * multiple OA config BOs will have a jump to this address and it @@ -1801,10 +1810,13 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) */ vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 4096, 0); if (IS_ERR(vma)) { + mutex_unlock(>drm.struct_mutex); ret = PTR_ERR(vma); goto err_unref; } + mutex_unlock(>drm.struct_mutex); + batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); if (IS_ERR(batch)) { ret = PTR_ERR(batch); @@ -1938,7 +1950,9 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return 0; err_unpin: - __i915_vma_unpin(vma); + mutex_lock(>drm.struct_mutex); + i915_vma_unpin_and_release(, 0); + mutex_unlock(>drm.struct_mutex); err_unref: i915_gem_object_put(bo); @@ -1946,50 +1960,73 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return ret; } -static void config_oa_regs(struct drm_i915_private *dev_priv, - const struct i915_oa_reg *regs, - u32 n_regs) +static int emit_oa_config(struct drm_i915_private *i915, + struct i915_perf_stream *stream) { - u32 i; + struct i915_request *rq; + struct i915_vma *vma; + u32 *cs; + int err; - for (i = 0; i < n_regs; i++) { - const struct i915_oa_reg *reg = regs + i; + lockdep_assert_held(>config_mutex); + + vma = i915_vma_instance(stream->initial_oa_config_bo, + >engine->gt->ggtt->vm, NULL); + if 
(unlikely(IS_ERR(vma))) + return PTR_ERR(vma); + + err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL); + if (err) + goto err_vma_unpin; - I915_WRITE(reg->addr, reg->value); + rq = i915_request_create(stream->engine->kernel_context); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err_add_request; } -} -static void delay_after_mux(void) -{ - /* -* It apparently takes a fairly long time for a new MUX -
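The v9 changelog entry ("Don't ignore return value from i915_active_request_retire") addresses the kbuild failure reported against v15. As a small standalone illustration of that pattern, assuming a GCC or Clang compiler: `retire_active_request` and `stream_teardown` below are hypothetical stand-ins, not the driver functions.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a function declared with warn_unused_result
 * (the kernel spells this attribute __must_check).
 */
__attribute__((warn_unused_result))
static int retire_active_request(int busy)
{
	return busy ? -EBUSY : 0;
}

/*
 * Calling retire_active_request() as a bare statement would trigger
 * -Wunused-result (an error under -Werror). Capture and report instead.
 */
static int stream_teardown(int busy)
{
	int err = retire_active_request(busy);

	if (err)
		fprintf(stderr, "Failed to disable perf stream (%d)\n", err);

	return err;
}
```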
[Intel-gfx] [PATCH 1/2] drm/i915: Show the logical context ring state on dumping
Include the active context register state when dumping the engine.

Suggested-by: Mika Kuoppala
Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index a8014c59b388..3c176b0f4b45 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1404,6 +1404,11 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 			   rq->timeline->hwsp_offset);
 		print_request_ring(m, rq);
+
+		if (rq->hw_context->lrc_reg_state) {
+			drm_printf(m, "Logical Ring Context:\n");
+			hexdump(m, rq->hw_context->lrc_reg_state, PAGE_SIZE);
+		}
 	}
 	spin_unlock_irqrestore(&engine->active.lock, flags);
 
-- 
2.23.0
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v16 01/13] drm/i915: introduce a mechanism to extend execbuf2
We're planning to use this for a couple of new features where we need
to provide additional parameters to execbuf.

v2: Check for invalid flags in execbuffer2 (Lionel)

v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)

Signed-off-by: Lionel Landwerlin
Reviewed-by: Chris Wilson (v1)
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 39 ++-
 include/uapi/drm/i915_drm.h                   | 26 +++--
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 27dbcb508055..4f5fd946ab28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 
 enum {
 	FORCE_CPU_RELOC = 1,
@@ -272,6 +273,10 @@ struct i915_execbuffer {
 	 */
 	int lut_size;
 	struct hlist_head *buckets; /** ht for relocation handles */
+
+	struct {
+		u64 flags; /** Available extensions parameters */
+	} extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1940,7 +1945,8 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+	if (!(exec->flags & (I915_EXEC_FENCE_ARRAY |
+			     I915_EXEC_USE_EXTENSIONS))) {
 		if (exec->num_cliprects || exec->cliprects_ptr)
 			return false;
 	}
@@ -2442,6 +2448,33 @@ signal_fence_array(struct i915_execbuffer *eb,
 	}
 }
 
+static const i915_user_extension_fn execbuf_extensions[] = {
+};
+
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+			  struct i915_execbuffer *eb)
+{
+	eb->extensions.flags = 0;
+
+	if (!(args->flags & I915_EXEC_USE_EXTENSIONS))
+		return 0;
+
+	/* The execbuf2 extension mechanism reuses cliprects_ptr. So we cannot
+	 * have another flag also using it at the same time.
+	 */
+	if (eb->args->flags & I915_EXEC_FENCE_ARRAY)
+		return -EINVAL;
+
+	if (args->num_cliprects != 0)
+		return -EINVAL;
+
+	return i915_user_extensions(u64_to_user_ptr(args->cliprects_ptr),
+				    execbuf_extensions,
+				    ARRAY_SIZE(execbuf_extensions),
+				    eb);
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
@@ -2488,6 +2521,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (args->flags & I915_EXEC_IS_PINNED)
 		eb.batch_flags |= I915_DISPATCH_PINNED;
 
+	err = parse_execbuf2_extensions(args, &eb);
+	if (err)
+		return err;
+
 	if (args->flags & I915_EXEC_FENCE_IN) {
 		in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
 		if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 469dc512cca3..0a99c26730e1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1007,6 +1007,10 @@ struct drm_i915_gem_exec_fence {
 	__u32 flags;
 };
 
+enum drm_i915_gem_execbuffer_ext {
+	DRM_I915_GEM_EXECBUFFER_EXT_MAX /* non-ABI */
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -1023,8 +1027,15 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 num_cliprects;
 	/**
 	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-	 * is not set. If I915_EXEC_FENCE_ARRAY is set, then this is a
-	 * struct drm_i915_gem_exec_fence *fences.
+	 * & I915_EXEC_USE_EXTENSIONS are not set.
+	 *
+	 * If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+	 * of struct drm_i915_gem_exec_fence and num_cliprects is the length
+	 * of the array.
+	 *
+	 * If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
+	 * single struct drm_i915_gem_base_execbuffer_ext and num_cliprects is
+	 * 0.
 	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (0x3f)
@@ -1142,7 +1153,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT		(1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_USE_EXTENSIONS implies that
+ * drm_i915_gem_execbuffer2.cliprects_ptr is treated as a pointer to a linked
+ * list of i915_user_extension. Each i915_user_extension node is the base of a
+ * larger structure. The list of supported structures are listed in the
+ * drm_i915_gem_execbuffer_ext enum.
+ */
+#define
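To make the chained-extension idea concrete, here is a minimal userspace model of walking such a list and dispatching on the `name` field. The struct layout follows the uAPI `struct i915_user_extension`; the walker, its depth limit of 512 and its exact error codes are my assumptions about how i915_user_extensions() behaves, and `walk_extensions`/`count_ext` are hypothetical names. Plain pointers replace the user-memory accessors.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the uAPI struct i915_user_extension. */
struct user_extension {
	uint64_t next_extension; /* pointer to the next node, 0 terminates */
	uint32_t name;           /* index into the handler table */
	uint32_t flags;          /* must be zero in this sketch */
	uint32_t rsvd[4];
};

typedef int (*extension_fn)(const struct user_extension *ext, void *data);

/* Walk the chain, calling tbl[name] for each node; stop at the first error. */
static int walk_extensions(const struct user_extension *ext,
			   const extension_fn *tbl, size_t count, void *data)
{
	unsigned int depth = 512; /* assumed bound against looping chains */

	while (ext) {
		int err;

		if (!depth--)
			return -E2BIG;
		if (ext->flags)
			return -EINVAL;
		if (ext->name >= count || !tbl[ext->name])
			return -EINVAL; /* unknown extension */

		err = tbl[ext->name](ext, data);
		if (err)
			return err;

		ext = (const struct user_extension *)(uintptr_t)ext->next_extension;
	}

	return 0;
}

/* Hypothetical handler used for the demo: counts how often it ran. */
static int count_ext(const struct user_extension *ext, void *data)
{
	(void)ext;
	++*(int *)data;
	return 0;
}
```

With an empty handler table, as in this patch, any non-empty chain is rejected with -EINVAL, which matches the intent of introducing the mechanism before any extension exists.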
[Intel-gfx] [PATCH v16 06/13] drm/i915/perf: move perf types to their own header
Following a pattern used throughout the driver. Signed-off-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_drv.h| 300 +-- drivers/gpu/drm/i915/i915_perf.h | 2 + drivers/gpu/drm/i915/i915_perf_types.h | 327 + 3 files changed, 330 insertions(+), 299 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 274a1193d4f0..f4145ae6ab6e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -92,6 +92,7 @@ #include "i915_gem_fence_reg.h" #include "i915_gem_gtt.h" #include "i915_gpu_error.h" +#include "i915_perf_types.h" #include "i915_request.h" #include "i915_scheduler.h" #include "gt/intel_timeline.h" @@ -979,305 +980,6 @@ struct intel_wm_config { bool sprites_scaled; }; -struct i915_oa_format { - u32 format; - int size; -}; - -struct i915_oa_reg { - i915_reg_t addr; - u32 value; -}; - -struct i915_oa_config { - char uuid[UUID_STRING_LEN + 1]; - int id; - - const struct i915_oa_reg *mux_regs; - u32 mux_regs_len; - const struct i915_oa_reg *b_counter_regs; - u32 b_counter_regs_len; - const struct i915_oa_reg *flex_regs; - u32 flex_regs_len; - - struct attribute_group sysfs_metric; - struct attribute *attrs[2]; - struct device_attribute sysfs_metric_id; - - atomic_t ref_count; -}; - -struct i915_perf_stream; - -/** - * struct i915_perf_stream_ops - the OPs to support a specific stream type - */ -struct i915_perf_stream_ops { - /** -* @enable: Enables the collection of HW samples, either in response to -* `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened -* without `I915_PERF_FLAG_DISABLED`. -*/ - void (*enable)(struct i915_perf_stream *stream); - - /** -* @disable: Disables the collection of HW samples, either in response -* to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying -* the stream. 
-*/ - void (*disable)(struct i915_perf_stream *stream); - - /** -* @poll_wait: Call poll_wait, passing a wait queue that will be woken -* once there is something ready to read() for the stream -*/ - void (*poll_wait)(struct i915_perf_stream *stream, - struct file *file, - poll_table *wait); - - /** -* @wait_unlocked: For handling a blocking read, wait until there is -* something to ready to read() for the stream. E.g. wait on the same -* wait queue that would be passed to poll_wait(). -*/ - int (*wait_unlocked)(struct i915_perf_stream *stream); - - /** -* @read: Copy buffered metrics as records to userspace -* **buf**: the userspace, destination buffer -* **count**: the number of bytes to copy, requested by userspace -* **offset**: zero at the start of the read, updated as the read -* proceeds, it represents how many bytes have been copied so far and -* the buffer offset for copying the next record. -* -* Copy as many buffered i915 perf samples and records for this stream -* to userspace as will fit in the given buffer. -* -* Only write complete records; returning -%ENOSPC if there isn't room -* for a complete record. -* -* Return any error condition that results in a short read such as -* -%ENOSPC or -%EFAULT, even though these may be squashed before -* returning to userspace. -*/ - int (*read)(struct i915_perf_stream *stream, - char __user *buf, - size_t count, - size_t *offset); - - /** -* @destroy: Cleanup any stream specific resources. -* -* The stream will always be disabled before this is called. -*/ - void (*destroy)(struct i915_perf_stream *stream); -}; - -/** - * struct i915_perf_stream - state for a single open stream FD - */ -struct i915_perf_stream { - /** -* @dev_priv: i915 drm device -*/ - struct drm_i915_private *dev_priv; - - /** -* @wakeref: As we keep the device awake while the perf stream is -* active, we track our runtime pm reference for later release. 
-*/ - intel_wakeref_t wakeref; - - /** -* @engine: Engine associated with this performance stream. -*/ - struct intel_engine_cs *engine; - - /** -* @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*` -* properties given when opening a stream, representing the contents -* of a single sample as read() by userspace. -*/ - u32 sample_flags; - - /** -* @sample_size: Considering the configured contents of
[Intel-gfx] [PATCH v16 12/13] drm/i915/perf: allow holding preemption on filtered ctx
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of a command buffer concept meant that a query's duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify measuring performance by just measuring the deltas between the counter snapshots written by 2 MI_REPORT_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, which uses 2 MI_REPORT_PERF_COUNT commands and then post-processes the stream of OA reports coming from the global OA buffer to remove any unrelated deltas in between the 2 MI_REPORT_PERF_COUNT.

Disabling preemption only applies to the single context for which we want to query performance counters and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting.
v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function Signed-off-by: Lionel Landwerlin Reviewed-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 + drivers/gpu/drm/i915/i915_perf.c | 30 +-- drivers/gpu/drm/i915/i915_perf_types.h| 8 + include/uapi/drm/i915_drm.h | 11 +++ 4 files changed, 54 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index ccb5ab542427..230af0f0761a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2128,6 +2128,14 @@ static int eb_oa_config(struct i915_execbuffer *eb) if (err) goto out; + /* +* If the perf stream was opened with hold preemption, flag the +* request properly so that the priority of the request is bumped once +* it reaches the execlist ports. +*/ + if (eb->i915->perf.exclusive_stream->hold_preemption) + eb->request->flags |= I915_REQUEST_NOPREEMPT; + /* * If the config hasn't changed, skip reconfiguring the HW (this is * subject to a delay we want to avoid has much as possible). 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 929ab54ee371..0ffcb8d16154 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -343,6 +343,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = { * struct perf_open_properties - for validated properties given to open a stream * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags * @single_context: Whether a single or all gpu contexts should be monitored + * @hold_preemption: Whether the preemption is disabled for the filtered + * context * @ctx_handle: A gem ctx handle for use with @single_context * @metrics_set: An ID for an OA unit metric set advertised via sysfs * @oa_format: An OA unit HW report format @@ -357,6 +359,7 @@ struct perf_open_properties { u32 sample_flags; u64 single_context:1; + u64 hold_preemption:1; u64 ctx_handle; /* OA sampling state */ @@ -2631,6 +2634,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (WARN_ON(stream->oa_buffer.format_size == 0)) return -EINVAL; + stream->hold_preemption = props->hold_preemption; + stream->oa_buffer.format = dev_priv->perf.oa_formats[props->oa_format].format; @@ -3190,6 +3195,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, } } + if (props->hold_preemption) { + if (!props->single_context) { + DRM_DEBUG("preemption disable with no context\n"); + ret = -EINVAL; + goto err; + } + privileged_op = true; + } + /* * On Haswell the OA unit supports clock gating off for a specific * context and in this mode there's no visibility of metrics for the @@ -3204,7 +3218,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv, * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to * enable the OA unit by default. 
*/ - if (IS_HASWELL(dev_priv) && specific_ctx) + if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) privileged_op = false; /* Similar to perf's kernel.perf_paranoid_cpu sysctl option @@ -3214,7 +3228,7 @@
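The privilege rules touched by this patch can be summarised in a small standalone model. This is only a sketch: `check_open_privileges` and `struct open_props` are hypothetical, `single_context` stands in for the driver's resolved `specific_ctx`, and both the initial `privileged_op = true` and the final -EACCES check are assumptions, since that part of the diff is truncated here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of struct perf_open_properties. */
struct open_props {
	bool single_context;
	bool hold_preemption;
};

/* Returns 0 if the open may proceed, or a negative errno. */
static int check_open_privileges(const struct open_props *props,
				 bool is_haswell, bool paranoid, bool is_admin)
{
	bool privileged_op = true; /* assumption: privileged unless relaxed */

	if (props->hold_preemption) {
		/* Holding preemption only makes sense for a filtered context. */
		if (!props->single_context)
			return -EINVAL;
		privileged_op = true;
	}

	/*
	 * Haswell can restrict visibility to one context, so a context-bound
	 * stream is unprivileged there, unless it also holds preemption.
	 */
	if (is_haswell && props->single_context && !props->hold_preemption)
		privileged_op = false;

	/* Assumed check: the paranoid setting gates privileged opens. */
	if (privileged_op && paranoid && !is_admin)
		return -EACCES;

	return 0;
}
```

The point of the `!props->hold_preemption` term is that the Haswell exemption no longer applies once a stream asks to hold preemption.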
[Intel-gfx] [PATCH v16 11/13] drm/i915: add a new perf configuration execbuf parameter
We want the ability to dispatch a set of command buffers to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ execbuf3?)
to notify what OA configuration should be used for a batch buffer. This
requires the process making the execbuf with this flag to also own the
perf fd at the time of execbuf.

v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris)
    Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
    Move OA config vma to active before writing the ringbuffer (Chris)

v4: Reuse i915_user_extension_fn
    Serialize requests with OA config updates

v5: Check that the chained extension is only present once (Chris)
    Unpin oa_vma in main path (Chris)

v6: Use BIT_ULL (Chris)

v7: Hold drm.struct_mutex when serializing the request with OA config (Chris)

v8: Remove active request from engine (Lionel)

v9: Move fetching OA configuration past engine pinning (Lionel)
    Lock VMA before moving to active (Chris)

v10: Fix leak on perf_fd (Lionel)

Signed-off-by: Lionel Landwerlin
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 147 +-
 drivers/gpu/drm/i915/i915_getparam.c          |   4 +
 include/uapi/drm/i915_drm.h                   |  39 +
 3 files changed, 188 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e488f22f53a4..ccb5ab542427 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"

@@ -284,7 +285,12 @@ struct i915_execbuffer {
 	struct {
 		u64 flags; /** Available extensions parameters */
 		struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
+		struct drm_i915_gem_execbuffer_ext_perf perf_config;
 	} extensions;
+
+	struct file *perf_file;
+	struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */
+	struct i915_vma *oa_vma;
 };

 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1152,6 +1158,58 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 	return err;
 }

+
+static int
+eb_get_oa_config(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_object *oa_bo;
+	int err = 0;
+
+	eb->perf_file = NULL;
+	eb->oa_config = NULL;
+	eb->oa_vma = NULL;
+
+	if ((eb->extensions.flags & BIT_ULL(DRM_I915_GEM_EXECBUFFER_EXT_PERF)) == 0)
+		return 0;
+
+	eb->perf_file = fget(eb->extensions.perf_config.perf_fd);
+	if (!eb->perf_file)
+		return -EINVAL;
+
+	err = i915_mutex_lock_interruptible(&eb->i915->drm);
+	if (err)
+		return err;
+
+	if (eb->perf_file->private_data != eb->i915->perf.exclusive_stream)
+		err = -EINVAL;
+
+	mutex_unlock(&eb->i915->drm.struct_mutex);
+
+	if (err)
+		return err;
+
+	if (eb->i915->perf.exclusive_stream->engine != eb->engine)
+		return -EINVAL;
+
+	err = i915_perf_get_oa_config_and_bo(
+			eb->i915->perf.exclusive_stream,
+			eb->extensions.perf_config.oa_config,
+			&eb->oa_config, &oa_bo);
+	if (err)
+		return err;
+
+	eb->oa_vma = i915_vma_instance(oa_bo,
+				       &eb->engine->gt->ggtt->vm, NULL);
+	i915_gem_object_put(oa_bo);
+	if (IS_ERR(eb->oa_vma)) {
+		err = PTR_ERR(eb->oa_vma);
+		eb->oa_vma = NULL;
+		return err;
+	}
+
+	return 0;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 			     struct i915_vma *vma,
 			     unsigned int len)
@@ -2051,6 +2109,54 @@ add_to_client(struct i915_request *rq, struct drm_file *file)
 	spin_unlock(&file_priv->mm.lock);
 }

+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+	struct i915_perf_stream *perf_stream;
+	int err;
+
+	if (!eb->oa_config)
+		return 0;
+
+	perf_stream = eb->perf_file->private_data;
+
+	err = mutex_lock_interruptible(&perf_stream->config_mutex);
+	if (err)
+		return err;
+
+	err = i915_active_request_set(&perf_stream->active_config_rq,
+				      eb->request);
+	if (err)
+		goto out;
+
+	/*
+	 * If the config hasn't changed, skip reconfiguring the HW (this is
+	 * subject to a delay we want to avoid as much as possible).
+	 */
+	if (eb->oa_config == perf_stream->oa_config)
+		goto out;
+
+	i915_vma_lock(eb->oa_vma);
+
[Intel-gfx] [PATCH v16 00/13] drm/i915: Vulkan performance query support
Hi all,

This revision only contains a few fixes for compilation errors seen on CI.

Cheers,

Lionel Landwerlin (13):
  drm/i915: introduce a mechanism to extend execbuf2
  drm/i915: add syncobj timeline support
  drm/i915/perf: drop list of streams
  drm/i915/perf: store the associated engine of a stream
  drm/i915/perf: introduce a versioning of the i915-perf uapi
  drm/i915/perf: move perf types to their own header
  drm/i915/perf: allow for CS OA configs to be created lazily
  drm/i915/perf: implement active wait for noa configurations
  drm/i915: add wait flags to i915_active_request_retire
  drm/i915/perf: execute OA configuration from command stream
  drm/i915: add a new perf configuration execbuf parameter
  drm/i915/perf: allow holding preemption on filtered ctx
  drm/i915: add support for perf configuration queries

 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 501 ++--
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  25 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   5 +
 drivers/gpu/drm/i915/i915_active.c            |   4 +-
 drivers/gpu/drm/i915/i915_active.h            |   5 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  30 +
 drivers/gpu/drm/i915/i915_drv.c               |   3 +-
 drivers/gpu/drm/i915/i915_drv.h               | 313 +---
 drivers/gpu/drm/i915/i915_getparam.c          |   9 +
 drivers/gpu/drm/i915/i915_perf.c              | 719 +++---
 drivers/gpu/drm/i915/i915_perf.h              |  29 +
 drivers/gpu/drm/i915/i915_perf_types.h        | 367 +
 drivers/gpu/drm/i915/i915_query.c             | 282 +++
 drivers/gpu/drm/i915/i915_reg.h               |   4 +-
 include/uapi/drm/i915_drm.h                   | 196 -
 15 files changed, 2012 insertions(+), 480 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_perf_types.h

--
2.23.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v16 02/13] drm/i915: add syncobj timeline support
Introduces new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

v2: Reuse i915_user_extension_fn

v3: Check that the chained extension is only present once (Chris)

v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)

v5: Use BIT_ULL (Chris)

v6: Fix issue with already signaled timeline points,
    dma_fence_chain_find_seqno() setting fence to NULL (Chris)

v7: Report ENOENT with invalid syncobj handle (Lionel)

v8: Check for out of order timeline point insertion (Chris)

v9: After explanations on
    https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
    drop the ordering check from v8 (Lionel)

v10: Set first extension enum item to 1 (Jason)

Signed-off-by: Lionel Landwerlin
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 307 ++
 drivers/gpu/drm/i915/i915_drv.c               |   3 +-
 drivers/gpu/drm/i915/i915_getparam.c          |   1 +
 include/uapi/drm/i915_drm.h                   |  39 +++
 4 files changed, 293 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4f5fd946ab28..e488f22f53a4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -214,6 +214,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */

+struct i915_eb_fences {
+	struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+	struct dma_fence *dma_fence;
+	u64 value;
+	struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
 	struct drm_i915_private *i915; /** i915 backpointer */
 	struct drm_file *file; /** per-file lookup tables and limits */
@@ -276,6 +283,7 @@ struct i915_execbuffer {

 	struct {
 		u64 flags; /** Available extensions parameters */
+		struct drm_i915_gem_execbuffer_ext_timeline_fences timeline_fences;
 	} extensions;
 };

@@ -2320,67 +2328,217 @@ eb_pin_engine(struct i915_execbuffer *eb,
 }

 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_eb_fences *fences, unsigned int n)
 {
-	while (n--)
-		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	while (n--) {
+		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+		dma_fence_put(fences[n].dma_fence);
+		kfree(fences[n].chain_fence);
+	}
 	kvfree(fences);
 }

-static struct drm_syncobj **
-get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-		struct drm_file *file)
+static struct i915_eb_fences *
+get_timeline_fence_array(struct i915_execbuffer *eb, int *out_n_fences)
+{
+	struct drm_i915_gem_execbuffer_ext_timeline_fences *timeline_fences =
+		&eb->extensions.timeline_fences;
+	struct drm_i915_gem_exec_fence __user *user_fences;
+	struct i915_eb_fences *fences;
+	u64 __user *user_values;
+	u64 num_fences, num_user_fences = timeline_fences->fence_count;
+	unsigned long n;
+	int err;
+
+	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
+	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+	if (num_user_fences > min_t(unsigned long,
+				    ULONG_MAX / sizeof(*user_fences),
+				    SIZE_MAX / sizeof(*fences)))
+		return ERR_PTR(-EINVAL);
+
+	user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+	if (!access_ok(user_fences, num_user_fences * sizeof(*user_fences)))
+		return ERR_PTR(-EFAULT);
+
+	user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+	if (!access_ok(user_values, num_user_fences * sizeof(*user_values)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(num_user_fences, sizeof(*fences),
+				__GFP_NOWARN | GFP_KERNEL);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+		     ~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+	for (n = 0, num_fences = 0; n < timeline_fences->fence_count; n++) {
+		struct drm_i915_gem_exec_fence user_fence;
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence = NULL;
+		u64 point;
+
+		if (__copy_from_user(&user_fence, user_fences++, sizeof(user_fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+			err = -EINVAL;
+			goto err;
+		}
+
+		if (__get_user(point, user_values++)) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj
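
The multiplication-overflow guard at the top of get_timeline_fence_array() can be tried out in isolation. The following is only a userspace sketch of the same idea, not code from the patch: the helper name fence_count_fits() and its parameters are hypothetical, and a plain ternary stands in for the kernel's min_t(). It assumes, as the patch's BUILD_BUG_ON() does, that size_t is no wider than unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the BUILD_BUG_ON() in the patch: a size_t limit must fit in unsigned long. */
_Static_assert(sizeof(size_t) <= sizeof(unsigned long),
	       "size_t wider than unsigned long");

/*
 * Hypothetical helper (not part of the patch): returns true when `count`
 * elements can be multiplied by either element size without overflowing
 * the type used for that multiplication.
 */
static bool fence_count_fits(uint64_t count, size_t user_elem_size,
			     size_t kernel_elem_size)
{
	unsigned long user_limit, kernel_limit, limit;

	/* A zero element size would make the divisions below meaningless. */
	assert(user_elem_size > 0 && kernel_elem_size > 0);

	user_limit = ULONG_MAX / user_elem_size;	/* bound for the access_ok() size */
	kernel_limit = SIZE_MAX / kernel_elem_size;	/* bound for the allocation size */
	limit = user_limit < kernel_limit ? user_limit : kernel_limit;

	return count <= limit;
}
```

Checking the count against the smaller of the two quotients before doing any multiplication means both later products (the user-array size and the kernel allocation size) are known to be representable.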
[Intel-gfx] [PATCH v16 03/13] drm/i915/perf: drop list of streams
At some point in time there was the idea that we could have multiple
streams from the same piece of HW but that never materialized and given
the hard time we already have making everything work with the
submission side, there is no real point having this list of 1 element
around.

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_perf.c | 16 +---
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index db7480831e52..75607450ba00 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1082,11 +1082,6 @@ struct i915_perf_stream {
 	 */
 	struct drm_i915_private *dev_priv;

-	/**
-	 * @link: Links the stream into ``&drm_i915_private->streams``
-	 */
-	struct list_head link;
-
 	/**
 	 * @wakeref: As we keep the device awake while the perf stream is
 	 * active, we track our runtime pm reference for later release.
@@ -1671,7 +1666,6 @@ struct drm_i915_private {
 		 * except exclusive_stream.
 		 */
 		struct mutex lock;
-		struct list_head streams;

 		/*
 		 * The stream currently using the OA unit. If accessed
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c1b764233761..d18cd332afb7 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1435,9 +1435,6 @@ static void gen7_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);

-	/* Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }

@@ -1494,10 +1491,6 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	 */
 	memset(stream->oa_buffer.vaddr, 0, OA_BUFFER_SIZE);

-	/*
-	 * Maybe make ->pollin per-stream state if we support multiple
-	 * concurrent streams in the future.
-	 */
 	stream->pollin = false;
 }

@@ -2633,8 +2626,6 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);

-	list_del(&stream->link);
-
 	if (stream->ctx)
 		i915_gem_context_put(stream->ctx);

@@ -2783,8 +2774,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		goto err_flags;
 	}

-	list_add(&stream->link, &dev_priv->perf.streams);
-
 	if (param->flags & I915_PERF_FLAG_FD_CLOEXEC)
 		f_flags |= O_CLOEXEC;
 	if (param->flags & I915_PERF_FLAG_FD_NONBLOCK)
@@ -2793,7 +2782,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	stream_fd = anon_inode_getfd("[i915_perf]", &fops, stream, f_flags);
 	if (stream_fd < 0) {
 		ret = stream_fd;
-		goto err_open;
+		goto err_flags;
 	}

 	if (!(param->flags & I915_PERF_FLAG_DISABLED))
@@ -2806,8 +2795,6 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,

 	return stream_fd;

-err_open:
-	list_del(&stream->link);
 err_flags:
 	if (stream->ops->destroy)
 		stream->ops->destroy(stream);
@@ -3643,7 +3630,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	}

 	if (dev_priv->perf.ops.enable_metric_set) {
-		INIT_LIST_HEAD(&dev_priv->perf.streams);
 		mutex_init(&dev_priv->perf.lock);

 		oa_sample_rate_hard_limit = 1000 *
--
2.23.0
Re: [Intel-gfx] [PATCH] drm/i915/tgl: Implement Wa_1409142259
On 9/6/19 6:10 PM, Matt Roper wrote:
> On Fri, Sep 06, 2019 at 03:46:42PM -0700, Daniele Ceraolo Spurio wrote:
>> On 9/6/19 3:41 PM, Radhakrishna Sripada wrote:
>>> Disable CPS aware color pipe by setting chicken bit.
>>>
>>> BSpec: 52890
>>> HSDES: 1409142259
>>>
>>> Cc: Stuart Summers
>>> Cc: Matt Roper
>>> Signed-off-by: Radhakrishna Sripada
>>> ---
>>>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +
>>>  drivers/gpu/drm/i915/i915_reg.h             | 1 +
>>>  2 files changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> index 243d3f77be13..14e3f9677b06 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>>> @@ -894,6 +894,11 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
>>>  static void
>>>  tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
>>>  {
>>> +	wa_init_mcr(i915, wal);
>>
>> this is not part of the WA you're trying to apply, right?
>>
>>> +
>>> +	/* Wa_1409142259 */
>>> +	WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3,
>>> +			  GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
>>
>> AFAICS the register is part of the render context, so shouldn't we set
>> this as part of the ctx_workarounds? that's what we do for another WA
>> on the same register on ICL.
>
> How do you usually determine if a register is part of the context or
> not? This one doesn't have the "This Register is saved and restored as
> part of Context" notation that other context registers have, so is
> there somewhere else we're supposed to find that information?

Most of the context registers are not tagged that way. The golden
reference for what's in the context is the context image page (Bspec
46255 for TGL).

Daniele

> Matt
>
>> Daniele
>>
>>>  }
>>>
>>>  static void
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>>> index 006cffd56be2..53e07882efb7 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -7668,6 +7668,7 @@ enum {
>>>  #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304)
>>>    #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC (1 << 11)
>>> +  #define GEN12_DISABLE_CPS_AWARE_COLOR_PIPE (1 << 9)
>>>
>>>  #define HIZ_CHICKEN _MMIO(0x7018)
>>>  # define CHV_HZ_8X8_MODE_IN_1X (1 << 15)
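
For readers less familiar with masked registers: the WA_SET_BIT_MASKED() call in the quoted patch writes a register whose upper 16 bits act as a per-bit write enable for the lower 16 bits. The sketch below shows how such a value could be computed for bit 9 (GEN12_DISABLE_CPS_AWARE_COLOR_PIPE in the patch). The helper names are made up for illustration and are not the driver's macros; only the "mask in the high half, value in the low half" layout is assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helper, not the driver's macro: build the value written to
 * a masked register, assuming the upper 16 bits select which of the lower
 * 16 bits the hardware actually updates.
 */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	assert((mask & 0xffff0000u) == 0);	/* only bits 0..15 are addressable */
	assert((value & ~mask) == 0);		/* value must lie inside the mask */
	return (mask << 16) | value;
}

/* Set a single bit, leaving every other bit of the register untouched. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return masked_field(bit, bit);
}

/* Clear a single bit, leaving every other bit of the register untouched. */
static uint32_t masked_bit_disable(uint32_t bit)
{
	return masked_field(bit, 0);
}
```

With this layout a single write can change one bit without a read-modify-write cycle. Whether that write belongs in the GT or the context workaround list is the separate question discussed in the thread.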
[Intel-gfx] [PATCH i-g-t] tests/kms_rotation_crc: Switch to one-shot CRC collection
kms_rotation_crc manually starts and stops CRC collection and reads
single CRC values when it needs them. Depending on how long the other
test setup and execution operations take, the CRC buffer (128 entries)
can fill up with CRC values that the test never reads or uses. Our CI
system has stumbled over several cases where the buffer fills up and
overflows due to this.

Let's switch this test over to the igt_pipe_crc_collect_crc API which
will handle the start+stop of CRC collection when a single CRC is needed
so that we won't collect a bunch of unwanted CRC values and run the risk
of overflow.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105127
Signed-off-by: Matt Roper
---
 tests/kms_rotation_crc.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/tests/kms_rotation_crc.c b/tests/kms_rotation_crc.c
index 668c1732..8f36fd2f 100644
--- a/tests/kms_rotation_crc.c
+++ b/tests/kms_rotation_crc.c
@@ -167,7 +167,7 @@ static void cleanup_crtc(data_t *data)
 }

 static void prepare_crtc(data_t *data, igt_output_t *output, enum pipe pipe,
-			 igt_plane_t *plane, bool start_crc)
+			 igt_plane_t *plane)
 {
 	igt_display_t *display = &data->display;

@@ -181,9 +181,6 @@ static void prepare_crtc(data_t *data, igt_output_t *output, enum pipe pipe,
 	igt_display_commit2(display, COMMIT_ATOMIC);
 	data->pipe_crc = igt_pipe_crc_new(data->gfx_fd, pipe,
 					  INTEL_PIPE_CRC_SOURCE_AUTO);
-
-	if (start_crc)
-		igt_pipe_crc_start(data->pipe_crc);
 }

 enum rectangle_type {
@@ -263,7 +260,7 @@ static void prepare_fbs(data_t *data, igt_output_t *output,
 	igt_plane_set_position(plane, data->pos_x, data->pos_y);
 	igt_display_commit2(display, COMMIT_ATOMIC);

-	igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &data->flip_crc);
+	igt_pipe_crc_collect_crc(data->pipe_crc, &data->flip_crc);

 	/*
 	 * Prepare the non-rotated flip fb.
@@ -286,7 +283,7 @@ static void prepare_fbs(data_t *data, igt_output_t *output,
 	igt_plane_set_position(plane, data->pos_x, data->pos_y);
 	igt_display_commit2(display, COMMIT_ATOMIC);

-	igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &data->ref_crc);
+	igt_pipe_crc_collect_crc(data->pipe_crc, &data->ref_crc);

 	/*
 	 * Prepare the non-rotated reference fb.
@@ -336,7 +333,7 @@ static void test_single_case(data_t *data, enum pipe pipe,
 	igt_assert_eq(ret, 0);

 	/* Check CRC */
-	igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &crc_output);
+	igt_pipe_crc_collect_crc(data->pipe_crc, &crc_output);
 	igt_assert_crc_equal(&data->ref_crc, &crc_output);

 	/*
@@ -359,7 +356,7 @@ static void test_single_case(data_t *data, enum pipe pipe,
 			igt_assert_eq(ret, 0);
 		}
 		kmstest_wait_for_pageflip(data->gfx_fd);
-		igt_pipe_crc_get_current(display->drm_fd, data->pipe_crc, &crc_output);
+		igt_pipe_crc_collect_crc(data->pipe_crc, &crc_output);
 		igt_assert_crc_equal(&data->flip_crc, &crc_output);
 	}
@@ -388,7 +385,7 @@ static void test_plane_rotation(data_t *data, int plane_type, bool test_bad_form
 		plane = igt_output_get_plane_type(output, plane_type);
 		igt_require(igt_plane_has_prop(plane, IGT_PLANE_ROTATION));

-		prepare_crtc(data, output, pipe, plane, true);
+		prepare_crtc(data, output, pipe, plane);

 		for (i = 0; i < num_rectangle_types; i++) {
 			/* Unsupported on i915 */
@@ -416,7 +413,6 @@ static void test_plane_rotation(data_t *data, int plane_type, bool test_bad_form
 						 data->override_fmt, test_bad_format);
 			}
 		}
-		igt_pipe_crc_stop(data->pipe_crc);
 	}
 }

@@ -473,7 +469,7 @@ static bool get_multiplane_crc(data_t *data, igt_output_t *output,
 	ret = igt_display_try_commit2(display, COMMIT_ATOMIC);
 	igt_assert_eq(ret, 0);

-	igt_pipe_crc_get_current(data->gfx_fd, data->pipe_crc, crc_output);
+	igt_pipe_crc_collect_crc(data->pipe_crc, crc_output);

 	for (c = 0; c < numplanes && oldplanes; c++)
 		igt_remove_fb(data->gfx_fd, &oldplanes[c].fb);
@@ -564,7 +560,6 @@ static void test_multi_plane_rotation(data_t *data, enum pipe pipe)

 		data->pipe_crc = igt_pipe_crc_new(data->gfx_fd, pipe,
 						  INTEL_PIPE_CRC_SOURCE_AUTO);
-		igt_pipe_crc_start(data->pipe_crc);

 		for (i = 0; i < ARRAY_SIZE(planeconfigs); i++) {
 			p[0].planetype = DRM_PLANE_TYPE_PRIMARY;
@@ -620,7 +615,6 @@ static void test_multi_plane_rotation(data_t *data, enum pipe pipe)
 			}
 		}
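
The failure mode described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use the IGT API, every name in it (toy_crc_source and the toy_crc_* helpers) is hypothetical, and the only figure taken from the patch is the 128-entry buffer size. It models one CRC being queued per frame while collection is running.

```c
#include <assert.h>
#include <stdbool.h>

#define CRC_ENTRIES 128u	/* buffer size quoted in the commit message */

/* Toy stand-in for a CRC source; not an IGT type. */
struct toy_crc_source {
	bool running;
	unsigned int pending;	/* CRCs queued but not yet read */
	bool overflowed;
};

static void toy_crc_start(struct toy_crc_source *s)
{
	s->running = true;
	s->pending = 0;
	s->overflowed = false;
}

static void toy_crc_stop(struct toy_crc_source *s)
{
	s->running = false;
	s->pending = 0;
}

/* One frame elapses: a running source queues one CRC, or overflows if full. */
static void toy_crc_frame(struct toy_crc_source *s)
{
	if (!s->running)
		return;
	if (s->pending == CRC_ENTRIES)
		s->overflowed = true;
	else
		s->pending++;
}

static void toy_crc_read_one(struct toy_crc_source *s)
{
	assert(s->pending > 0);
	s->pending--;
}

/* Old pattern: start once, do unrelated work for idle_frames, then read one CRC. */
static bool free_running_overflows(unsigned int idle_frames)
{
	struct toy_crc_source s = { 0 };
	unsigned int i;

	toy_crc_start(&s);
	for (i = 0; i < idle_frames; i++)
		toy_crc_frame(&s);
	toy_crc_frame(&s);	/* the frame whose CRC the test wants */
	toy_crc_read_one(&s);
	toy_crc_stop(&s);
	return s.overflowed;
}

/* New pattern: collection only runs around the single read. */
static bool one_shot_overflows(unsigned int idle_frames)
{
	struct toy_crc_source s = { 0 };
	unsigned int i;

	for (i = 0; i < idle_frames; i++)
		toy_crc_frame(&s);	/* source is stopped: nothing is queued */
	toy_crc_start(&s);
	toy_crc_frame(&s);
	toy_crc_read_one(&s);
	toy_crc_stop(&s);
	return s.overflowed;
}
```

In this model the free-running variant overflows as soon as the unrelated work spans the full buffer, while the one-shot variant never has more than one entry queued, which is the property the patch relies on.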
Re: [Intel-gfx] [PULL] gvt-next-fixes
Hi guys,

On Fri, Sep 06, 2019 at 01:42:55PM +0800, Zhenyu Wang wrote:
>
> Hi,
>
> Here's gvt-next-fixes with two recent fixes, one for recent
> guest hang regression and another for guest reset fix.
>
> Thanks.
> --
> The following changes since commit c36beba6b296b3c05a0f29753b04775e5ae23886:
>
>   drm/i915: Seal races between async GPU cancellation, retirement and signaling (2019-05-13 13:53:35 +0300)
>
> are available in the Git repository at:
>
>   https://github.com/intel/gvt-linux.git tags/gvt-next-fixes-2019-09-06
>
> for you to fetch changes up to 4a5322560aa235efa84c0aa34c00e5749a0792fd:
>
>   drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915 (2019-09-06 13:39:09 +0800)

$ dim pull-request-next-fixes
Using drm/drm-next as the upstream
dim: 4a5322560aa2 ("drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915"): Link tag missing.
dim: 0a3242bdb477 ("drm/i915/gvt: update vgpu workload head pointer correctly"): Link tag missing.
dim: ERROR: issues in commits detected, aborting

I wonder how I should proceed here. In the past I was always bypassing
dim, but now that drm maintainers also use dim I'm sure this will blow
up there anyways.

But gvt patches are not tracked on our CI individually hence they don't
have Links.

Jani, Joonas, how are you guys handling this?

Daniel, Dave, ideas?

Thanks,
Rodrigo.

>
> gvt-next-fixes-2019-09-06
>
> - Fix guest context head pointer update for hang (Xiaolin)
> - Fix guest context ring state for reset (Weinan)
>
> Weinan Li (1):
>       drm/i915/gvt: update RING_START reg of vGPU when the context is submitted to i915
>
> Xiaolin Zhang (1):
>       drm/i915/gvt: update vgpu workload head pointer correctly
>
>  drivers/gpu/drm/i915/gvt/scheduler.c | 45 +---
>  1 file changed, 32 insertions(+), 13 deletions(-)
>
> --
> Open Source Technology Center, Intel ltd.
>
> $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/ringbuffer: Flush writes before RING_TAIL update
== Series Details ==

Series: drm/i915/ringbuffer: Flush writes before RING_TAIL update
URL   : https://patchwork.freedesktop.org/series/66426/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14327

Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_14327:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_sync@basic-each:
    - {fi-tgl-u}:         NOTRUN -> [INCOMPLETE][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-tgl-u/igt@gem_s...@basic-each.html

Known issues
---

  Here are the changes found in Patchwork_14327 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_gem_contexts:
    - fi-skl-iommu:       [PASS][2] -> [INCOMPLETE][3] ([fdo#111519])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-skl-iommu/igt@i915_selftest@live_gem_contexts.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-skl-iommu/igt@i915_selftest@live_gem_contexts.html

#### Possible fixes ####

  * igt@gem_ctx_switch@legacy-render:
    - fi-icl-u2:          [INCOMPLETE][4] ([fdo#107713] / [fdo#111381]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  * igt@gem_exec_gttfill@basic:
    - {fi-tgl-u}:         [INCOMPLETE][6] ([fdo#111593]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-tgl-u/igt@gem_exec_gttf...@basic.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/fi-tgl-u/igt@gem_exec_gttf...@basic.html

  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381
  [fdo#111519]: https://bugs.freedesktop.org/show_bug.cgi?id=111519
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593

Participating hosts (51 -> 47)
---

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus

Build changes
---

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14327

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14327: cda3b809297cb7c5b44e6c9abe22cc4b7516a98d @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

cda3b809297c drm/i915/ringbuffer: Flush writes before RING_TAIL update

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14327/
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel
== Series Details ==

Series: series starting with [1/6] drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel
URL   : https://patchwork.freedesktop.org/series/66425/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14326

Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_14326 absolutely need to be
  verified manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14326, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_14326:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_gttfill@basic:
    - fi-apl-guc:         [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-apl-guc/igt@gem_exec_gttf...@basic.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/fi-apl-guc/igt@gem_exec_gttf...@basic.html

Known issues
---

  Here are the changes found in Patchwork_14326 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@gem_ctx_switch@legacy-render:
    - fi-icl-u2:          [INCOMPLETE][3] ([fdo#107713] / [fdo#111381]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381

Participating hosts (51 -> 46)
---

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3
  Missing    (8): fi-ilk-m540 fi-tgl-u fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus

Build changes
---

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14326

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14326: c0df1c601b3cffed51bfebf09ddeeea08ff26fb2 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

c0df1c601b3c iommu/intel: Ignore igfx_off
0462ef926b65 iommu/intel: Declare Broadwell igfx dmar support snafu
186afc9aaa54 drm/i915: Force compilation with intel-iommu for CI validation
6be90a9b332d drm/i915: Perform GGTT restore much earlier during resume
e54488ac4cd7 drm/i915/selftests: Tighten the timeout testing for partial mmaps
7386b0ebc23c drm/i915/selftests: Take runtime wakeref for igt_ggtt_lowlevel

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14326/
[Intel-gfx] USB-C
Hello,

5.2.13 still works fine (great) with a Dell U4919DW connected via
USB-C to an X1 Carbon Gen 6.

5.3-rc8 so far does not: the screen stays blank and the kernel logs errors:

https://pastebin.com/tXFi6AfK

It seems there has been some refactoring for just this kind of connection in 5.3?

Is there perhaps an issue for this scenario since then, or are other
components at fault (ACPI / mutter)?

Thanks!
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for adding gamma state checker for CHV and i965 (rev2)
== Series Details ==

Series: adding gamma state checker for CHV and i965 (rev2)
URL   : https://patchwork.freedesktop.org/series/66297/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d02e018c92fe drm/i915/display: Add gamma precision function for CHV
73751624de48 drm/i915/display: Extract i965_read_luts()
-:22: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#22:
-Renamed i965_read_gamma_lut_10p6() to i965_read_lut_10p6() [Ville, Uma]

total: 0 errors, 1 warnings, 0 checks, 78 lines checked
1eeae34e288a drm/i915/display: Extract chv_read_luts()
-:57: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#57: FILE: drivers/gpu/drm/i915/display/intel_color.c:1642:
+		blob_data[i].green = intel_color_lut_pack(REG_FIELD_GET(

-:59: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#59: FILE: drivers/gpu/drm/i915/display/intel_color.c:1644:
+		blob_data[i].blue = intel_color_lut_pack(REG_FIELD_GET(

-:63: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#63: FILE: drivers/gpu/drm/i915/display/intel_color.c:1648:
+		blob_data[i].red = intel_color_lut_pack(REG_FIELD_GET(

total: 0 errors, 0 warnings, 3 checks, 64 lines checked
Re: [Intel-gfx] USB-C
Searching on the kernel warnings and errors didn't bring it up, but in
browsing bugzilla I stumbled on this.

Bug 111501 - [CFL] Kernel 5.3.0-rc6: i915 fails at typec_displayport 5120x1440
https://bugs.freedesktop.org/show_bug.cgi?id=111501

It's the same monitor and connection type.

Is the related patchset intended for 5.3 then?

https://patchwork.freedesktop.org/series/66286/

Thanks

On Mon, Sep 9, 2019, at 10:06 AM, nnet wrote:
> Hello,
>
> 5.2.13 is working fine (great) still with a Dell U4919DW connected via
> USB-C from a X1 Carbon Gen 6.
>
> 5.3-rc8 so far is not (blank screen) and errors:
>
> https://pastebin.com/tXFi6AfK
>
> Seems there has been some refactoring for just this kind of connection in 5.3?
>
> Is there perhaps and issue since for this scenario or are other
> components at fault perhaps (ACPI / mutter)?
>
> Thanks!
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: export color_differs
== Series Details ==

Series: series starting with [1/3] drm/i915: export color_differs
URL   : https://patchwork.freedesktop.org/series/66433/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6854 -> Patchwork_14329

Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/

Known issues
---

  Here are the changes found in Patchwork_14329 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-blb-e6850:       [PASS][1] -> [INCOMPLETE][2] ([fdo#107718])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-blb-e6850/igt@gem_exec_susp...@basic-s4-devices.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/fi-blb-e6850/igt@gem_exec_susp...@basic-s4-devices.html

#### Possible fixes ####

  * igt@gem_ctx_switch@legacy-render:
    - fi-icl-u2:          [INCOMPLETE][3] ([fdo#107713] / [fdo#111381]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6854/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/fi-icl-u2/igt@gem_ctx_swi...@legacy-render.html

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#111381]: https://bugs.freedesktop.org/show_bug.cgi?id=111381

Participating hosts (51 -> 47)
---

  Additional (3): fi-icl-dsi fi-cfl-guc fi-icl-u3
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus

Build changes
---

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6854 -> Patchwork_14329

  CI-20190529: 20190529
  CI_DRM_6854: 5a70800ed2837e2d35a331e2cfd43a55df58c4fc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5176: 0102dcf4e2e8b357b59173fe1ff78069148080c6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14329: 1a4449483168a24176fca0b460c1d6bb69ed1121 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

1a4449483168 drm/i915: cleanup cache-coloring
5169d6ad3a55 drm/i915: s/i915_gtt_color_adjust/i915_ggtt_color_adjust
1480a89b6d8b drm/i915: export color_differs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14329/
Re: [Intel-gfx] [PATCH 4/6] drm/i915: Force compilation with intel-iommu for CI validation
Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-selftests-Take-runtime-wakeref-for-igt_ggtt_lowlevel/20190909-201355
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-s2-201936 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot

All error/warnings (new ones prefixed by >>):

   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_coherency':
>> drivers/iommu/intel-iommu.c:622:2: error: implicit declaration of function 'for_each_active_iommu'; did you mean 'for_each_active_irq'? [-Werror=implicit-function-declaration]
     for_each_active_iommu(iommu, drhd) {
     ^
     for_each_active_irq
>> drivers/iommu/intel-iommu.c:622:37: error: expected ';' before '{' token
     for_each_active_iommu(iommu, drhd) {
                                        ^
   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_snooping':
   drivers/iommu/intel-iommu.c:638:37: error: expected ';' before '{' token
     for_each_active_iommu(iommu, drhd) {
                                        ^
   drivers/iommu/intel-iommu.c: In function 'domain_update_iommu_superpage':
   drivers/iommu/intel-iommu.c:663:37: error: expected ';' before '{' token
     for_each_active_iommu(iommu, drhd) {
                                        ^
   drivers/iommu/intel-iommu.c: In function 'device_to_iommu':
   drivers/iommu/intel-iommu.c:781:37: error: expected ';' before '{' token
     for_each_active_iommu(iommu, drhd) {
                                        ^
   drivers/iommu/intel-iommu.c:812:2: warning: label 'out' defined but not used [-Wunused-label]
     out:
     ^~~
   drivers/iommu/intel-iommu.c:756:6: warning: unused variable 'i' [-Wunused-variable]
     int i;
         ^
   drivers/iommu/intel-iommu.c:753:17: warning: unused variable 'tmp' [-Wunused-variable]
     struct device *tmp;
                    ^~~
   drivers/iommu/intel-iommu.c: In function 'si_domain_init':
>> drivers/iommu/intel-iommu.c:2731:3: error: implicit declaration of function 'for_each_active_dev_scope'; did you mean 'for_each_active_irq'? [-Werror=implicit-function-declaration]
      for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
      ^
      for_each_active_irq
   drivers/iommu/intel-iommu.c:2732:16: error: expected ';' before '{' token
          i, dev) {
                  ^
   drivers/iommu/intel-iommu.c: In function 'device_has_rmrr':
>> drivers/iommu/intel-iommu.c:2794:4: error: expected ';' before 'if'
       if (tmp == dev ||
       ^~
   drivers/iommu/intel-iommu.c: In function 'init_dmars':
>> drivers/iommu/intel-iommu.c:3157:2: error: implicit declaration of function 'for_each_drhd_unit'; did you mean 'for_each_rmrr_units'? [-Werror=implicit-function-declaration]
     for_each_drhd_unit(drhd) {
     ^~
     for_each_rmrr_units
   drivers/iommu/intel-iommu.c:3157:27: error: expected ';' before '{' token
     for_each_drhd_unit(drhd) {
                              ^
   drivers/iommu/intel-iommu.c:3182:2: error: implicit declaration of function 'for_each_iommu'; did you mean 'for_each_cpu'? [-Werror=implicit-function-declaration]
     for_each_iommu(iommu, drhd) {
     ^~
     for_each_cpu
   drivers/iommu/intel-iommu.c:3182:30: error: expected ';' before '{' token
     for_each_iommu(iommu, drhd) {
                                 ^
   drivers/iommu/intel-iommu.c:3293:30: error: expected ';' before '{' token
     for_each_iommu(iommu, drhd) {
                                 ^
   drivers/iommu/intel-iommu.c:3327:37: error: expected ';' before '{' token
     for_each_active_iommu(iommu, drhd) {
                                        ^
   drivers/iommu/intel-iommu.c: In function 'get_private_domain_for_dev':
   drivers/iommu/intel-iommu.c:3391:18: error: expected ';' before '{' token
          i, i_dev) {
                    ^
   drivers/iommu/intel-iommu.c:3376:9: warning: unused variable 'ret' [-Wunused-variable]
     int i, ret;
            ^~~
   In file included from arch/x86/include/asm/bug.h:83:0,
                    from include/linux/bug.h:5,
                    from include/linux/jump_label.h:250,
                    from arch/x86/include/asm/string_64.h:6,
                    from arch/x86/include/asm/string.h:5,
                    from include/linux/string.h:20,
                    from include/linux/bitmap.h:9,
                    fr