Re: [Intel-gfx] [PATCH] drm/i915: Don't deref pipe-cpu_transcoder in the hangcheck code
On Thu, Aug 08, 2013 at 02:44:23PM +0100, Chris Wilson wrote: On Thu, Aug 08, 2013 at 03:12:06PM +0200, Daniel Vetter wrote: From: Chris Wilson ch...@chris-wilson.co.uk If we get an error event really early in the driver setup sequence, which gen3 is especially prone to with various display GTT faults we Oops. So try to avoid this. Additionally with Haswell the transcoders are a separate bank of registers from the pipes (4 transcoders, 3 pipes). In event of an error, we want to be sure we have a complete and accurate picture of the machine state, so record all the transcoders in addition to all the active pipes. This regression has been introduced in commit 702e7a56af3780d8b3a717f698209bef44187bb0 Author: Paulo Zanoni paulo.r.zan...@intel.com Date: Tue Oct 23 18:29:59 2012 -0200 drm/i915: convert PIPECONF to use transcoder instead of pipe Based on the patch drm/i915: Dump all transcoder registers on error from Chris Wilson: v2: Rebase so that we don't try to be clever and try to figure out the cpu transcoder from hw state. That exercise should be done when we analyze the error state offline. The actual bugfix is to not call intel_pipe_to_cpu_transcoder in the error state capture code in case the pipes aren't fully set up yet. v3: Simplifiy the err-num_transcoders computation a bit. While at it make the error capture stuff save on systems without a display block. v4: Fix fail, spotted by Jani. v5: Completely new commit message, cc: stable. Cc: Paulo Zanoni paulo.r.zan...@intel.com Cc: Damien Lespiau damien.lesp...@intel.com Cc: Jani Nikula jani.nik...@intel.com Cc: Chris Wilson ch...@chris-wilson.co.uk Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60021 Cc: sta...@vger.kernel.org Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch Lgtm. We may have to modify transcoders[] to be more dynamic in future, but for now this works. Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk Picked up for -fixes, thanks for the review. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Dead code cleanups
On Thu, Aug 08, 2013 at 10:28:52PM +0100, Damien Lespiau wrote: Procrastinating... at least a small consolation prize: 5 files changed, 11 insertions(+), 104 deletions(-) Queued for -next, thanks for the patches. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915: Eliminate unrequired dwords for MI_SET_CONTEXT
On Thu, Aug 08, 2013 at 08:00:25PM +0100, Chris Wilson wrote: A later patch adds yet another workaround for MI_SET_CONTEXT, at which point we start to end up with more NOOPs than actual command dwords along the non-workaround paths. It is time that we made the MI_SET_CONTEXT a variable length block and only emit the dwords we truly need. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Ben Widawsky b...@bwidawsk.net Since this seems to be based on top of gen5 context support: Care to slap an r-b and tested-by onto it so that I can merge the entire shebang? Cheers, Daniel --- drivers/gpu/drm/i915/i915_gem_context.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 879bfa2..8a7b61e 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -348,7 +348,7 @@ mi_set_context(struct intel_ring_buffer *ring, struct i915_hw_context *new_context, u32 hw_flags) { - int ret; + int ret, len; /* w/a: If Flush TLB Invalidation Mode is enabled, driver must do a TLB * invalidation prior to MI_SET_CONTEXT. On GEN6 we don't set the value @@ -361,7 +361,14 @@ mi_set_context(struct intel_ring_buffer *ring, return ret; } - ret = intel_ring_begin(ring, 6); + len = 4; + switch (INTEL_INFO(ring-dev)-gen) { + case 7: + case 5: len += 2; + break; + } + + ret = intel_ring_begin(ring, len); if (ret) return ret; @@ -373,9 +380,6 @@ mi_set_context(struct intel_ring_buffer *ring, case 5: intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN); break; - default: - intel_ring_emit(ring, MI_NOOP); - break; } intel_ring_emit(ring, MI_NOOP); @@ -395,9 +399,6 @@ mi_set_context(struct intel_ring_buffer *ring, case 5: intel_ring_emit(ring, MI_SUSPEND_FLUSH); break; - default: - intel_ring_emit(ring, MI_NOOP); - break; } intel_ring_advance(ring); -- 1.8.4.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [patch] drm/i915: precedence bug in hsw_compute_wm_results()
The '!' operation has higher precedence than the compare so probably this test is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index a5a9959..0b9d9a7 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2562,7 +2562,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val = lp_maximums-fbc) { results-enable_fbc_wm = false; lp_results[level - 1].fbc_val = 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [patch] drm/i915: unbreak i915_gem_object_ggtt_unbind()
There is an extra semi-colon here so we just leak and never unbind anything. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 735f43f..a582540 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2656,7 +2656,7 @@ i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj) struct drm_i915_private *dev_priv = obj-base.dev-dev_private; struct i915_address_space *ggtt = dev_priv-gtt.base; - if (!i915_gem_obj_ggtt_bound(obj)); + if (!i915_gem_obj_ggtt_bound(obj)) return 0; if (obj-pin_count) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [patch] drm/i915: precedence bug in hsw_compute_wm_results()
On Fri, Aug 09, 2013 at 12:43:02PM +0300, Dan Carpenter wrote: The '!' operation has higher precedence than the compare so probably this test is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index a5a9959..0b9d9a7 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2562,7 +2562,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val = lp_maximums-fbc) { I didn't spot that '!' at all. It's a stray and should be just removed. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [patch] drm/i915: precedence bug in hsw_compute_wm_results()
On Fri, Aug 09, 2013 at 12:43:02PM +0300, Dan Carpenter wrote: The '!' operation has higher precedence than the compare so probably this test is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index a5a9959..0b9d9a7 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2562,7 +2562,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val = lp_maximums-fbc) { Whoops. My bad. It should still be but ! should just be dropped. results-enable_fbc_wm = false; lp_results[level - 1].fbc_val = 0; } ___ dri-devel mailing list dri-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915: Eliminate unrequired dwords for MI_SET_CONTEXT
On Fri, Aug 09, 2013 at 11:41:26AM +0200, Daniel Vetter wrote: On Thu, Aug 08, 2013 at 08:00:25PM +0100, Chris Wilson wrote: A later patch adds yet another workaround for MI_SET_CONTEXT, at which point we start to end up with more NOOPs than actual command dwords along the non-workaround paths. It is time that we made the MI_SET_CONTEXT a variable length block and only emit the dwords we truly need. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Ben Widawsky b...@bwidawsk.net Since this seems to be based on top of gen5 context support: Care to slap an r-b and tested-by onto it so that I can merge the entire shebang? Sure, there was nothing outrageous in the enabling patches, and I can confirm that in the end none of the symptoms I had were related to context enabling (just the machine dying as it turns out). So for the ilk ctx patches: The i915_gem_obj_ggtt_pin() changes are ultimately bogus, that is drm/i915: Make ILK context objects more like the others and drm/i915: Restore ILK powerctx pin attributes cancel each other out and can quite happily die. What would be nice would be to use i915_gem_object_ggtt_pin(ctx, 0, false, false); instead. All the other patches look good, so Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk And a Tested-by: Chris Wilson ch...@chris-wilson.co.uk # X still works! -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [patch] drm/i915: unbreak i915_gem_object_ggtt_unbind()
On Fri, Aug 09, 2013 at 12:44:11PM +0300, Dan Carpenter wrote: There is an extra semi-colon here so we just leak and never unbind anything. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 735f43f..a582540 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2656,7 +2656,7 @@ i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj) struct drm_i915_private *dev_priv = obj-base.dev-dev_private; struct i915_address_space *ggtt = dev_priv-gtt.base; - if (!i915_gem_obj_ggtt_bound(obj)); + if (!i915_gem_obj_ggtt_bound(obj)) Oops, thanks for spotting this. Applied to dinq. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [patch v2] drm/i915: fix a limit check in hsw_compute_wm_results()
The '!' here was not intended. Since '!' has higher precedence than compare, it means the check is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com --- v2: My first patch was wrong. diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 96234c6..0f5eb21 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2685,7 +2685,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val lp_maximums-fbc) { results-enable_fbc_wm = false; lp_results[level - 1].fbc_val = 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [patch v2] drm/i915: fix a limit check in hsw_compute_wm_results()
On Fri, Aug 09, 2013 at 01:07:31PM +0300, Dan Carpenter wrote: The '!' here was not intended. Since '!' has higher precedence than compare, it means the check is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com The culprit for Daniel: commit 71fff20ff1bb790f4defe0c880e028581ffab420 Author: Ville Syrjälä ville.syrj...@linux.intel.com Date: Tue Aug 6 22:24:03 2013 +0300 drm/i915: Kill fbc_enable from hsw_lp_wm_results --- v2: My first patch was wrong. diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 96234c6..0f5eb21 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2685,7 +2685,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val lp_maximums-fbc) { results-enable_fbc_wm = false; lp_results[level - 1].fbc_val = 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Track when an object is pinned for use by the display engine
The display engine has unique coherency rules such that it requires special handling to ensure that all writes to cursors, scanouts and sprites are clflushed. This patch introduces the infrastructure to simply track when an object is being accessed by the display engine. v2: Explain the is_pin_display() magic as the sources for obj-pin_count and their individual rules is not obvious. (Ville) Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 36 ++-- drivers/gpu/drm/i915/intel_display.c | 8 4 files changed, 42 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index b06e6ce..463c660 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -117,6 +117,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, (name: %d), obj-base.name); if (obj-pin_count) seq_printf(m, (pinned x %d), obj-pin_count); + if (obj-pin_display) + seq_printf(m, (display)); if (obj-fence_reg != I915_FENCE_REG_NONE) seq_printf(m, (fence: %d), obj-fence_reg); list_for_each_entry(vma, obj-vma_list, vma_link) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 957f22e..a6a1cc3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1379,6 +1379,7 @@ struct drm_i915_gem_object { */ unsigned int fault_mappable:1; unsigned int pin_mappable:1; + unsigned int pin_display:1; /* * Is the GPU currently using a fence to access this buffer, @@ -1888,6 +1889,7 @@ int __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, struct intel_ring_buffer *pipelined); +void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj); int i915_gem_attach_phys_object(struct drm_device *dev, struct drm_i915_gem_object *obj, int id, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 065f927..3880f05 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3583,6 +3583,22 @@ unlock: return ret; } +static bool is_pin_display(struct drm_i915_gem_object *obj) +{ + /* There are 3 sources that pin objects: +* 1. The display engine (scanouts, sprites, cursors); +* 2. Reservations for execbuffer; +* 3. The user. +* +* We can ignore reservations as we hold the struct_mutex and +* are only called outside of the reservation path. The user +* can only increment pin_count once, and so if after +* subtracting the potential reference by the user, any pin_count +* remains, it must be due to another use by the display engine. +*/ + return obj-pin_count - !!obj-user_pin_count; +} + /* * Prepare buffer for display plane (scanout, cursors, etc). * Can be called from an uninterruptible phase (modesetting) and allows @@ -3602,6 +3618,11 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, return ret; } + /* Mark the pin_display early so that we account for the +* display coherency whilst setting up the cache domains. +*/ + obj-pin_display = true; + /* The display engine is not coherent with the LLC cache on gen6. As * a result, we make sure that the pinning that is about to occur is * done with uncached PTEs. This is lowest common denominator for all @@ -3613,7 +3634,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, */ ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); if (ret) - return ret; + goto err_unpin_display; /* As the user may map the buffer once pinned in the display plane * (e.g. libkms for the bootup splash), we have to ensure that we @@ -3621,7 +3642,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, */ ret = i915_gem_obj_ggtt_pin(obj, alignment, true, false); if (ret) - return ret; + goto err_unpin_display; i915_gem_object_flush_cpu_write_domain(obj); @@ -3639,6 +3660,17 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, old_write_domain); return 0; + +err_unpin_display: + obj-pin_display = is_pin_display(obj); + return ret; +} + +void +i915_gem_object_unpin_from_display_plane(struct
[Intel-gfx] [PATCH] drm/i915: Update rules for writing through the LLC with the cpu
As mentioned in the previous commit, reads and writes from both the CPU and GPU go through the LLC. This gives us coherency between the CPU and GPU irrespective of the attribute settings either device sets. We can use to avoid having to clflush even uncached memory. Except for the scanout. The scanout resides within another functional block that does not use the LLC but reads directly from main memory. So in order to maintain coherency with the scanout, writes to uncached memory must be flushed. In order to optimize writes elsewhere, we start tracking whether an framebuffer is attached to an object. v2: Use pin_display tracking rather than fb_count (to ensure we flush cursors as well etc) and only force the clflush along explicit writes to the scanout paths (i.e. pin_to_display_plane and pwrite into scanout). v3: Force the flush after hitting the slowpath in pwrite, as after dropping the lock the object's cache domain may be invalidated. (Ville) Based on a patch by Ville Syrjälä. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_gem.c| 58 -- drivers/gpu/drm/i915/i915_gem_exec.c | 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c| 2 +- 5 files changed, 36 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a6a1cc3..48a5463 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1860,7 +1860,7 @@ static inline bool i915_terminally_wedged(struct i915_gpu_error *error) } void i915_gem_reset(struct drm_device *dev); -void i915_gem_clflush_object(struct drm_i915_gem_object *obj); +void i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force); int __must_check i915_gem_object_set_tiling(struct drm_i915_gem_object *obj, int tiling_mode, int pitch); int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3880f05..650e0ac 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -38,7 +38,8 @@ #include linux/dma-buf.h static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj); -static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj); +static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj, + bool force); static __must_check int i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, struct i915_address_space *vm, @@ -68,6 +69,14 @@ static bool cpu_cache_is_coherent(struct drm_device *dev, return HAS_LLC(dev) || level != I915_CACHE_NONE; } +static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) +{ + if (!cpu_cache_is_coherent(obj-base.dev, obj-cache_level)) + return true; + + return obj-pin_display; +} + static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj) { if (obj-tiling_mode) @@ -830,8 +839,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, * write domain and manually flush cachelines (if required). This * optimizes for the case when the gpu will use the data * right away and we therefore have to clflush anyway. */ - if (obj-cache_level == I915_CACHE_NONE) - needs_clflush_after = 1; + needs_clflush_after = cpu_write_needs_clflush(obj); if (i915_gem_obj_bound_any(obj)) { ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) @@ -921,7 +929,7 @@ out: */ if (!needs_clflush_after obj-base.write_domain != I915_GEM_DOMAIN_CPU) { - i915_gem_clflush_object(obj); + i915_gem_clflush_object(obj, obj-pin_display); i915_gem_chipset_flush(dev); } } @@ -999,9 +1007,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, goto out; } - if (obj-cache_level == I915_CACHE_NONE - obj-tiling_mode == I915_TILING_NONE - obj-base.write_domain != I915_GEM_DOMAIN_CPU) { + if (obj-tiling_mode == I915_TILING_NONE + obj-base.write_domain != I915_GEM_DOMAIN_CPU + cpu_write_needs_clflush(obj)) { ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file); /* Note that the gtt paths might fail with non-page-backed user * pointers (e.g. gtt mappings when moving data between @@ -1350,8 +1358,8 @@ i915_gem_sw_finish_ioctl(struct
Re: [Intel-gfx] [PATCH] drm/i915: Track when an object is pinned for use by the display engine
On Fri, Aug 09, 2013 at 12:25:09PM +0100, Chris Wilson wrote: The display engine has unique coherency rules such that it requires special handling to ensure that all writes to cursors, scanouts and sprites are clflushed. This patch introduces the infrastructure to simply track when an object is being accessed by the display engine. v2: Explain the is_pin_display() magic as the sources for obj-pin_count and their individual rules is not obvious. (Ville) Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 36 ++-- drivers/gpu/drm/i915/intel_display.c | 8 4 files changed, 42 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index b06e6ce..463c660 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -117,6 +117,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, (name: %d), obj-base.name); if (obj-pin_count) seq_printf(m, (pinned x %d), obj-pin_count); + if (obj-pin_display) + seq_printf(m, (display)); if (obj-fence_reg != I915_FENCE_REG_NONE) seq_printf(m, (fence: %d), obj-fence_reg); list_for_each_entry(vma, obj-vma_list, vma_link) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 957f22e..a6a1cc3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1379,6 +1379,7 @@ struct drm_i915_gem_object { */ unsigned int fault_mappable:1; unsigned int pin_mappable:1; + unsigned int pin_display:1; /* * Is the GPU currently using a fence to access this buffer, @@ -1888,6 +1889,7 @@ int __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, struct intel_ring_buffer *pipelined); +void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj); int i915_gem_attach_phys_object(struct drm_device *dev, struct drm_i915_gem_object *obj, int id, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 065f927..3880f05 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3583,6 +3583,22 @@ unlock: return ret; } +static bool is_pin_display(struct drm_i915_gem_object *obj) +{ + /* There are 3 sources that pin objects: + * 1. The display engine (scanouts, sprites, cursors); + * 2. Reservations for execbuffer; + * 3. The user. + * + * We can ignore reservations as we hold the struct_mutex and + * are only called outside of the reservation path. The user + * can only increment pin_count once, and so if after + * subtracting the potential reference by the user, any pin_count + * remains, it must be due to another use by the display engine. + */ + return obj-pin_count - !!obj-user_pin_count; +} + /* * Prepare buffer for display plane (scanout, cursors, etc). * Can be called from an uninterruptible phase (modesetting) and allows @@ -3602,6 +3618,11 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, return ret; } + /* Mark the pin_display early so that we account for the + * display coherency whilst setting up the cache domains. + */ + obj-pin_display = true; + /* The display engine is not coherent with the LLC cache on gen6. As * a result, we make sure that the pinning that is about to occur is * done with uncached PTEs. This is lowest common denominator for all @@ -3613,7 +3634,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, */ ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); if (ret) - return ret; + goto err_unpin_display; /* As the user may map the buffer once pinned in the display plane * (e.g. libkms for the bootup splash), we have to ensure that we @@ -3621,7 +3642,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, */ ret = i915_gem_obj_ggtt_pin(obj, alignment, true, false); if (ret) - return ret; + goto err_unpin_display; i915_gem_object_flush_cpu_write_domain(obj); @@ -3639,6 +3660,17 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, old_write_domain); return 0; + +err_unpin_display: +
Re: [Intel-gfx] [PATCH] drm/i915: Update rules for writing through the LLC with the cpu
On Fri, Aug 09, 2013 at 12:26:45PM +0100, Chris Wilson wrote: As mentioned in the previous commit, reads and writes from both the CPU and GPU go through the LLC. This gives us coherency between the CPU and GPU irrespective of the attribute settings either device sets. We can use to avoid having to clflush even uncached memory. Except for the scanout. The scanout resides within another functional block that does not use the LLC but reads directly from main memory. So in order to maintain coherency with the scanout, writes to uncached memory must be flushed. In order to optimize writes elsewhere, we start tracking whether an framebuffer is attached to an object. v2: Use pin_display tracking rather than fb_count (to ensure we flush cursors as well etc) and only force the clflush along explicit writes to the scanout paths (i.e. pin_to_display_plane and pwrite into scanout). v3: Force the flush after hitting the slowpath in pwrite, as after dropping the lock the object's cache domain may be invalidated. (Ville) Based on a patch by Ville Syrjälä. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ville Syrjälä ville.syrj...@linux.intel.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Second HDMI port not visible
On Wed, Aug 07, 2013 at 01:29:53PM -0700, Jesse Barnes wrote: Chris's machine would be a good regression test for this. If it works for him too, I think we should push it. Well, the good news is that it adds another HDMI connection. The bad news is that the hardware refuses to acknowledge my connections anyway so I have no idea if it is now broken. Doesn't look for any reason like it should spontaneously break, so lgtm. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
A simple modeset, where we only wish to switch over to a new framebuffer such as the transition from fbcon to X, takes around 30-60ms. This is due to three factors: 1. We need to make sure the fb-obj is in the display domain, which incurs a cache flush to ensure no dirt is left on the scanout. 2. We need to flush any pending rendering before performing the mmio so that the frame is complete before it is shown. 3. We currently wait for the vblank after the mmio to be sure that the old fb is no longer being shown before releasing it. (1) can only be eliminated by userspace preparing the fb-obj in advance to already be in the display domain. This can be done through use of the create2 ioctl, or by reusing an existing fb-obj. However, (2) and (3) are already solved by the existing page flip mechanism, and it is surprisingly trivial to wire them up for use in the set-base fast path. Though it can be argued that this represents a subtle ABI break in that the set_config ioctl now returns before the old framebuffer is unpinned. The danger is that userspace will start to modify it before it is no longer being shown, however we should be able to prevent that through proper domain tracking. By combining all of the above, we can achieve an instaneous set_config: [ 6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal [ 6.601] (II) intel(0): Setting screen physical size to 677 x 381 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_display.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 809b968..c6eea51 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set) ret = intel_set_mode(set-crtc, set-mode, set-x, set-y, set-fb); } else if (config-fb_changed) { - intel_crtc_wait_for_pending_flips(set-crtc); - - ret = intel_pipe_set_base(set-crtc, - set-x, set-y, set-fb); + if (to_intel_framebuffer(set-fb)-obj-ring == NULL || + save_set.x != set-x || save_set.y != set-y || + intel_crtc_page_flip(set-crtc, set-fb, NULL)) { + intel_crtc_wait_for_pending_flips(set-crtc); + ret = intel_pipe_set_base(set-crtc, + set-x, set-y, set-fb); + } } if (ret) { -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Fix FB WM for HSW
From: Ville Syrjälä ville.syrj...@linux.intel.com Due to a misplaced memset(), we never actually enabled the FBC WM on HSW. Move the memset() to happen a bit earlier, so that it won't clobber results-enable_fbc_wm. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/intel_pm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index b0e4a0b..862e350 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2429,6 +2429,8 @@ static void hsw_compute_wm_results(struct drm_device *dev, break; max_level = level - 1; + memset(results, 0, sizeof(*results)); + /* The spec says it is preferred to disable FBC WMs instead of disabling * a WM level. */ results-enable_fbc_wm = true; @@ -2439,7 +2441,6 @@ static void hsw_compute_wm_results(struct drm_device *dev, } } - memset(results, 0, sizeof(*results)); for (wm_lp = 1; wm_lp = 3; wm_lp++) { const struct hsw_lp_wm_result *r; -- 1.8.1.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [patch v2] drm/i915: fix a limit check in hsw_compute_wm_results()
On Fri, Aug 09, 2013 at 01:29:33PM +0300, Ville Syrjälä wrote: On Fri, Aug 09, 2013 at 01:07:31PM +0300, Dan Carpenter wrote: The '!' here was not intended. Since '!' has higher precedence than compare, it means the check is never true. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com The culprit for Daniel: commit 71fff20ff1bb790f4defe0c880e028581ffab420 Author: Ville Syrjälä ville.syrj...@linux.intel.com Date: Tue Aug 6 22:24:03 2013 +0300 drm/i915: Kill fbc_enable from hsw_lp_wm_results Queued for -next, thanks for the patch. -Daniel --- v2: My first patch was wrong. diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 96234c6..0f5eb21 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2685,7 +2685,7 @@ static void hsw_compute_wm_results(struct drm_device *dev, * a WM level. */ results-enable_fbc_wm = true; for (level = 1; level = max_level; level++) { - if (!lp_results[level - 1].fbc_val lp_maximums-fbc) { + if (lp_results[level - 1].fbc_val lp_maximums-fbc) { results-enable_fbc_wm = false; lp_results[level - 1].fbc_val = 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: expose HDMI connectors on port C on BYT
Ryan noticed that on his board, HDMI was wired up to port C but not exposed by the kernel, which had only expected DP on that port. Fix that up by enumerating both ports if possible. Tested-by: Matsumura, Ryan ryan.matsum...@intel.com Acked-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_display.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index a0914db..fe37908 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9343,8 +9343,13 @@ static void intel_setup_outputs(struct drm_device *dev) intel_dp_init(dev, PCH_DP_D, PORT_D); } else if (IS_VALLEYVIEW(dev)) { /* Check for built-in panel first. Shares lanes with HDMI on SDVOC */ - if (I915_READ(VLV_DISPLAY_BASE + DP_C) DP_DETECTED) - intel_dp_init(dev, VLV_DISPLAY_BASE + DP_C, PORT_C); + if (I915_READ(VLV_DISPLAY_BASE + GEN4_HDMIC) SDVO_DETECTED) { + intel_hdmi_init(dev, VLV_DISPLAY_BASE + GEN4_HDMIC, + PORT_C); + if (I915_READ(VLV_DISPLAY_BASE + DP_C) DP_DETECTED) + intel_dp_init(dev, VLV_DISPLAY_BASE + DP_C, + PORT_C); + } if (I915_READ(VLV_DISPLAY_BASE + GEN4_HDMIB) SDVO_DETECTED) { intel_hdmi_init(dev, VLV_DISPLAY_BASE + GEN4_HDMIB, -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: expose HDMI connectors on port C on BYT
On Fri, Aug 09, 2013 at 09:34:35AM -0700, Jesse Barnes wrote: Ryan noticed that on his board, HDMI was wired up to port C but not exposed by the kernel, which had only expected DP on that port. Fix that up by enumerating both ports if possible. Tested-by: Matsumura, Ryan ryan.matsum...@intel.com Acked-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_display.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index a0914db..fe37908 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9343,8 +9343,13 @@ static void intel_setup_outputs(struct drm_device *dev) intel_dp_init(dev, PCH_DP_D, PORT_D); } else if (IS_VALLEYVIEW(dev)) { /* Check for built-in panel first. Shares lanes with HDMI on SDVOC */ - if (I915_READ(VLV_DISPLAY_BASE + DP_C) DP_DETECTED) - intel_dp_init(dev, VLV_DISPLAY_BASE + DP_C, PORT_C); + if (I915_READ(VLV_DISPLAY_BASE + GEN4_HDMIC) SDVO_DETECTED) { + intel_hdmi_init(dev, VLV_DISPLAY_BASE + GEN4_HDMIC, + PORT_C); + if (I915_READ(VLV_DISPLAY_BASE + DP_C) DP_DETECTED) + intel_dp_init(dev, VLV_DISPLAY_BASE + DP_C, + PORT_C); + } Spaces instead of tabs in the above hunk. Tsk, tsk ... Fixed while applying. -Daniel if (I915_READ(VLV_DISPLAY_BASE + GEN4_HDMIB) SDVO_DETECTED) { intel_hdmi_init(dev, VLV_DISPLAY_BASE + GEN4_HDMIB, -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix FB WM for HSW
2013/8/9 ville.syrj...@linux.intel.com: From: Ville Syrjälä ville.syrj...@linux.intel.com Due to a misplaced memset(), we never actually enabled the FBC WM on HSW. Move the memset() to happen a bit earlier, so that it won't clobber results-enable_fbc_wm. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com Yay! Thanks! Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/intel_pm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index b0e4a0b..862e350 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2429,6 +2429,8 @@ static void hsw_compute_wm_results(struct drm_device *dev, break; max_level = level - 1; + memset(results, 0, sizeof(*results)); + /* The spec says it is preferred to disable FBC WMs instead of disabling * a WM level. */ results-enable_fbc_wm = true; @@ -2439,7 +2441,6 @@ static void hsw_compute_wm_results(struct drm_device *dev, } } - memset(results, 0, sizeof(*results)); for (wm_lp = 1; wm_lp = 3; wm_lp++) { const struct hsw_lp_wm_result *r; -- 1.8.1.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Updated drm-intel-testing
Hi all, New testing cycle, new features: - Cleanup of the old crtc helper callbacks, all encoders are now converted to the i915 modeset infrastructure. - Massive amount of wm patches from Ville for ilk, snb, ivb, hsw, this is prep work to eventually get things going for nuclear pageflips where we need to adjust watermarks on the fly. - More vm/vma patches from Ben. This refactoring isn't yet fully rolled out, we miss the execbuf conversion and some of the low-level bind/unbind support code. - Convert our hdmi infoframe code to use the new common helper functions (Damien). This contains some bugfixes for the common infoframe helpers. - Some cruft removal from Damien. - Various smaller bitspieces all over, as usual. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote: A simple modeset, where we only wish to switch over to a new framebuffer such as the transition from fbcon to X, takes around 30-60ms. This is due to three factors: 1. We need to make sure the fb-obj is in the display domain, which incurs a cache flush to ensure no dirt is left on the scanout. 2. We need to flush any pending rendering before performing the mmio so that the frame is complete before it is shown. 3. We currently wait for the vblank after the mmio to be sure that the old fb is no longer being shown before releasing it. (1) can only be eliminated by userspace preparing the fb-obj in advance to already be in the display domain. This can be done through use of the create2 ioctl, or by reusing an existing fb-obj. However, (2) and (3) are already solved by the existing page flip mechanism, and it is surprisingly trivial to wire them up for use in the set-base fast path. Though it can be argued that this represents a subtle ABI break in that the set_config ioctl now returns before the old framebuffer is unpinned. The danger is that userspace will start to modify it before it is no longer being shown, however we should be able to prevent that through proper domain tracking. Hm, right now we don't prevent anyone from rendering into a to-be-flipped out buffer. There was once code in it, using MI_WAIT_EVENT but we've ripped it out. I guess we could just throw in a synchronous stall on the flip queue though, that should work always. Testing would be easy if we have the crtc CRC stuff, but that's atm stuck due to lack of volunteers ... Overall I really like the idea and I think doing most of the plane enabling (including psr, fbc, ips, and all that stuff which potentially blows through a wblank wait) should be done in async work queues. That should then also help resume time a lot. Cheers, Daniel By combining all of the above, we can achieve an instaneous set_config: [ 6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal [ 6.601] (II) intel(0): Setting screen physical size to 677 x 381 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_display.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 809b968..c6eea51 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9077,10 +9077,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set) ret = intel_set_mode(set-crtc, set-mode, set-x, set-y, set-fb); } else if (config-fb_changed) { - intel_crtc_wait_for_pending_flips(set-crtc); - - ret = intel_pipe_set_base(set-crtc, - set-x, set-y, set-fb); + if (to_intel_framebuffer(set-fb)-obj-ring == NULL || + save_set.x != set-x || save_set.y != set-y || + intel_crtc_page_flip(set-crtc, set-fb, NULL)) { + intel_crtc_wait_for_pending_flips(set-crtc); + ret = intel_pipe_set_base(set-crtc, + set-x, set-y, set-fb); + } } if (ret) { -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6.1/9] drm/i915: don't queue PM events we won't process
From: Paulo Zanoni paulo.r.zan...@intel.com On SNB/IVB/VLV we only call gen6_rps_irq_handler if one of the IIR bits set is part of GEN6_PM_RPS_EVENTS, but at gen6_rps_irq_handler we add all the enabled IIR bits to the work queue, not only the ones that are part of GEN6_PM_RPS_EVENTS. But then gen6_pm_rps_work only processes GEN6_PM_RPS_EVENTS, so it's useless to add anything that's not GEN6_PM_RPS_EVENTS to the work queue. As a bonus, gen6_rps_irq_handler looks more similar to hsw_pm_irq_handler, so we may be able to merge them in the future. Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 0f46d33..5b51c43 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -959,7 +959,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, */ spin_lock(dev_priv-irq_lock); - dev_priv-rps.pm_iir |= pm_iir; + dev_priv-rps.pm_iir |= pm_iir GEN6_PM_RPS_EVENTS; snb_set_pm_irq(dev_priv, dev_priv-rps.pm_iir); spin_unlock(dev_priv-irq_lock); @@ -1128,7 +1128,7 @@ static irqreturn_t valleyview_irq_handler(int irq, void *arg) if (pipe_stats[0] PIPE_GMBUS_INTERRUPT_STATUS) gmbus_irq_handler(dev); - if (pm_iir GEN6_PM_RPS_EVENTS) + if (pm_iir) gen6_rps_irq_handler(dev_priv, pm_iir); I915_WRITE(GTIIR, gt_iir); @@ -1433,7 +1433,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) if (pm_iir) { if (IS_HASWELL(dev)) hsw_pm_irq_handler(dev_priv, pm_iir); - else if (pm_iir GEN6_PM_RPS_EVENTS) + else gen6_rps_irq_handler(dev_priv, pm_iir); I915_WRITE(GEN6_PMIIR, pm_iir); ret = IRQ_HANDLED; -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6.3/9] drm/i915: merge HSW and SNB PM irq handlers
From: Paulo Zanoni paulo.r.zan...@intel.com Because hsw_pm_irq_handler does exactly what gen6_rps_irq_handler does and also processes the 2 additional VEBOX bits. So merge those functions and wrap the VEBOX bits on an IS_HASWELL check. This HSW check isn't really necessary since the bits are reserved on SNB/IVB/VLV, but it's a good documentation on who uses them. Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 50 ++--- 1 file changed, 12 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index d9ebfb6..8ba5d0a 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -939,28 +939,6 @@ static void snb_gt_irq_handler(struct drm_device *dev, ivybridge_parity_error_irq_handler(dev); } -/* Legacy way of handling PM interrupts */ -static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, -u32 pm_iir) -{ - /* -* IIR bits should never already be set because IMR should -* prevent an interrupt from being shown in IIR. The warning -* displays a case where we've unsafely cleared -* dev_priv-rps.pm_iir. Although missing an interrupt of the same -* type is not a problem, it displays a problem in the logic. -* -* The mask bit in IMR is cleared by dev_priv-rps.work. -*/ - - spin_lock(dev_priv-irq_lock); - dev_priv-rps.pm_iir |= pm_iir GEN6_PM_RPS_EVENTS; - snb_disable_pm_irq(dev_priv, pm_iir GEN6_PM_RPS_EVENTS); - spin_unlock(dev_priv-irq_lock); - - queue_work(dev_priv-wq, dev_priv-rps.work); -} - #define HPD_STORM_DETECT_PERIOD 1000 #define HPD_STORM_THRESHOLD 5 @@ -1027,13 +1005,10 @@ static void dp_aux_irq_handler(struct drm_device *dev) wake_up_all(dev_priv-gmbus_wait_queue); } -/* Unlike gen6_rps_irq_handler() from which this function is originally derived, - * we must be able to deal with other PM interrupts. This is complicated because - * of the way in which we use the masks to defer the RPS work (which for - * posterity is necessary because of forcewake). - */ -static void hsw_pm_irq_handler(struct drm_i915_private *dev_priv, - u32 pm_iir) +/* The RPS events need forcewake, so we add them to a work queue and mask their + * IMR bits until the work is done. Other interrupts can be processed without + * the work queue. */ +static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir) { if (pm_iir GEN6_PM_RPS_EVENTS) { spin_lock(dev_priv-irq_lock); @@ -1044,12 +1019,14 @@ static void hsw_pm_irq_handler(struct drm_i915_private *dev_priv, queue_work(dev_priv-wq, dev_priv-rps.work); } - if (pm_iir PM_VEBOX_USER_INTERRUPT) - notify_ring(dev_priv-dev, dev_priv-ring[VECS]); + if (IS_HASWELL(dev_priv-dev)) { + if (pm_iir PM_VEBOX_USER_INTERRUPT) + notify_ring(dev_priv-dev, dev_priv-ring[VECS]); - if (pm_iir PM_VEBOX_CS_ERROR_INTERRUPT) { - DRM_ERROR(VEBOX CS error interrupt 0x%08x\n, pm_iir); - i915_handle_error(dev_priv-dev, false); + if (pm_iir PM_VEBOX_CS_ERROR_INTERRUPT) { + DRM_ERROR(VEBOX CS error interrupt 0x%08x\n, pm_iir); + i915_handle_error(dev_priv-dev, false); + } } } @@ -1424,10 +1401,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) if (INTEL_INFO(dev)-gen = 6) { u32 pm_iir = I915_READ(GEN6_PMIIR); if (pm_iir) { - if (IS_HASWELL(dev)) - hsw_pm_irq_handler(dev_priv, pm_iir); - else - gen6_rps_irq_handler(dev_priv, pm_iir); + gen6_rps_irq_handler(dev_priv, pm_iir); I915_WRITE(GEN6_PMIIR, pm_iir); ret = IRQ_HANDLED; } -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6.2/9] drm/i915: fix how we mask PMIMR when adding work to the queue
From: Paulo Zanoni paulo.r.zan...@intel.com It seems we've been doing this ever since we started processing the RPS events on a work queue, on commit drm/i915: move gen6 rps handling to workqueue, 4912d04193733a825216b926ffd290fada88ab07. The problem is: when we add work to the queue, instead of just masking the bits we queued and leaving all the others on their current state, we mask the bits we queued and unmask all the others. This basically means we'll be unmasking a bunch of interrupts we're not going to process. And if you look at gen6_pm_rps_work, we unmask back only GEN6_PM_RPS_EVENTS, which means the bits we unmasked when adding work to the queue will remain unmasked after we process the queue. Notice that even though we unmask those unrelated interrupts, we never enable them on IER, so they don't fire our interrupt handler, they just stay there on IIR waiting to be cleared when something else triggers the interrupt handler. So this patch does what seems to make more sense: mask only the bits we add to the queue, without unmasking anything else, and so we'll unmask them after we process the queue. As a side effect we also have to remove that WARN, because it is not only making sure we don't mask useful interrupts, it is also making sure we do unmask useless interrupts! That piece of code should not be responsible for knowing which bits should be unmasked, so just don't assert anything, and trust that snb_disable_pm_irq should be doing the right thing. With i915.enable_pc8=1 I was getting ocasional GEN6_PMIIR is not 0 error messages due to the fact that we unmask those unrelated interrupts but don't enable them. Note: if bugs start bisecting to this patch, then it probably means someone was relying on the fact that we unmask everything by accident, then we should fix gen5_gt_irq_postinstall or whoever needs the accidentally unmasked interrupts. Or maybe I was just wrong and we need to revert this patch :) Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5b51c43..d9ebfb6 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -167,11 +167,6 @@ void snb_disable_pm_irq(struct drm_i915_private *dev_priv, uint32_t mask) snb_update_pm_irq(dev_priv, mask, 0); } -static void snb_set_pm_irq(struct drm_i915_private *dev_priv, uint32_t val) -{ - snb_update_pm_irq(dev_priv, 0x, ~val); -} - static bool ivb_can_enable_err_int(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -960,7 +955,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, spin_lock(dev_priv-irq_lock); dev_priv-rps.pm_iir |= pm_iir GEN6_PM_RPS_EVENTS; - snb_set_pm_irq(dev_priv, dev_priv-rps.pm_iir); + snb_disable_pm_irq(dev_priv, pm_iir GEN6_PM_RPS_EVENTS); spin_unlock(dev_priv-irq_lock); queue_work(dev_priv-wq, dev_priv-rps.work); @@ -1043,9 +1038,7 @@ static void hsw_pm_irq_handler(struct drm_i915_private *dev_priv, if (pm_iir GEN6_PM_RPS_EVENTS) { spin_lock(dev_priv-irq_lock); dev_priv-rps.pm_iir |= pm_iir GEN6_PM_RPS_EVENTS; - snb_set_pm_irq(dev_priv, dev_priv-rps.pm_iir); - /* never want to mask useful interrupts. */ - WARN_ON(I915_READ_NOTRACE(GEN6_PMIMR) ~GEN6_PM_RPS_EVENTS); + snb_disable_pm_irq(dev_priv, pm_iir GEN6_PM_RPS_EVENTS); spin_unlock(dev_priv-irq_lock); queue_work(dev_priv-wq, dev_priv-rps.work); -- 1.8.1.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Asynchronously perform the set-base for a simple modeset
On Fri, Aug 09, 2013 at 09:17:11PM +0200, Daniel Vetter wrote: On Fri, Aug 09, 2013 at 03:13:22PM +0100, Chris Wilson wrote: A simple modeset, where we only wish to switch over to a new framebuffer such as the transition from fbcon to X, takes around 30-60ms. This is due to three factors: 1. We need to make sure the fb-obj is in the display domain, which incurs a cache flush to ensure no dirt is left on the scanout. 2. We need to flush any pending rendering before performing the mmio so that the frame is complete before it is shown. 3. We currently wait for the vblank after the mmio to be sure that the old fb is no longer being shown before releasing it. (1) can only be eliminated by userspace preparing the fb-obj in advance to already be in the display domain. This can be done through use of the create2 ioctl, or by reusing an existing fb-obj. However, (2) and (3) are already solved by the existing page flip mechanism, and it is surprisingly trivial to wire them up for use in the set-base fast path. Though it can be argued that this represents a subtle ABI break in that the set_config ioctl now returns before the old framebuffer is unpinned. The danger is that userspace will start to modify it before it is no longer being shown, however we should be able to prevent that through proper domain tracking. Hm, right now we don't prevent anyone from rendering into a to-be-flipped out buffer. There was once code in it, using MI_WAIT_EVENT but we've ripped it out. I guess we could just throw in a synchronous stall on the flip queue though, that should work always. I'm glad we did. I'd rather put that into userspace rather than have the kernel impose that policy on everybody, as for X that is exactly the behaviour we want (i.e. not blocking rendering on the next scanout). Testing would be easy if we have the crtc CRC stuff, but that's atm stuck due to lack of volunteers ... Overall I really like the idea and I think doing most of the plane enabling (including psr, fbc, ips, and all that stuff which potentially blows through a wblank wait) should be done in async work queues. That should then also help resume time a lot. I'd also like to hear Ville's opinion since with his atomic modesetting I hope we will be able to achieve something very similar. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 7/9] drm/i915: allow package C8+ states on Haswell (disabled)
From: Paulo Zanoni paulo.r.zan...@intel.com This patch allows PC8+ states on Haswell. These states can only be reached when all the display outputs are disabled, and they allow some more power savings. The fact that the graphics device is allowing PC8+ doesn't mean that the machine will actually enter PC8+: all the other devices also need to allow PC8+. For now this option is disabled by default. You need i915.allow_pc8=1 if you want it. This patch adds a big comment inside i915_drv.h explaining how it works and how it tracks things. Read it. v2: (this is not really v2, many previous versions were already sent, but they had different names) - Use the new functions to enable/disable GTIMR and GEN6_PMIMR - Rename almost all variables and functions to names suggested by Chris - More WARNs on the IRQ handling code - Also disable PC8 when there's GPU work to do (thanks to Ben for the help on this), so apps can run caster - Enable PC8 on a delayed work function that is delayed for 5 seconds. This makes sure we only enable PC8+ if we're really idle - Make sure we're not in PC8+ when suspending v3: - WARN if IRQs are disabled on __wait_seqno - Replace some DRM_ERRORs with WARNs - Fix calls to restore GT and PM interrupts - Use intel_mark_busy instead of intel_ring_advance to disable PC8 v4: - Use the force_wake, Luke! Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/i915_dma.c | 10 ++ drivers/gpu/drm/i915/i915_drv.c | 11 +++ drivers/gpu/drm/i915/i915_drv.h | 72 +++ drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_irq.c | 106 + drivers/gpu/drm/i915/intel_display.c | 174 ++- drivers/gpu/drm/i915/intel_dp.c | 3 + drivers/gpu/drm/i915/intel_drv.h | 7 ++ drivers/gpu/drm/i915/intel_i2c.c | 2 + drivers/gpu/drm/i915/intel_pm.c | 11 +++ 10 files changed, 397 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 0adfe40..9fc22f9 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1483,6 +1483,14 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) mutex_init(dev_priv-rps.hw_lock); mutex_init(dev_priv-modeset_restore_lock); + mutex_init(dev_priv-pc8.lock); + dev_priv-pc8.requirements_met = false; + dev_priv-pc8.gpu_idle = false; + dev_priv-pc8.irqs_disabled = false; + dev_priv-pc8.enabled = false; + dev_priv-pc8.disable_count = 2; /* requirements_met + gpu_idle */ + INIT_DELAYED_WORK(dev_priv-pc8.enable_work, hsw_enable_pc8_work); + i915_dump_device_info(dev_priv); if (i915_get_bridge_dev(dev)) { @@ -1724,6 +1732,8 @@ int i915_driver_unload(struct drm_device *dev) cancel_work_sync(dev_priv-gpu_error.work); i915_destroy_error_state(dev); + cancel_delayed_work_sync(dev_priv-pc8.enable_work); + if (dev-pdev-msi_enabled) pci_disable_msi(dev-pdev); diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 13457e3e..6169c92 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -141,6 +141,10 @@ module_param_named(fastboot, i915_fastboot, bool, 0600); MODULE_PARM_DESC(fastboot, Try to skip unnecessary mode sets at boot time (default: false)); +int i915_enable_pc8 __read_mostly = 0; +module_param_named(enable_pc8, i915_enable_pc8, int, 0600); +MODULE_PARM_DESC(enable_pc8, Enable support for low power package C states (PC8+) (default: false)); + bool i915_prefault_disable __read_mostly; module_param_named(prefault_disable, i915_prefault_disable, bool, 0600); MODULE_PARM_DESC(prefault_disable, @@ -557,6 +561,9 @@ static int i915_drm_freeze(struct drm_device *dev) dev_priv-modeset_restore = MODESET_SUSPENDED; mutex_unlock(dev_priv-modeset_restore_lock); + /* We do a lot of poking in a lot of registers, make sure they work +* properly. */ + hsw_disable_package_c8(dev_priv); intel_set_power_well(dev, true); drm_kms_helper_poll_disable(dev); @@ -713,6 +720,10 @@ static int __i915_drm_thaw(struct drm_device *dev) schedule_work(dev_priv-console_resume_work); } + /* Undo what we did at i915_drm_freeze so the refcount goes back to the +* expected level. */ + hsw_enable_package_c8(dev_priv); + mutex_lock(dev_priv-modeset_restore_lock); dev_priv-modeset_restore = MODESET_DONE; mutex_unlock(dev_priv-modeset_restore_lock); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2c7c098..fbcb0ca 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1071,6 +1071,75 @@ struct intel_wm_level {
Re: [Intel-gfx] [PATCH 7/9] drm/i915: allow package C8+ states on Haswell (disabled)
Quick note... On Fri, Aug 09, 2013 at 05:10:05PM -0300, Paulo Zanoni wrote: + WARN_ON(!mutex_is_locked(dev_priv-pc8.lock)); Preferred form is now lockdep_assert_held(dev_priv-pc8.lock); -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 7/9] drm/i915: allow package C8+ states on Haswell (disabled)
On Fri, Aug 09, 2013 at 05:10:05PM -0300, Paulo Zanoni wrote: +/* Disable interrupts so we can allow Package C8+. */ +void hsw_pc8_disable_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-irq_lock, irqflags); + + dev_priv-pc8.regsave.deimr = I915_READ(DEIMR); + dev_priv-pc8.regsave.sdeimr = I915_READ(SDEIMR); + dev_priv-pc8.regsave.gtimr = I915_READ(GTIMR); + dev_priv-pc8.regsave.gtier = I915_READ(GTIER); + dev_priv-pc8.regsave.gen6_pmimr = I915_READ(GEN6_PMIMR); + + ironlake_disable_display_irq(dev_priv, ~DE_PCH_EVENT_IVB); + ibx_disable_display_interrupt(dev_priv, ~SDE_HOTPLUG_MASK_CPT); + ilk_disable_gt_irq(dev_priv, 0x); + snb_disable_pm_irq(dev_priv, 0x); + + dev_priv-pc8.irqs_disabled = true; + + WARN(I915_READ(DEIIR), DEIIR is not 0\n); + WARN(I915_READ(SDEIIR), SDEIIR is not 0\n); + WARN(I915_READ(GTIIR), GTIIR is not 0\n); + WARN(I915_READ(GEN6_PMIIR), GEN6_PMIIR is not 0\n); I keep looking at this, because we will hit these warns. But I also don't think it a problem, as the interrupt handle will run as soon as we release the irq_lock and that will be the final time until we reenable interrupts. (Just kill the WARNs.) Now, why IMR and not IER? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 7/9] drm/i915: allow package C8+ states on Haswell (disabled)
2013/8/9 Chris Wilson ch...@chris-wilson.co.uk: On Fri, Aug 09, 2013 at 05:10:05PM -0300, Paulo Zanoni wrote: +/* Disable interrupts so we can allow Package C8+. */ +void hsw_pc8_disable_interrupts(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + unsigned long irqflags; + + spin_lock_irqsave(dev_priv-irq_lock, irqflags); + + dev_priv-pc8.regsave.deimr = I915_READ(DEIMR); + dev_priv-pc8.regsave.sdeimr = I915_READ(SDEIMR); + dev_priv-pc8.regsave.gtimr = I915_READ(GTIMR); + dev_priv-pc8.regsave.gtier = I915_READ(GTIER); + dev_priv-pc8.regsave.gen6_pmimr = I915_READ(GEN6_PMIMR); + + ironlake_disable_display_irq(dev_priv, ~DE_PCH_EVENT_IVB); + ibx_disable_display_interrupt(dev_priv, ~SDE_HOTPLUG_MASK_CPT); + ilk_disable_gt_irq(dev_priv, 0x); + snb_disable_pm_irq(dev_priv, 0x); + + dev_priv-pc8.irqs_disabled = true; + + WARN(I915_READ(DEIIR), DEIIR is not 0\n); + WARN(I915_READ(SDEIIR), SDEIIR is not 0\n); + WARN(I915_READ(GTIIR), GTIIR is not 0\n); + WARN(I915_READ(GEN6_PMIIR), GEN6_PMIIR is not 0\n); I keep looking at this, because we will hit these warns. But I also don't think it a problem, as the interrupt handle will run as soon as we release the irq_lock and that will be the final time until we reenable interrupts. (Just kill the WARNs.) I also thought we were going to hit these WARNs, but I don't ever hit any of them in my current tree, even if I change the default PC8 timeout from 5s to 0.1s (which makes us enter/leave PC8+ very often), so they never bothered me enough to be killed. Maybe the race condition is too difficult to hit... But I can remove the WARNs. Now, why IMR and not IER? Because on our driver, when we want to enable/disable interrupts while everything is running we only use IMR, so our code is designed in a way that if you grab irq_lock and update IMR with the appropriate function you should be safe. On the other hand, some IER registers are updated inside the irq handlers (like DEIER and SDEIER at ironlake_irq_handler), so touching these is not as trivial. The only advantage of using IER would be to preserve the IIR bits, but according to Arthur the HW does not save/restore the IIR registers when entering/leaving PC8, so no advantage. -Chris -- Chris Wilson, Intel Open Source Technology Centre -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 7/9] drm/i915: allow package C8+ states on Haswell (disabled)
2013/8/9 Chris Wilson ch...@chris-wilson.co.uk: Quick note... On Fri, Aug 09, 2013 at 05:10:05PM -0300, Paulo Zanoni wrote: + WARN_ON(!mutex_is_locked(dev_priv-pc8.lock)); Preferred form is now lockdep_assert_held(dev_priv-pc8.lock); Should I also convert all our other usages of WARN_ON(!mutex_is_locked()) and BUG_ON(!mutex_is_locked()) too? On a separate patch, of course. We have currently no usage of lockdep_assert_held, and I like consistency, so fully switching to the preferred form is good IMHO. -Chris -- Chris Wilson, Intel Open Source Technology Centre -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Don't load context at driver init time on SNB
This is a partial revert of b4ae3f22d238617ca11610b29fde16cf8c0bc6e0 (drm/i915: load boot context at driver init time) This bit breaks hardware video decode for me after resume. Signed-off-by: Stéphane Marchesin marc...@chromium.org --- drivers/gpu/drm/i915/intel_pm.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index f895d15..ffa4ab2 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4614,10 +4614,6 @@ static void gen6_init_clock_gating(struct drm_device *dev) ILK_DPARBUNIT_CLOCK_GATE_ENABLE | ILK_DPFDUNIT_CLOCK_GATE_ENABLE); - /* WaMbcDriverBootEnable:snb */ - I915_WRITE(GEN6_MBCTL, I915_READ(GEN6_MBCTL) | - GEN6_MBCTL_ENABLE_BOOT_FETCH); - g4x_disable_trickle_feed(dev); /* The default value should be 0x200 according to docs, but the two -- 1.8.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Don't load context at driver init time on SNB
On Fri, Aug 09, 2013 at 08:32:54PM -0700, Stéphane Marchesin wrote: This is a partial revert of b4ae3f22d238617ca11610b29fde16cf8c0bc6e0 (drm/i915: load boot context at driver init time) This bit breaks hardware video decode for me after resume. Signed-off-by: Stéphane Marchesin marc...@chromium.org --- drivers/gpu/drm/i915/intel_pm.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index f895d15..ffa4ab2 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4614,10 +4614,6 @@ static void gen6_init_clock_gating(struct drm_device *dev) ILK_DPARBUNIT_CLOCK_GATE_ENABLE | ILK_DPFDUNIT_CLOCK_GATE_ENABLE); - /* WaMbcDriverBootEnable:snb */ - I915_WRITE(GEN6_MBCTL, I915_READ(GEN6_MBCTL) | -GEN6_MBCTL_ENABLE_BOOT_FETCH); - g4x_disable_trickle_feed(dev); /* The default value should be 0x200 according to docs, but the two I was looking at this a bit with Stéphane. One thing we both see is that the bit isn't sticking as it should. Clearly doing the write is having an effect, but something is seriously wrong. Writing the bit manually with IGT does make the bit stick. Stéphane, could you try to write the bit with IGT before and after resume, and see if it helps? Adding Jesse to the CC, since he wrote the original patch, maybe he has an idea why it doesn't stick. -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/5] drm: WARN when removing unallocated node
The conditional is usually a recoverable driver bug, and so WARNing, and preventing the drm_mm code from doing potential damage (BUG) is desirable. This issue was hit and fixed twice while developing the i915 multiple address space code. The first fix is the patch just before this, and is hit on an not frequently occuring error path. Another was fixed during patch iteration, so it's hard to see from the patch: commit c6cfb325677ea6305fb19acf3a4d14ea267f923e Author: Ben Widawsky b...@bwidawsk.net Date: Fri Jul 5 14:41:06 2013 -0700 drm/i915: Embed drm_mm_node in i915 gem obj From the intel-gfx mailing list, we discussed this: References: 20130705191235.ga3...@bwidawsk.net Cc: Dave Airlie airl...@redhat.com CC: dri-de...@lists.freedesktop.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/drm_mm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index aded1e1..af93cc5 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -254,6 +254,9 @@ void drm_mm_remove_node(struct drm_mm_node *node) struct drm_mm *mm = node-mm; struct drm_mm_node *prev_node; + if (WARN_ON(!node-allocated)) + return; + BUG_ON(node-scanned_block || node-scanned_prev_free || node-scanned_next_free); -- 1.8.3.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/5] drm/i915: Remove node only when allocated
In upcoming code, it will be possible for a vma to have been created, but no space reserved for it in the address space. The drm_mm semantics are such that trying to remove an unallocated node is not allowed. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a60c773..c287072 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2629,7 +2629,8 @@ int i915_vma_unbind(struct i915_vma *vma) if (i915_is_ggtt(vma-vm)) obj-map_and_fenceable = true; - drm_mm_remove_node(vma-node); + if (drm_mm_node_allocated(vma-node)) + drm_mm_remove_node(vma-node); i915_gem_vma_destroy(vma); /* Since the unbound list is global, only move to that list if -- 1.8.3.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/5] drm/i915: bind mf cleanup
Cleanup the map and fenceable setting during bind to make more sense, and not check i915_is_ggtt() 2 unnecessary times Recommended-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index b91a7f0..a60c773 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3173,18 +3173,16 @@ search_free: list_move_tail(obj-global_list, dev_priv-mm.bound_list); list_add_tail(vma-mm_list, vm-inactive_list); - fenceable = - i915_is_ggtt(vm) - i915_gem_obj_ggtt_size(obj) == fence_size - (i915_gem_obj_ggtt_offset(obj) (fence_alignment - 1)) == 0; + if (i915_is_ggtt(vm)) { + fenceable = + i915_gem_obj_ggtt_size(obj) == fence_size + (i915_gem_obj_ggtt_offset(obj) (fence_alignment - 1)) == 0; - mappable = - i915_is_ggtt(vm) - vma-node.start + obj-base.size = dev_priv-gtt.mappable_end; + mappable = + vma-node.start + obj-base.size = dev_priv-gtt.mappable_end; - /* Map and fenceable only changes if the VM is the global GGTT */ - if (i915_is_ggtt(vm)) obj-map_and_fenceable = mappable fenceable; + } WARN_ON(map_and_fenceable !obj-map_and_fenceable); -- 1.8.3.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/5] drm/i915: WARN_ON failed map_and_fenceable
I just noticed in our code we don't really check the assertion, and given some of the code I am changing in this area, I feel a WARN is very nice to have. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 498ef8a..b91a7f0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3186,6 +3186,8 @@ search_free: if (i915_is_ggtt(vm)) obj-map_and_fenceable = mappable fenceable; + WARN_ON(map_and_fenceable !obj-map_and_fenceable); + trace_i915_vma_bind(vma, map_and_fenceable); i915_gem_verify_gtt(dev); return 0; -- 1.8.3.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/5] drm/i915: Convert execbuf code to use vmas
From: Ben Widawsky b...@bwidawsk.net In order to transition more of our code over to using a VMA instead of an OBJ, VM pair - we must have the vma accessible at execbuf time. Up until now, we've only had a VMA when actually binding an object. The previous patch helped handle the distinction on bound vs. unbound. This patch will help us catch leaks, and other issues before we actually shuffle a bunch of stuff around. This attempts to convert all the execbuf code to speak in vmas. Since the execbuf code is very self contained it was a nice isolated conversion. The meat of the code is about turning eb_objects into eb_vma, and then wiring up the rest of the code to use vmas instead of obj, vm pairs. Unfortunately, to do this, we must move the exec_list link from the obj structure. This list is reused in the eviction code, so we must also modify the eviction code to make this work. v2: Release table lock early, and two a 2 phase vma lookup to avoid having to use a GFP_ATOMIC. (Chris) v3: s/obj_exec_list/obj_exec_link/ Updates to address commit 6d2b888569d366beb4be72cacfde41adee2c25e1 Author: Chris Wilson ch...@chris-wilson.co.uk Date: Wed Aug 7 18:30:54 2013 +0100 drm/i915: List objects allocated from stolen memory in debugfs Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_debugfs.c| 12 +- drivers/gpu/drm/i915/i915_drv.h| 25 ++- drivers/gpu/drm/i915/i915_gem.c| 28 ++- drivers/gpu/drm/i915/i915_gem_evict.c | 31 ++- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 328 - 5 files changed, 235 insertions(+), 189 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 1a87cc9..1056747 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -193,9 +193,9 @@ static int obj_rank_by_stolen(void *priv, struct list_head *A, struct list_head *B) { struct drm_i915_gem_object *a = - container_of(A, struct drm_i915_gem_object, exec_list); + container_of(A, struct drm_i915_gem_object, obj_exec_link); struct drm_i915_gem_object *b = - container_of(B, struct drm_i915_gem_object, exec_list); + container_of(B, struct drm_i915_gem_object, obj_exec_link); return a-stolen-start - b-stolen-start; } @@ -219,7 +219,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (obj-stolen == NULL) continue; - list_add(obj-exec_list, stolen); + list_add(obj-obj_exec_link, stolen); total_obj_size += obj-base.size; total_gtt_size += i915_gem_obj_ggtt_size(obj); @@ -229,7 +229,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (obj-stolen == NULL) continue; - list_add(obj-exec_list, stolen); + list_add(obj-obj_exec_link, stolen); total_obj_size += obj-base.size; count++; @@ -237,11 +237,11 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) list_sort(NULL, stolen, obj_rank_by_stolen); seq_puts(m, Stolen:\n); while (!list_empty(stolen)) { - obj = list_first_entry(stolen, typeof(*obj), exec_list); + obj = list_first_entry(stolen, typeof(*obj), obj_exec_link); seq_puts(m,); describe_obj(m, obj); seq_putc(m, '\n'); - list_del_init(obj-exec_list); + list_del_init(obj-obj_exec_link); } mutex_unlock(dev-struct_mutex); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6141253..6bbf1e7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -562,6 +562,17 @@ struct i915_vma { struct list_head mm_list; struct list_head vma_link; /* Link in the object's VMA list */ + + /** This vma's place in the batchbuffer or on the eviction list */ + struct list_head exec_list; + + /** +* Used for performing relocations during execbuffer insertion. +*/ + struct hlist_node exec_node; + unsigned long exec_handle; + struct drm_i915_gem_exec_object2 *exec_entry; + }; struct i915_ctx_hang_stats { @@ -1311,8 +1322,8 @@ struct drm_i915_gem_object { struct list_head global_list; struct list_head ring_list; - /** This object's place in the batchbuffer or on the eviction list */ - struct list_head exec_list; + /** Used in execbuf to temporarily hold a ref */ + struct list_head obj_exec_link; /** * This is set if the object is on the active lists (has pending @@ -1397,13 +1408,6 @@ struct drm_i915_gem_object { void *dma_buf_vmapping; int