Re: [Intel-gfx] [PATCH] intel-decode: Fix gen6 HIER_DEPTH_BUFFER decoding
Daniel Vetter daniel.vet...@ffwll.ch writes: It accidentally used the cmd id for the gen7 command and had an outdated lenght field. Spotted while trying to make sense of an ivb error_state from mesa 7.11 ... Reviewed-by: Eric Anholt e...@anholt.net pgpFNtB_upuPa.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 12/13] drm/i915: fix DP get_hw_state return value
On Thu, Apr 4, 2013 at 2:12 AM, Jesse Barnes jbar...@virtuousgeek.org wrote: On Thu, 4 Apr 2013 01:15:28 +0200 Daniel Vetter dan...@ffwll.ch wrote: On Tue, Apr 02, 2013 at 08:11:05PM +0200, Daniel Vetter wrote: On Tue, Apr 02, 2013 at 10:03:56AM -0700, Jesse Barnes wrote: If we couldn't find a pipe we shouldn't return true. This might be even better as a WARN though, since it should be impossible to have the port enabled without a pipe selected. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org These two fixes are merged for -next, thanks. Actually this one here is broken, so I've had to revert it. What failed? How is it possible we'd have a DP port without a pipe? Every pattern in the register field should correspond to a pipe right? Review failed on my side - you've changed the return which is used by all the success cases ... There's another return for one failure case, and the no-pipe one just falls through. The only case this patch did _not_ break is pch ports on cpt/ppt. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: GFDT support for SNB/IVB
On Wed, Mar 06, 2013 at 04:28:09PM +, Chris Wilson wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com Currently all scanout buffers must be uncached because the display controller doesn't snoop the LLC. SNB introduced another method to guarantee coherency for the display controller. It's called the GFDT or graphics data type. Pages that have the GFDT bit enabled in their PTEs get flushed all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is issued with the synchronize GFDT bit set. So rather than making all scanout buffers uncached, set the GFDT bit in their PTEs, and modify the ring flush functions to enable the synchronize GFDT bit. On HSW the GFDT bit was removed from the PTE, and it's only present in surface state, so we can't really set it from the kernel. Also the docs state that the hardware isn't actually guaranteed to respect the GFDT bit. So it looks like GFDT isn't all that useful on HSW. So far I've tried this very quickly on an IVB machine, and it seems to be working as advertised. No idea if it does any good though. On an i5-2520m (laptop) running gnome-shell at 1366x768: padman 140.78 - 145.98 fps openarena 183.72 - 186.87 fps gtkperf ComboBoxEntry 20.27 - 22.14s gtkperf pixbuf 1.12 - 1.47s x11perf -aa10text 13.40 - 13.20 Mglyphs which are well within the throttling noise. v2 [ickle]: adapt to comply with existing userspace guarantees Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Oops, this one here somehow fell a bit through the cracks. Two bikesheds and one real issue: - s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect more fun to come here ;-) - ring-flush has a pile of arguments, I guess we should coalesce invalidate/flush and the new internal into a simple flags array. I don't expect that we'll every switch invalidate/flush back to domain arrays from the current it's just a bool, really usage. Also, the above should be done in separate prep patches imo. The real deal is flushing cpu access, since that probably does not set the gfdt flag. So cpu reads are fine and don't require any flushing, but cpu writes must be clflushed I think. In a way gfdt works as if the gpu is doing write-through caching (safe that we have to manually initiate the write-through with the gfdt flush). But from the pov of cpu access it's the same, and hence requires the same amount of clflushing. Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc for gfdt we still need to keep track of the gfdt_dirty state, but all the cpu side flushing business should be much clearer. Stumbled over this while checking whether sw_finish_ioctl would still do the right thing, and noticed that the clflush is a noop when gfdt is treated like the current CACHE_LLC. Cheers, Daniel PS: Ben Widawsky noticed that on ivb we set a stupid ppgtt pte caching override, which results in pte cachelines being tagged with gfdt. Hsw has an equivalent mode to allow the gpu to write through. I have no idea what's the point of this and we never write ptes from the gpu. So can you please throw the relevant patch on top to disable gfdt for ppgtt ptes in ECOCHK? -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: GFDT support for SNB/IVB
On Thu, Apr 04, 2013 at 11:01:02AM +0200, Daniel Vetter wrote: On Wed, Mar 06, 2013 at 04:28:09PM +, Chris Wilson wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com Currently all scanout buffers must be uncached because the display controller doesn't snoop the LLC. SNB introduced another method to guarantee coherency for the display controller. It's called the GFDT or graphics data type. Pages that have the GFDT bit enabled in their PTEs get flushed all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is issued with the synchronize GFDT bit set. So rather than making all scanout buffers uncached, set the GFDT bit in their PTEs, and modify the ring flush functions to enable the synchronize GFDT bit. On HSW the GFDT bit was removed from the PTE, and it's only present in surface state, so we can't really set it from the kernel. Also the docs state that the hardware isn't actually guaranteed to respect the GFDT bit. So it looks like GFDT isn't all that useful on HSW. So far I've tried this very quickly on an IVB machine, and it seems to be working as advertised. No idea if it does any good though. On an i5-2520m (laptop) running gnome-shell at 1366x768: padman140.78 - 145.98 fps openarena 183.72 - 186.87 fps gtkperf ComboBoxEntry 20.27 - 22.14s gtkperf pixbuf 1.12 - 1.47s x11perf -aa10text 13.40 - 13.20 Mglyphs which are well within the throttling noise. v2 [ickle]: adapt to comply with existing userspace guarantees Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Oops, this one here somehow fell a bit through the cracks. Two bikesheds and one real issue: - s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect more fun to come here ;-) - ring-flush has a pile of arguments, I guess we should coalesce invalidate/flush and the new internal into a simple flags array. I don't expect that we'll every switch invalidate/flush back to domain arrays from the current it's just a bool, really usage. Also, the above should be done in separate prep patches imo. The real deal is flushing cpu access, since that probably does not set the gfdt flag. So cpu reads are fine and don't require any flushing, but cpu writes must be clflushed I think. In a way gfdt works as if the gpu is doing write-through caching (safe that we have to manually initiate the write-through with the gfdt flush). But from the pov of cpu access it's the same, and hence requires the same amount of clflushing. Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc for gfdt we still need to keep track of the gfdt_dirty state, but all the cpu side flushing business should be much clearer. Stumbled over this while checking whether sw_finish_ioctl would still do the right thing, and noticed that the clflush is a noop when gfdt is treated like the current CACHE_LLC. My original patch had a change to clflushing where pinned objects w/ gfdt were also flushed. I didn't really read v2 well enough to figure out why that got dropped. -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Don't override PPGTT cacheability on HSW
On Wed, Apr 03, 2013 at 10:08:26PM +0200, Daniel Vetter wrote: On Wed, Apr 3, 2013 at 9:33 PM, Daniel Vetter dan...@ffwll.ch wrote: So I've checked hsw bspec and the problem is that hw guys again changed the bits around a bit, and I think on HSW we actually want (0x8 3) instead of what's currently there. Meh, I've screwed up reading the tables, 0x3 3 is what we imo want, so nothing needs to be changed. Sorry for the confusion. Shouldn't it be (13) on IVB (for just LLC w/o GFDT), and (33) on the rest? Also what about the GAC_ECO_BITS register? BSpec tells me it exists on IVB and HSW as well. It also seems to have a bit very similar to ECOCHK_SNB_BIT but we don't actually set it on SNB. -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/3] drm/i915: Add ECOBITS_SNB_BIT
From: Ville Syrjälä ville.syrj...@linux.intel.com GAC_ECO_BITS has a bit similar to GAM_ECOCHK's ECOCHK_SNB_BIT. Add the define, and enable it on SNB. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_gtt.c | 3 ++- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 24a23b3..593137b 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -312,7 +312,8 @@ void i915_gem_init_ppgtt(struct drm_device *dev) uint32_t ecochk, gab_ctl, ecobits; ecobits = I915_READ(GAC_ECO_BITS); - I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_PPGTT_CACHE64B); + I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_SNB_BIT | +ECOBITS_PPGTT_CACHE64B); gab_ctl = I915_READ(GAB_CTL); I915_WRITE(GAB_CTL, gab_ctl | GAB_CTL_CONT_AFTER_PAGEFAULT); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 058686c..4b8fd4d 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -127,6 +127,7 @@ #define ECOCHK_PPGTT_CACHE4B (0x03) #define GAC_ECO_BITS 0x14090 +#define ECOBITS_SNB_BIT (113) #define ECOBITS_PPGTT_CACHE64B (38) #define ECOBITS_PPGTT_CACHE4B(08) -- 1.8.1.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/3] drm/i915: Configure GAM_ECOCHK appropriatly for Gen7
From: Ville Syrjälä ville.syrj...@linux.intel.com IVB and HSW use different encodings for the PPGTT cacheability bits in the GAM_ECOCHK register. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +-- drivers/gpu/drm/i915/i915_reg.h | 5 + 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index a851362..4d86fe4 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -323,12 +323,19 @@ void i915_gem_init_ppgtt(struct drm_device *dev) ECOCHK_PPGTT_CACHE64B); I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE)); } else if (INTEL_INFO(dev)-gen = 7) { - uint32_t ecobits; + uint32_t ecochk, ecobits; ecobits = I915_READ(GAC_ECO_BITS); I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_PPGTT_CACHE64B); - I915_WRITE(GAM_ECOCHK, ECOCHK_PPGTT_CACHE64B); + ecochk = I915_READ(GAM_ECOCHK); + if (IS_HASWELL(dev)) { + ecochk |= ECOCHK_PPGTT_WB_HSW; + } else { + ecochk |= ECOCHK_PPGTT_LLC_IVB; + ecochk = ~ECOCHK_PPGTT_GFDT_IVB; + } + I915_WRITE(GAM_ECOCHK, ecochk); /* GFX_MODE is per-ring on gen7+ */ } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 4b8fd4d..44051fa 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -125,6 +125,11 @@ #define HSW_ECOCHK_ARB_PRIO_SOL (16) #define ECOCHK_PPGTT_CACHE64B(0x33) #define ECOCHK_PPGTT_CACHE4B (0x03) +#define ECOCHK_PPGTT_GFDT_IVB(0x14) +#define ECOCHK_PPGTT_LLC_IVB (0x13) +#define ECOCHK_PPGTT_UC_HSW (0x13) +#define ECOCHK_PPGTT_WT_HSW (0x23) +#define ECOCHK_PPGTT_WB_HSW (0x33) #define GAC_ECO_BITS 0x14090 #define ECOBITS_SNB_BIT (113) -- 1.8.1.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: GFDT support for SNB/IVB
On Thu, Apr 04, 2013 at 02:27:12PM +0300, Ville Syrjälä wrote: On Thu, Apr 04, 2013 at 11:01:02AM +0200, Daniel Vetter wrote: On Wed, Mar 06, 2013 at 04:28:09PM +, Chris Wilson wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com Currently all scanout buffers must be uncached because the display controller doesn't snoop the LLC. SNB introduced another method to guarantee coherency for the display controller. It's called the GFDT or graphics data type. Pages that have the GFDT bit enabled in their PTEs get flushed all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is issued with the synchronize GFDT bit set. So rather than making all scanout buffers uncached, set the GFDT bit in their PTEs, and modify the ring flush functions to enable the synchronize GFDT bit. On HSW the GFDT bit was removed from the PTE, and it's only present in surface state, so we can't really set it from the kernel. Also the docs state that the hardware isn't actually guaranteed to respect the GFDT bit. So it looks like GFDT isn't all that useful on HSW. So far I've tried this very quickly on an IVB machine, and it seems to be working as advertised. No idea if it does any good though. On an i5-2520m (laptop) running gnome-shell at 1366x768: padman 140.78 - 145.98 fps openarena 183.72 - 186.87 fps gtkperf ComboBoxEntry 20.27 - 22.14s gtkperf pixbuf 1.12 - 1.47s x11perf -aa10text 13.40 - 13.20 Mglyphs which are well within the throttling noise. v2 [ickle]: adapt to comply with existing userspace guarantees Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Oops, this one here somehow fell a bit through the cracks. Two bikesheds and one real issue: - s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect more fun to come here ;-) - ring-flush has a pile of arguments, I guess we should coalesce invalidate/flush and the new internal into a simple flags array. I don't expect that we'll every switch invalidate/flush back to domain arrays from the current it's just a bool, really usage. Also, the above should be done in separate prep patches imo. The real deal is flushing cpu access, since that probably does not set the gfdt flag. So cpu reads are fine and don't require any flushing, but cpu writes must be clflushed I think. In a way gfdt works as if the gpu is doing write-through caching (safe that we have to manually initiate the write-through with the gfdt flush). But from the pov of cpu access it's the same, and hence requires the same amount of clflushing. Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc for gfdt we still need to keep track of the gfdt_dirty state, but all the cpu side flushing business should be much clearer. Stumbled over this while checking whether sw_finish_ioctl would still do the right thing, and noticed that the clflush is a noop when gfdt is treated like the current CACHE_LLC. My original patch had a change to clflushing where pinned objects w/ gfdt were also flushed. I didn't really read v2 well enough to figure out why that got dropped. Because it was the wrong approach. The API where the scanout is to be made coherent with CPU mmaps is sw_finish. The only user of that is early pre-gen5 DDX. So imo the original patch was abusing a generic flag and applying the flushes at the wrong points, and I simply didn't care about badly designed API such as sw_finish. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: revert eDP bpp clamping code changes
The behaviour around handling the eDP bpp value from vbt has been slightly changed in commit 3600836585e3fdef0a1410d63fe5ce4015007aac Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:59 2013 +0100 drm/i915: convert DP autodither code to new infrastructure The old behaviour was that we used the plane's bpp (usually 24bpp) for computing the dp link bw, but set up the pipe with the bpp value from vbt if available. The takes the vbt bpp override into account even for the dp link bw configuration. On Paulo's hsw machine this resulted in a slower link clock and a black screen - but the mode actually /should/ fit even with the lower clock. Until we've cleared up simply stay bug-for-bug compatible with the old code. While at it, also restore a debug message lost in: commit 4e53c2e010e531b4a014692199e978482d471c7e Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:58 2013 +0100 drm/i915: precompute pipe bpp before touching the hw Cc: Paulo Zanoni przan...@gmail.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_dp.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index c5cfec3..658d071 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -741,9 +741,6 @@ intel_dp_compute_config(struct intel_encoder *encoder, /* Walk through all bpp values. Luckily they're all nicely spaced with 2 * bpc in between. */ bpp = min_t(int, 8*3, pipe_config-pipe_bpp); - if (is_edp(intel_dp) dev_priv-edp.bpp) - bpp = min_t(int, bpp, dev_priv-edp.bpp); - for (; bpp = 6*3; bpp -= 2*3) { mode_rate = intel_dp_link_required(target_clock, bpp); @@ -781,7 +778,6 @@ found: intel_dp-link_bw = bws[clock]; intel_dp-lane_count = lane_count; adjusted_mode-clock = drm_dp_bw_code_to_link_rate(intel_dp-link_bw); - pipe_config-pipe_bpp = bpp; pipe_config-pixel_target_clock = target_clock; DRM_DEBUG_KMS(DP link bw %02x lane count %d clock %d bpp %d\n, @@ -796,6 +792,20 @@ found: intel_dp_set_clock(encoder, pipe_config, intel_dp-link_bw); + /* +* XXX: We have a strange regression where using the vbt edp bpp value +* for the link bw computation results in black screens, the panel only +* works when we do the computation at the usual 24bpp (but still +* requires us to use 18bpp. Until that's fully debugged, stay +* bug-for-bug compatible with the old code. +*/ + if (is_edp(intel_dp) dev_priv-edp.bpp) { + DRM_DEBUG_KMS(clamping display bpc (was %d) to eDP (%d)\n, + bpp, dev_priv-edp.bpp); + bpp = min_t(int, bpp, dev_priv-edp.bpp); + } + pipe_config-pipe_bpp = bpp; + return true; } -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 04/16] drm/i915: pass seqno to i915_hangcheck_ring_idle
In preparation for next commit, pass seqno as a parameter to i915_hangcheck_ring_idle as it will be used inside i915_hangcheck_elapsed. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4c5bdd0..45c83fb 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1846,11 +1846,11 @@ ring_last_seqno(struct intel_ring_buffer *ring) struct drm_i915_gem_request, list)-seqno; } -static bool i915_hangcheck_ring_idle(struct intel_ring_buffer *ring, bool *err) +static bool i915_hangcheck_ring_idle(struct intel_ring_buffer *ring, +u32 ring_seqno, bool *err) { if (list_empty(ring-request_list) || - i915_seqno_passed(ring-get_seqno(ring, false), - ring_last_seqno(ring))) { + i915_seqno_passed(ring_seqno, ring_last_seqno(ring))) { /* Issue a wake-up to catch stuck h/w. */ if (waitqueue_active(ring-irq_queue)) { DRM_ERROR(Hangcheck timer elapsed... %s idle\n, @@ -1967,7 +1967,10 @@ void i915_hangcheck_elapsed(unsigned long data) memset(acthd, 0, sizeof(acthd)); idle = true; for_each_ring(ring, dev_priv, i) { - idle = i915_hangcheck_ring_idle(ring, err); + u32 seqno; + + seqno = ring-get_seqno(ring, false); + idle = i915_hangcheck_ring_idle(ring, seqno, err); acthd[i] = intel_ring_get_active_head(ring); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 03/16] drm/i915: reference count for i915_hw_contexts
In preparation to do analysis of which context was guilty of gpu hung, store kreffed context pointer into request struct. This allows us to inspect contexts when gpu is reset even if those contexts would already be released by userspace. v2: track i915_hw_context pointers instead of using ctx_ids (from Chris Wilson) v3 (Ben): Get rid of do_release() and handle refcounting more compactly. (recommended by Chis) v4: kref_* put inside static inlines (Daniel Vetter) remove code duplication on freeing context (Chris Wilson) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com (v2) Signed-off-by: Ben Widawsky b...@bwidawsk.net (v3) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com (v4) --- drivers/gpu/drm/i915/i915_drv.h| 20 ++-- drivers/gpu/drm/i915/i915_gem.c| 27 --- drivers/gpu/drm/i915/i915_gem_context.c| 19 ++- drivers/gpu/drm/i915/i915_gem_execbuffer.c |7 --- 4 files changed, 56 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e5e2083..3c85c1b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -456,6 +456,7 @@ struct i915_hw_ppgtt { /* This must match up with the value previously used for execbuf2.rsvd1. */ #define DEFAULT_CONTEXT_ID 0 struct i915_hw_context { + struct kref ref; int id; bool is_initialized; struct drm_i915_file_private *file_priv; @@ -1260,6 +1261,9 @@ struct drm_i915_gem_request { /** Postion in the ringbuffer of the end of the request */ u32 tail; + /** Context related to this request */ + struct i915_hw_context *ctx; + /** Time at which this request was emitted, in jiffies. */ unsigned long emitted_jiffies; @@ -1646,9 +1650,10 @@ int __must_check i915_gpu_idle(struct drm_device *dev); int __must_check i915_gem_idle(struct drm_device *dev); int _i915_add_request(struct intel_ring_buffer *ring, u32 *seqno, - struct drm_file *file); + struct drm_file *file, + struct i915_hw_context *ctx); #define i915_add_request(ring, seqno) \ - _i915_add_request(ring, seqno, NULL) + _i915_add_request(ring, seqno, NULL, NULL) int __must_check i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno); int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf); @@ -1692,6 +1697,17 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); struct i915_hw_context * __must_check i915_switch_context(struct intel_ring_buffer *ring, struct drm_file *file, int to_id); +void i915_gem_context_free(struct kref *ctx_ref); +static inline void i915_gem_context_reference(struct i915_hw_context *ctx) +{ + kref_get(ctx-ref); +} + +static inline void i915_gem_context_unreference(struct i915_hw_context *ctx) +{ + kref_put(ctx-ref, i915_gem_context_free); +} + int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d3ce0a7..f586f9c4 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2001,7 +2001,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) int _i915_add_request(struct intel_ring_buffer *ring, u32 *out_seqno, - struct drm_file *file) + struct drm_file *file, + struct i915_hw_context *ctx) { drm_i915_private_t *dev_priv = ring-dev-dev_private; struct drm_i915_gem_request *request; @@ -2041,6 +2042,11 @@ int _i915_add_request(struct intel_ring_buffer *ring, request-seqno = intel_ring_get_seqno(ring); request-ring = ring; request-tail = request_ring_position; + request-ctx = ctx; + + if (request-ctx) + i915_gem_context_reference(ctx); + request-emitted_jiffies = jiffies; was_empty = list_empty(ring-request_list); list_add_tail(request-list, ring-request_list); @@ -2093,6 +2099,17 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) spin_unlock(file_priv-mm.lock); } +static void i915_gem_free_request(struct drm_i915_gem_request *request) +{ + list_del(request-list); + i915_gem_request_remove_from_client(request); + + if (request-ctx) + i915_gem_context_unreference(request-ctx); + + kfree(request); +} + static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, struct intel_ring_buffer *ring) { @@ -2103,9 +2120,7 @@ static void
[Intel-gfx] [PATCH v3 01/16] drm/i915: return context from i915_switch_context()
In preparation for the next commit, return context that was switched to from i915_switch_context(). v2: context in return value instead of param. (Ben Widawsky) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h|5 +++-- drivers/gpu/drm/i915/i915_gem.c|8 --- drivers/gpu/drm/i915/i915_gem_context.c| 32 +--- drivers/gpu/drm/i915/i915_gem_execbuffer.c |7 -- 4 files changed, 32 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 44fca0b..630982b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1687,8 +1687,9 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, void i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); -int i915_switch_context(struct intel_ring_buffer *ring, - struct drm_file *file, int to_id); +struct i915_hw_context * __must_check +i915_switch_context(struct intel_ring_buffer *ring, + struct drm_file *file, int to_id); int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 911bd40..db804cc 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2522,9 +2522,11 @@ int i915_gpu_idle(struct drm_device *dev) /* Flush everything onto the inactive list. */ for_each_ring(ring, dev_priv, i) { - ret = i915_switch_context(ring, NULL, DEFAULT_CONTEXT_ID); - if (ret) - return ret; + struct i915_hw_context *ctx; + + ctx = i915_switch_context(ring, NULL, DEFAULT_CONTEXT_ID); + if (IS_ERR(ctx)) + return PTR_ERR(ctx); ret = intel_ring_idle(ring); if (ret) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 94d873a..ddf26a6 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -417,41 +417,47 @@ static int do_switch(struct i915_hw_context *to) /** * i915_switch_context() - perform a GPU context switch. * @ring: ring for which we'll execute the context switch - * @file_priv: file_priv associated with the context, may be NULL - * @id: context id number - * @seqno: sequence number by which the new context will be switched to - * @flags: + * @file: file associated with the context, may be NULL + * @to_id: context id number + * @return: context that was switched to or ERR_PTR on error * * The context life cycle is simple. The context refcount is incremented and * decremented by 1 and create and destroy. If the context is in use by the GPU, * it will have a refoucnt 1. This allows us to destroy the context abstract * object while letting the normal object tracking destroy the backing BO. */ -int i915_switch_context(struct intel_ring_buffer *ring, - struct drm_file *file, - int to_id) + +struct i915_hw_context * +i915_switch_context(struct intel_ring_buffer *ring, + struct drm_file *file, + int to_id) { struct drm_i915_private *dev_priv = ring-dev-dev_private; - struct i915_hw_context *to; + struct i915_hw_context *to = NULL; + int ret = 0; if (dev_priv-hw_contexts_disabled) - return 0; + return NULL; if (ring != dev_priv-ring[RCS]) - return 0; + return NULL; if (to_id == DEFAULT_CONTEXT_ID) { to = ring-default_context; } else { if (file == NULL) - return -EINVAL; + return ERR_PTR(-EINVAL); to = i915_gem_context_get(file-driver_priv, to_id); if (to == NULL) - return -ENOENT; + return ERR_PTR(-ENOENT); } - return do_switch(to); + ret = do_switch(to); + if (ret) + return ERR_PTR(ret); + + return to; } int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a96b6a3..e0686ca 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -841,6 +841,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_i915_gem_object *batch_obj; struct drm_clip_rect *cliprects = NULL; struct
[Intel-gfx] [PATCH v3 08/16] drm/i915: remove i915_hangcheck_hung
Rework of per ring hangcheck made this obsolete. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h |1 - drivers/gpu/drm/i915/i915_irq.c | 21 - 2 files changed, 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0404116..5ccfc6c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -815,7 +815,6 @@ struct i915_gpu_error { #define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */ #define DRM_I915_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD) struct timer_list hangcheck_timer; - int hangcheck_count; /* For reset and error_state handling. */ spinlock_t lock; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index a41ab2d..df40bb2 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1929,27 +1929,6 @@ static bool i915_hangcheck_ring_hung(struct intel_ring_buffer *ring) return !kick_ring(ring); } -static bool i915_hangcheck_hung(struct drm_device *dev) -{ - drm_i915_private_t *dev_priv = dev-dev_private; - - if (dev_priv-gpu_error.hangcheck_count++ 1) { - bool hung = true; - struct intel_ring_buffer *ring; - int i; - - DRM_ERROR(Hangcheck timer elapsed... GPU hung\n); - i915_handle_error(dev, true); - - for_each_ring(ring, dev_priv, i) - hung = i915_hangcheck_ring_hung(ring); - - return hung; - } - - return false; -} - /** * This is called when the chip hasn't reported back with completed * batchbuffers in a long time. The first time this is called we simply record -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 05/16] drm/i915: track ring progression using seqnos
Instead of relying in acthd, track ring seqno progression to detect if ring has hung. v2: put hangcheck stuff inside struct (Chris Wilson) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h |2 -- drivers/gpu/drm/i915/i915_irq.c | 30 +- drivers/gpu/drm/i915/intel_ringbuffer.h |6 ++ 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3c85c1b..0404116 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -816,8 +816,6 @@ struct i915_gpu_error { #define DRM_I915_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD) struct timer_list hangcheck_timer; int hangcheck_count; - uint32_t last_acthd[I915_NUM_RINGS]; - uint32_t prev_instdone[I915_NUM_INSTDONE_REG]; /* For reset and error_state handling. */ spinlock_t lock; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 45c83fb..0363871 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1956,22 +1956,19 @@ void i915_hangcheck_elapsed(unsigned long data) { struct drm_device *dev = (struct drm_device *)data; drm_i915_private_t *dev_priv = dev-dev_private; - uint32_t acthd[I915_NUM_RINGS], instdone[I915_NUM_INSTDONE_REG]; struct intel_ring_buffer *ring; bool err = false, idle; int i; + u32 seqno[I915_NUM_RINGS]; + bool work_done; if (!i915_enable_hangcheck) return; - memset(acthd, 0, sizeof(acthd)); idle = true; for_each_ring(ring, dev_priv, i) { - u32 seqno; - - seqno = ring-get_seqno(ring, false); - idle = i915_hangcheck_ring_idle(ring, seqno, err); - acthd[i] = intel_ring_get_active_head(ring); + seqno[i] = ring-get_seqno(ring, false); + idle = i915_hangcheck_ring_idle(ring, seqno[i], err); } /* If all work is done then ACTHD clearly hasn't advanced. */ @@ -1987,20 +1984,19 @@ void i915_hangcheck_elapsed(unsigned long data) return; } - i915_get_extra_instdone(dev, instdone); - if (memcmp(dev_priv-gpu_error.last_acthd, acthd, - sizeof(acthd)) == 0 - memcmp(dev_priv-gpu_error.prev_instdone, instdone, - sizeof(instdone)) == 0) { + work_done = false; + for_each_ring(ring, dev_priv, i) { + if (ring-hangcheck.seqno != seqno[i]) { + work_done = true; + ring-hangcheck.seqno = seqno[i]; + } + } + + if (!work_done) { if (i915_hangcheck_hung(dev)) return; } else { dev_priv-gpu_error.hangcheck_count = 0; - - memcpy(dev_priv-gpu_error.last_acthd, acthd, - sizeof(acthd)); - memcpy(dev_priv-gpu_error.prev_instdone, instdone, - sizeof(instdone)); } repeat: diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index d66208c..844381e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -37,6 +37,10 @@ struct intel_hw_status_page { #define I915_READ_SYNC_0(ring) I915_READ(RING_SYNC_0((ring)-mmio_base)) #define I915_READ_SYNC_1(ring) I915_READ(RING_SYNC_1((ring)-mmio_base)) +struct intel_ring_hangcheck { + u32 seqno; +}; + struct intel_ring_buffer { const char *name; enum intel_ring_id { @@ -137,6 +141,8 @@ struct intel_ring_buffer { struct i915_hw_context *default_context; struct drm_i915_gem_object *last_context_obj; + struct intel_ring_hangcheck hangcheck; + void *private; }; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 07/16] drm/i915: detect hang using per ring hangcheck_score
Add per ring score of possible culprit for gpu hang. If ring is busy and not waiting, it will get the highest score across calls to i915_hangcheck_elapsed. This way we are most likely to find the ring that caused the hang among the waiting ones. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 63 --- drivers/gpu/drm/i915/intel_ringbuffer.h |1 + 2 files changed, 34 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7342a96..a41ab2d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -413,7 +413,6 @@ static void notify_ring(struct drm_device *dev, wake_up_all(ring-irq_queue); if (i915_enable_hangcheck) { - dev_priv-gpu_error.hangcheck_count = 0; mod_timer(dev_priv-gpu_error.hangcheck_timer, round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); } @@ -1962,52 +1961,56 @@ void i915_hangcheck_elapsed(unsigned long data) struct drm_device *dev = (struct drm_device *)data; drm_i915_private_t *dev_priv = dev-dev_private; struct intel_ring_buffer *ring; - bool err = false, idle; int i; - u32 seqno[I915_NUM_RINGS]; - bool work_done; + int busy_count = 0, rings_hung = 0; if (!i915_enable_hangcheck) return; - idle = true; for_each_ring(ring, dev_priv, i) { - seqno[i] = ring-get_seqno(ring, false); - idle = i915_hangcheck_ring_idle(ring, seqno[i], err); - } + u32 seqno; + bool idle, err = false; + + seqno = ring-get_seqno(ring, false); + idle = i915_hangcheck_ring_idle(ring, seqno, err); - /* If all work is done then ACTHD clearly hasn't advanced. */ - if (idle) { - if (err) { - if (i915_hangcheck_hung(dev)) - return; + if (idle) { + if (err) + ring-hangcheck.score++; + else + ring-hangcheck.score = 0; + } else { + busy_count++; - goto repeat; + if (ring-hangcheck.seqno == seqno) { + ring-hangcheck.score++; + + /* Kick ring */ + i915_hangcheck_ring_hung(ring); + } else { + ring-hangcheck.score = 0; + } } - dev_priv-gpu_error.hangcheck_count = 0; - return; + ring-hangcheck.seqno = seqno; } - work_done = false; for_each_ring(ring, dev_priv, i) { - if (ring-hangcheck.seqno != seqno[i]) { - work_done = true; - ring-hangcheck.seqno = seqno[i]; + if (ring-hangcheck.score 1) { + rings_hung++; + DRM_ERROR(%s seems hung\n, ring-name); } } - if (!work_done) { - if (i915_hangcheck_hung(dev)) - return; - } else { - dev_priv-gpu_error.hangcheck_count = 0; - } + if (rings_hung) + return i915_handle_error(dev, true); -repeat: - /* Reset timer case chip hangs without another request being added */ - mod_timer(dev_priv-gpu_error.hangcheck_timer, - round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES)); + if (busy_count) + /* Reset timer case chip hangs without another request +* being added */ + mod_timer(dev_priv-gpu_error.hangcheck_timer, + round_jiffies_up(jiffies + + DRM_I915_HANGCHECK_JIFFIES)); } /* drm_dma.h hooks diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 844381e..503e913 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -39,6 +39,7 @@ struct intel_hw_status_page { struct intel_ring_hangcheck { u32 seqno; + int score; }; struct intel_ring_buffer { -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 15/16] drm/i915: add i915_reset_count
Use reset_counter to track how many actual resets have been done. Reset in progress is enough to increment the counter. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h |9 +++-- drivers/gpu/drm/i915/i915_irq.c |2 +- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 30ba79c..e682077 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -867,7 +867,7 @@ struct i915_gpu_error { * being true. */ #define I915_RESET_IN_PROGRESS_FLAG1 -#define I915_WEDGED0x +#define I915_WEDGED(1 31) /** * Waitqueue to signal when the reset has completed. Used by clients @@ -1651,7 +1651,12 @@ static inline bool i915_reset_in_progress(struct i915_gpu_error *error) static inline bool i915_terminally_wedged(struct i915_gpu_error *error) { - return atomic_read(error-reset_counter) == I915_WEDGED; + return atomic_read(error-reset_counter) I915_WEDGED; +} + +static inline u32 i915_reset_count(struct i915_gpu_error *error) +{ + return ((atomic_read(error-reset_counter) (~I915_WEDGED)) + 1) / 2; } void i915_gem_reset(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5acc46e..e7358de 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -987,7 +987,7 @@ static void i915_error_work_func(struct work_struct *work) kobject_uevent_env(dev-primary-kdev.kobj, KOBJ_CHANGE, reset_done_event); } else { - atomic_set(error-reset_counter, I915_WEDGED); + atomic_set_mask(I915_WEDGED, error-reset_counter); } for_each_ring(ring, dev_priv, i) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 12/16] drm/i915: mark rings which were waiting when hang happened
For guilty batchbuffer analysis later on on ring resets, mark all waiting rings so that we can skip them when trying to find a true culprit for the gpu hang. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c |3 ++- drivers/gpu/drm/i915/intel_ringbuffer.h |1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index df40bb2..5acc46e 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1965,7 +1965,8 @@ void i915_hangcheck_elapsed(unsigned long data) ring-hangcheck.score++; /* Kick ring */ - i915_hangcheck_ring_hung(ring); + ring-hangcheck.was_waiting = + !i915_hangcheck_ring_hung(ring); } else { ring-hangcheck.score = 0; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 503e913..49c71ff 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -40,6 +40,7 @@ struct intel_hw_status_page { struct intel_ring_hangcheck { u32 seqno; int score; + bool was_waiting; }; struct intel_ring_buffer { -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 13/16] drm/i915: find guilty batch buffer on ring resets
After hang check timer has declared gpu to be hang, rings are reset. In ring reset, when clearing request list, do post mortem analysis to find out the guilty batch buffer. Select requests for further analysis by inspecting the completed sequence number which has been updated into the HWS page. If request was completed, it can't be related to the hang. For noncompleted requests mark the batch as guilty if the ring was not waiting and the ring head was stuck inside the buffer object or in the flush region right after the batch. For everything else, mark them as innocents. v2: Fixed a typo in commit message (Ville Syrjälä) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_gem.c | 88 +++ 1 file changed, 88 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 547eaf5..475b6ad 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2103,6 +2103,85 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request) spin_unlock(file_priv-mm.lock); } +static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj) +{ + if (acthd = obj-gtt_offset + acthd obj-gtt_offset + obj-base.size) + return true; + + return false; +} + +static bool i915_head_inside_request(u32 acthd, u32 rs, u32 re) +{ + if (rs re) { + if (acthd = rs acthd re) + return true; + } else if (rs re) { + if (acthd = rs || acthd re) + return true; + } + + return false; +} + +static bool i915_request_guilty(struct drm_i915_gem_request *request, + const u32 acthd, bool *inside) +{ + if (request-batch_obj) { + if (i915_head_inside_object(acthd, request-batch_obj)) { + *inside = true; + return true; + } + } + + if (i915_head_inside_request(acthd, request-head, request-tail)) { + *inside = false; + return true; + } + + return false; +} + +static void i915_set_reset_status(struct intel_ring_buffer *ring, + struct drm_i915_gem_request *request, + u32 acthd) +{ + struct i915_ctx_hang_stats *hs = NULL; + bool inside, guilty; + + /* Innocent until proven guilty */ + guilty = false; + + if (!ring-hangcheck.was_waiting + i915_request_guilty(request, acthd, inside)) { + DRM_ERROR(%s hung %s bo (0x%x ctx %d) at 0x%x\n, + ring-name, + inside ? inside : flushing, + request-batch_obj ? + request-batch_obj-gtt_offset : 0, + request-ctx ? request-ctx-id : 0, + acthd); + + guilty = true; + } + + /* If contexts are disabled or this is the default context, use +* file_priv-reset_state +*/ + if (request-ctx request-ctx-id != DEFAULT_CONTEXT_ID) + hs = request-ctx-hang_stats; + else if (request-file_priv) + hs = request-file_priv-hang_stats; + + if (hs) { + if (guilty) + hs-batch_active++; + else + hs-batch_pending++; + } +} + static void i915_gem_free_request(struct drm_i915_gem_request *request) { list_del(request-list); @@ -2117,6 +2196,12 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request) static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, struct intel_ring_buffer *ring) { + u32 completed_seqno; + u32 acthd; + + acthd = intel_ring_get_active_head(ring); + completed_seqno = ring-get_seqno(ring, false); + while (!list_empty(ring-request_list)) { struct drm_i915_gem_request *request; @@ -2124,6 +2209,9 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, struct drm_i915_gem_request, list); + if (request-seqno completed_seqno) + i915_set_reset_status(ring, request, acthd); + i915_gem_free_request(request); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 14/16] drm/i915: refuse to submit more batchbuffers from guilty context
If context has recently submitted a faulty batchbuffers guilty of gpu hang and decides to keep submitting more crap, ban it permanently. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.c| 23 ++- drivers/gpu/drm/i915/i915_drv.h|7 +++ drivers/gpu/drm/i915/i915_gem.c|7 +-- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 + 4 files changed, 47 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a5b8aa9..0928f11 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -852,6 +852,8 @@ int intel_gpu_reset(struct drm_device *dev) int i915_reset(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev-dev_private; + struct i915_ctx_hang_stats *hs; + bool do_wedge = true; int ret; if (!i915_try_reset) @@ -859,10 +861,29 @@ int i915_reset(struct drm_device *dev) mutex_lock(dev-struct_mutex); + /* i915_gem_reset will set this if it finds guilty context */ + dev_priv-gpu_error.hang_stats = NULL; + i915_gem_reset(dev); + hs = dev_priv-gpu_error.hang_stats; + + if (hs) { + if (hs-batch_active == 1) { + do_wedge = false; + } else if (!hs-banned + get_seconds() - hs-batch_active_reset_ts 5) { + hs-banned = true; + do_wedge = false; + } + + hs-batch_active_reset_ts = get_seconds(); + } + + dev_priv-gpu_error.hang_stats = NULL; + ret = -ENODEV; - if (get_seconds() - dev_priv-gpu_error.last_reset 5) + if (do_wedge get_seconds() - dev_priv-gpu_error.last_reset 5) DRM_ERROR(GPU hanging too fast, declaring wedged!\n); else ret = intel_gpu_reset(dev); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8223908..30ba79c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -458,6 +458,12 @@ struct i915_ctx_hang_stats { /* This context had batch active when hang was declared */ unsigned batch_active; + + /* Time when this context was last blamed for a GPU reset */ + unsigned long batch_active_reset_ts; + + /* This context is banned to submit more work */ + bool banned; }; /* This must match up with the value previously used for execbuf2.rsvd1. */ @@ -831,6 +837,7 @@ struct i915_gpu_error { struct work_struct work; unsigned long last_reset; + struct i915_ctx_hang_stats *hang_stats; /** * State variable and reset counter controlling the reset flow diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 475b6ad..ca5c9c3 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2147,6 +2147,7 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, struct drm_i915_gem_request *request, u32 acthd) { + struct drm_i915_private *dev_priv = ring-dev-dev_private; struct i915_ctx_hang_stats *hs = NULL; bool inside, guilty; @@ -2175,10 +2176,12 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, hs = request-file_priv-hang_stats; if (hs) { - if (guilty) + if (guilty) { hs-batch_active++; - else + dev_priv-gpu_error.hang_stats = hs; + } else { hs-batch_pending++; + } } } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index bd1750a..f1b1ea9 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -844,6 +844,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_clip_rect *cliprects = NULL; struct intel_ring_buffer *ring; struct i915_hw_context *ctx; + struct i915_ctx_hang_stats *hs; u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 exec_start, exec_len; u32 mask, flags; @@ -1026,6 +1027,18 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) goto err; + hs = i915_gem_context_get_hang_stats(dev_priv-ring[RCS], +file, ctx_id); + if (IS_ERR(hs)) { + ret = PTR_ERR(hs); + goto err; + } + + if (hs-banned) { + ret = -EIO; + goto err; + } + ctx = i915_switch_context(ring, file, ctx_id); if (IS_ERR(ctx)) { ret = PTR_ERR(ctx); --
[Intel-gfx] [PATCH v3 11/16] drm/i915: add batch object and context to i915_add_request()
In order to track down a batch buffer and context which caused the ring to hang, store reference to bo and context into the request struct. Request can also cause gpu to hang after the batch in the flush section in the ring. To detect this add start of the flush portion offset into the request. Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h| 13 ++--- drivers/gpu/drm/i915/i915_gem.c|8 ++-- drivers/gpu/drm/i915/i915_gem_execbuffer.c |7 --- 3 files changed, 20 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3863bd4..8223908 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1263,9 +1263,15 @@ struct drm_i915_gem_request { /** GEM sequence number associated with this request. */ uint32_t seqno; - /** Postion in the ringbuffer of the end of the request */ + /** Position in the ringbuffer of the start of the request */ + u32 head; + + /** Position in the ringbuffer of the end of the request */ u32 tail; + /** Batch buffer related to this request if any */ + struct drm_i915_gem_object *batch_obj; + /** Context related to this request */ struct i915_hw_context *ctx; @@ -1658,9 +1664,10 @@ int __must_check i915_gem_idle(struct drm_device *dev); int _i915_add_request(struct intel_ring_buffer *ring, u32 *seqno, struct drm_file *file, - struct i915_hw_context *ctx); + struct i915_hw_context *ctx, + struct drm_i915_gem_object *batch_obj); #define i915_add_request(ring, seqno) \ - _i915_add_request(ring, seqno, NULL, NULL) + _i915_add_request(ring, seqno, NULL, NULL, NULL) int __must_check i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno); int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f586f9c4..547eaf5 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2002,14 +2002,16 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) int _i915_add_request(struct intel_ring_buffer *ring, u32 *out_seqno, struct drm_file *file, - struct i915_hw_context *ctx) + struct i915_hw_context *ctx, + struct drm_i915_gem_object *obj) { drm_i915_private_t *dev_priv = ring-dev-dev_private; struct drm_i915_gem_request *request; - u32 request_ring_position; + u32 request_ring_position, request_start; int was_empty; int ret; + request_start = intel_ring_get_tail(ring); /* * Emit any outstanding flushes - execbuf can fail to emit the flush * after having emitted the batchbuffer command. Hence we need to fix @@ -2041,7 +2043,9 @@ int _i915_add_request(struct intel_ring_buffer *ring, request-seqno = intel_ring_get_seqno(ring); request-ring = ring; + request-head = request_start; request-tail = request_ring_position; + request-batch_obj = obj; request-ctx = ctx; if (request-ctx) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 757b52d..bd1750a 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -797,13 +797,14 @@ static void i915_gem_execbuffer_retire_commands(struct drm_device *dev, struct drm_file *file, struct intel_ring_buffer *ring, - struct i915_hw_context *ctx) + struct i915_hw_context *ctx, + struct drm_i915_gem_object *obj) { /* Unconditionally force add_request to emit a full flush. */ ring-gpu_caches_dirty = true; /* Add a breadcrumb for the completion of the batch buffer */ - (void)_i915_add_request(ring, NULL, file, ctx); + (void)_i915_add_request(ring, NULL, file, ctx, obj); } static int @@ -1078,7 +1079,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags); i915_gem_execbuffer_move_to_active(eb-objects, ring); - i915_gem_execbuffer_retire_commands(dev, file, ring, ctx); + i915_gem_execbuffer_retire_commands(dev, file, ring, ctx, batch_obj); err: eb_destroy(eb); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 06/16] drm/i915: introduce i915_hangcheck_ring_hung
In preparation to track per ring progress in hangcheck, add i915_hangcheck_ring_hung. v2: omit dev parameter (Ben Widawsky) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 29 + 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 0363871..7342a96 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1917,28 +1917,33 @@ static bool kick_ring(struct intel_ring_buffer *ring) return false; } +static bool i915_hangcheck_ring_hung(struct intel_ring_buffer *ring) +{ + if (IS_GEN2(ring-dev)) + return false; + + /* Is the chip hanging on a WAIT_FOR_EVENT? +* If so we can simply poke the RB_WAIT bit +* and break the hang. This should work on +* all but the second generation chipsets. +*/ + return !kick_ring(ring); +} + static bool i915_hangcheck_hung(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev-dev_private; if (dev_priv-gpu_error.hangcheck_count++ 1) { bool hung = true; + struct intel_ring_buffer *ring; + int i; DRM_ERROR(Hangcheck timer elapsed... GPU hung\n); i915_handle_error(dev, true); - if (!IS_GEN2(dev)) { - struct intel_ring_buffer *ring; - int i; - - /* Is the chip hanging on a WAIT_FOR_EVENT? -* If so we can simply poke the RB_WAIT bit -* and break the hang. This should work on -* all but the second generation chipsets. -*/ - for_each_ring(ring, dev_priv, i) - hung = !kick_ring(ring); - } + for_each_ring(ring, dev_priv, i) + hung = i915_hangcheck_ring_hung(ring); return hung; } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 16/16] drm/i915: add i915_get_reset_stats_ioctl
This ioctl returns reset stats for specified context. The struct returned contains context loss counters. reset_count:all resets across all contexts batch_active: active batches lost on resets batch_pending: pending batches lost on resets v2: get rid of state tracking completely and deliver only counts. Idea from Chris Wilson. v3: fix commit message v4: default context handled inside i915_gem_contest_get_hang_stats v5: reset_count only for priviledged process v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com Cc: Ian Romanick i...@freedesktop.org Cc: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_dma.c |1 + drivers/gpu/drm/i915/i915_drv.c | 37 + drivers/gpu/drm/i915/i915_drv.h |2 ++ include/uapi/drm/i915_drm.h | 17 + 4 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 6693c7c..f4d7645 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1898,6 +1898,7 @@ struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED), DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED), + DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED), }; int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls); diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 0928f11..861f396 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1302,3 +1302,40 @@ int i915_reg_read_ioctl(struct drm_device *dev, return 0; } + +int i915_get_reset_stats_ioctl(struct drm_device *dev, + void *data, struct drm_file *file) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + struct intel_ring_buffer *ring; + struct drm_i915_reset_stats *args = data; + struct i915_ctx_hang_stats *hs; + int ret; + + if (args-ctx_id == 0 !capable(CAP_SYS_ADMIN)) + return -EPERM; + + ret = mutex_lock_interruptible(dev-struct_mutex); + if (ret) + return ret; + + ring = dev_priv-ring[RCS]; + + hs = i915_gem_context_get_hang_stats(ring, file, args-ctx_id); + if (IS_ERR_OR_NULL(hs)) { + mutex_unlock(dev-struct_mutex); + return hs == NULL ? -ENODEV : PTR_ERR(hs); + } + + if (capable(CAP_SYS_ADMIN)) + args-reset_count = i915_reset_count(dev_priv-gpu_error); + else + args-reset_count = 0; + + args-batch_active = hs-batch_active; + args-batch_pending = hs-batch_pending; + + mutex_unlock(dev-struct_mutex); + + return 0; +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e682077..23064f4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1896,6 +1896,8 @@ extern int intel_enable_rc6(const struct drm_device *dev); extern bool i915_semaphore_is_enabled(struct drm_device *dev); int i915_reg_read_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); /* overlay */ #ifdef CONFIG_DEBUG_FS diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 07d5941..75cafc1 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -198,6 +198,7 @@ typedef struct _drm_i915_sarea { #define DRM_I915_GEM_SET_CACHING 0x2f #define DRM_I915_GEM_GET_CACHING 0x30 #define DRM_I915_REG_READ 0x31 +#define DRM_I915_GET_RESET_STATS 0x32 #define DRM_IOCTL_I915_INITDRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t) #define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH) @@ -247,6 +248,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create) #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy) #define DRM_IOCTL_I915_REG_READDRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read) +#define DRM_IOCTL_I915_GET_RESET_STATS DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats) /* Allow drivers to submit batchbuffers directly to hardware, relying * on the security mechanisms provided by hardware. @@ -980,4 +982,19 @@ struct drm_i915_reg_read { __u64 offset;
[Intel-gfx] [PATCH v3 09/16] drm/i915: add struct i915_ctx_hang_stats
To count context losses, add struct i915_ctx_hang_stats for both i915_hw_context and drm_i915_file_private. drm_i915_file_private is used when there is no context. v2: renamed and cleaned up the struct (Chris Wilson, Ian Romanick) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_dma.c |2 +- drivers/gpu/drm/i915/i915_drv.h | 10 ++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 4be58e3..6693c7c 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1789,7 +1789,7 @@ int i915_driver_open(struct drm_device *dev, struct drm_file *file) struct drm_i915_file_private *file_priv; DRM_DEBUG_DRIVER(\n); - file_priv = kmalloc(sizeof(*file_priv), GFP_KERNEL); + file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL); if (!file_priv) return -ENOMEM; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 5ccfc6c..9bbcf0c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -452,6 +452,13 @@ struct i915_hw_ppgtt { void (*cleanup)(struct i915_hw_ppgtt *ppgtt); }; +struct i915_ctx_hang_stats { + /* This context had batch pending when hang was declared */ + unsigned batch_pending; + + /* This context had batch active when hang was declared */ + unsigned batch_active; +}; /* This must match up with the value previously used for execbuf2.rsvd1. */ #define DEFAULT_CONTEXT_ID 0 @@ -462,6 +469,7 @@ struct i915_hw_context { struct drm_i915_file_private *file_priv; struct intel_ring_buffer *ring; struct drm_i915_gem_object *obj; + struct i915_ctx_hang_stats hang_stats; }; enum no_fbc_reason { @@ -1278,6 +1286,8 @@ struct drm_i915_file_private { struct list_head request_list; } mm; struct idr context_idr; + + struct i915_ctx_hang_stats hang_stats; }; #define INTEL_INFO(dev)(((struct drm_i915_private *) (dev)-dev_private)-info) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 02/16] drm/i915: cleanup i915_add_request
Only execbuffer needs all the parameters. Cleanup everything else behind macro. v2: _i915_add_request as function name (Chris Wilson) Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_drv.h|8 +--- drivers/gpu/drm/i915/i915_gem.c| 11 +-- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/intel_overlay.c |4 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c|2 +- 5 files changed, 14 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 630982b..e5e2083 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1644,9 +1644,11 @@ void i915_gem_init_ppgtt(struct drm_device *dev); void i915_gem_cleanup_ringbuffer(struct drm_device *dev); int __must_check i915_gpu_idle(struct drm_device *dev); int __must_check i915_gem_idle(struct drm_device *dev); -int i915_add_request(struct intel_ring_buffer *ring, -struct drm_file *file, -u32 *seqno); +int _i915_add_request(struct intel_ring_buffer *ring, + u32 *seqno, + struct drm_file *file); +#define i915_add_request(ring, seqno) \ + _i915_add_request(ring, seqno, NULL) int __must_check i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno); int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index db804cc..d3ce0a7 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -959,7 +959,7 @@ i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno) ret = 0; if (seqno == ring-outstanding_lazy_request) - ret = i915_add_request(ring, NULL, NULL); + ret = i915_add_request(ring, NULL); return ret; } @@ -1999,10 +1999,9 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) return 0; } -int -i915_add_request(struct intel_ring_buffer *ring, -struct drm_file *file, -u32 *out_seqno) +int _i915_add_request(struct intel_ring_buffer *ring, + u32 *out_seqno, + struct drm_file *file) { drm_i915_private_t *dev_priv = ring-dev-dev_private; struct drm_i915_gem_request *request; @@ -2267,7 +2266,7 @@ i915_gem_retire_work_handler(struct work_struct *work) idle = true; for_each_ring(ring, dev_priv, i) { if (ring-gpu_caches_dirty) - i915_add_request(ring, NULL, NULL); + i915_add_request(ring, NULL); idle = list_empty(ring-request_list); } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index e0686ca..b413962 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -802,7 +802,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev, ring-gpu_caches_dirty = true; /* Add a breadcrumb for the completion of the batch buffer */ - (void)i915_add_request(ring, file, NULL); + (void)_i915_add_request(ring, NULL, file); } static int diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 67a2501..40e509c 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -217,7 +217,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay, int ret; BUG_ON(overlay-last_flip_req); - ret = i915_add_request(ring, NULL, overlay-last_flip_req); + ret = i915_add_request(ring, overlay-last_flip_req); if (ret) return ret; @@ -286,7 +286,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay, intel_ring_emit(ring, flip_addr); intel_ring_advance(ring); - return i915_add_request(ring, NULL, overlay-last_flip_req); + return i915_add_request(ring, overlay-last_flip_req); } static void intel_overlay_release_old_vid_tail(struct intel_overlay *overlay) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 1d5d613..9584bc1 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1423,7 +1423,7 @@ int intel_ring_idle(struct intel_ring_buffer *ring) /* We need to add any requests required to flush the objects and ring */ if (ring-outstanding_lazy_request) { - ret = i915_add_request(ring, NULL, NULL); + ret = i915_add_request(ring, NULL); if (ret) return ret; } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org
[Intel-gfx] [PATCH] drm/i915: tune down Y tiling scanout warning
Userspace can easily hit this and does since Ville added a new evil igt testcase in: commit 069e35e0fc3785faa562adcfd2dd7bbed4cb1dea Author: Ville Syrjälä ville.syrj...@linux.intel.com Date: Mon Mar 4 15:34:06 2013 +0200 kms_flip: Add flip-vs-bad-tiling test Cc: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_display.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 1b2c744..7ac7de6 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -1907,8 +1907,10 @@ intel_pin_and_fence_fb_obj(struct drm_device *dev, alignment = 0; break; case I915_TILING_Y: - /* FIXME: Is this true? */ - DRM_ERROR(Y tiled not allowed for scan out buffers\n); + /* Despite that we check this in framebuffer init userspace can +* scre us over and change the tiling after the fact. Only +* pinned buffers can't change their tiling. */ + DRM_DEBUG_DRIVER(Y tiled not allowed for scan out buffers\n); return -EINVAL; default: BUG(); -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] GPU Hang in Intel Driver
Will try mesa 8.0 or newer and report back. I had already bumped up libdrm and xf86-video-intel to the latest available (2.4.43 and 2.21.5 respectively). Thanks, Dihan On Wed, Apr 3, 2013 at 12:21 PM, Daniel Vetter daniel.vet...@ffwll.chwrote: On Wed, Apr 3, 2013 at 5:58 PM, Dihan Wickremasuriya dwickremasur...@rethinkrobotics.com wrote: Thanks for the quick response and adding the mailing list. I did in fact attach the error_state in my original mail. Please let me know if it's not the one you were looking for (re-attached). Oops, missed that one. You have a gen7 ivb machine, so the bug you've reference is _definitely_ not yours - that one's for gen2. Now looking at your error state I've noticed that your mesa is using a massively outdated version of 3DSTATE_HIER_DEPTH_BUFFER, which has been fixed with 8.0, the broken version shipped with 7.11. Please upgrade to the latest released versions of mesa, libdrm, xf86-video-intel, your bug is very likely fixed already (since 8.0 is pretty old for gen7). Yours, Daniel Also added a subject to the thread. Best regards, Dihan On Wed, Apr 3, 2013 at 11:46 AM, Daniel Vetter daniel.vet...@ffwll.chwrote: Hi all, Two things: - Please _always_ include a public mailing list when reporting bugs, your dear maintainer sometimes slacks off. - We need to see the error_state before we can assess what kind of hang you have (it's like gettting a SIGSEGV for a normal program, no two gpu hangs are the same ...). Cheers, Daniel On Wed, Apr 3, 2013 at 5:42 PM, Dihan Wickremasuriya dwickremasur...@rethinkrobotics.com wrote: Hi Chris/Daniel, This is Dihan from Rethink Robotics and we were hoping you could help with the GPU hang problem in the i915 driver mentioned in bug #26345: https://bugs.freedesktop.org/show_bug.cgi?id=26345 We are running into the same problem with the 3.8.5 kernel (which has the fix mentioned in comment #153 of the bug report) when running a Qt 5 application in Gentoo. At times the entire X session would freeze. The x11-perf tests described in the bug report run without any issues though. Would you happen to know whether this is because of an issue in the driver that is not currently being addressed by the fix? I have attached the Xorg log, the dmesg output and i915_error_state from a hung session. Please let me know if you need any more info. Thanks in advance! Best regards, Dihan -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [v3.8 Regression] Merge tag 'v3.7-rc2' into drm-intel-next-queued
Hi Daniel, A bug was opened against the Ubuntu kernel[0]. After a kernel bisect, it was found the following was the first bad commit: commit c2fb7916927e989ea424e61ce5fe617e54878827 Merge: 29de6ce 6f0c058 Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Mon Oct 22 14:34:51 2012 +0200 Merge tag 'v3.7-rc2' into drm-intel-next-queued The regression was introduced as of v3.8-rc1. However, further testing also shows this bug is now fixed in v3.9-rc4. I see that you are the author of this patch, so I wanted see if I can get your feedback. Do you happen to have an idea what may have fixed this in v3.9-rc4, so we can send a request to stable, if not already done? Otherwise, I can perform a reverse bisect to see which commit fixed this. Thanks, Joe [0] http://pad.lv/1109309 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [v3.8 Regression] Merge tag 'v3.7-rc2' into drm-intel-next-queued
On 04/03/2013 03:16 PM, Daniel Vetter wrote: On Wed, Apr 3, 2013 at 9:08 PM, Joseph Salisbury joseph.salisb...@canonical.com mailto:joseph.salisb...@canonical.com wrote: Hi Daniel, A bug was opened against the Ubuntu kernel[0]. After a kernel bisect, it was found the following was the first bad commit: commit c2fb7916927e989ea424e61ce5fe617e54878827 Merge: 29de6ce 6f0c058 Author: Daniel Vetter daniel.vet...@ffwll.ch mailto:daniel.vet...@ffwll.ch Date: Mon Oct 22 14:34:51 2012 +0200 Merge tag 'v3.7-rc2' into drm-intel-next-queued The regression was introduced as of v3.8-rc1. However, further testing also shows this bug is now fixed in v3.9-rc4. I see that you are the author of this patch, so I wanted see if I can get your feedback. Do you happen to have an idea what may have fixed this in v3.9-rc4, so we can send a request to stable, if not already done? Otherwise, I can perform a reverse bisect to see which commit fixed this. So apparently it's an oops somewhere in the nouveau setup code, which bisected to a backmerge which has _only_ conflicts in drm/i915 driver code. I have no idea what blew up here, sorry. -Daniel Thanks for the info, Daniel. I'll go the reverse bisect route. Thanks, Joe [0] http://pad.lv/1109309 -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: tune down Y tiling scanout warning
On Thu, Apr 04, 2013 at 06:13:08PM +0200, Daniel Vetter wrote: Userspace can easily hit this and does since Ville added a new evil igt testcase in: Scre? :-p -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 12/13] drm/i915: fix DP get_hw_state return value
On Thu, 4 Apr 2013 09:56:07 +0200 Daniel Vetter dan...@ffwll.ch wrote: On Thu, Apr 4, 2013 at 2:12 AM, Jesse Barnes jbar...@virtuousgeek.org wrote: On Thu, 4 Apr 2013 01:15:28 +0200 Daniel Vetter dan...@ffwll.ch wrote: On Tue, Apr 02, 2013 at 08:11:05PM +0200, Daniel Vetter wrote: On Tue, Apr 02, 2013 at 10:03:56AM -0700, Jesse Barnes wrote: If we couldn't find a pipe we shouldn't return true. This might be even better as a WARN though, since it should be impossible to have the port enabled without a pipe selected. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org These two fixes are merged for -next, thanks. Actually this one here is broken, so I've had to revert it. What failed? How is it possible we'd have a DP port without a pipe? Every pattern in the register field should correspond to a pipe right? Review failed on my side - you've changed the return which is used by all the success cases ... There's another return for one failure case, and the no-pipe one just falls through. The only case this patch did _not_ break is pch ports on cpt/ppt. Testing fail on my part; I was testing PCH ports but not DP with the config I had. Jesse ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/4] drm: Add struct drm_rect and assorted utility functions
Hi Ville, Thanks for the patch. On Wednesday 27 March 2013 17:46:22 ville.syrj...@linux.intel.com wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com struct drm_rect represents a simple rectangle. The utility functions are there to help driver writers. v2: Moved the region stuff into its own file, made the smaller funcs static inline, used 64bit maths in the scaled clipping function to avoid overflows (instead it will saturate to INT_MIN or INT_MAX). v3: Renamed drm_region to drm_rect, drm_region_clip to drm_rect_intersect, and drm_region_subsample to drm_rect_downscale. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/Makefile | 3 +- drivers/gpu/drm/drm_rect.c | 96 + include/drm/drm_rect.h | 132 ++ 3 files changed, 230 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_rect.c create mode 100644 include/drm/drm_rect.h [snip] diff --git a/drivers/gpu/drm/drm_rect.c b/drivers/gpu/drm/drm_rect.c new file mode 100644 index 000..1ad4f5e --- /dev/null +++ b/drivers/gpu/drm/drm_rect.c @@ -0,0 +1,96 @@ [snip] +#include linux/errno.h +#include linux/export.h +#include linux/kernel.h +#include drm/drm_rect.h + +/** + * drm_rect_intersect - intersect two rectangles + * @r1: first rectangle + * @r2: second rectangle + * + * Calculate the intersection of rectangles @r1 and @r2. + * @r1 will be overwritten with the intersection. + * + * RETURNS: + * @true if rectangle @r1 is still visible after the operation, + * @false otherwise. Isn't @ used for function parameters only ? + */ +bool drm_rect_intersect(struct drm_rect *r1, const struct drm_rect *r2) +{ + r1-x1 = max(r1-x1, r2-x1); + r1-y1 = max(r1-y1, r2-y1); + r1-x2 = min(r1-x2, r2-x2); + r1-y2 = min(r1-y2, r2-y2); + + return drm_rect_visible(r1); Do callers always need that information, or should they instead call drm_rect_visible() explicitly when they need it ? +} +EXPORT_SYMBOL(drm_rect_intersect); + +/** + * drm_rect_clip_scaled - perform a scaled clip operation + * @src: source window rectangle + * @dst: destination window rectangle + * @clip: clip rectangle + * @hscale: horizontal scaling factor + * @vscale: vertical scaling factor + * + * Clip rectangle @dst by rectangle @clip. Clip rectangle @src by the + * same amounts multiplied by @hscale and @vscale. + * + * RETUTRNS: + * @true if rectangle @dst is still visible after being clipped, + * @false otherwise + */ +bool drm_rect_clip_scaled(struct drm_rect *src, struct drm_rect *dst, + const struct drm_rect *clip, + int hscale, int vscale) +{ + int diff; + + diff = clip-x1 - dst-x1; + if (diff 0) { + int64_t tmp = src-x1 + (int64_t) diff * hscale; + src-x1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = clip-y1 - dst-y1; + if (diff 0) { + int64_t tmp = src-y1 + (int64_t) diff * vscale; + src-y1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-x2 - clip-x2; + if (diff 0) { + int64_t tmp = src-x2 - (int64_t) diff * hscale; + src-x2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-y2 - clip-y2; + if (diff 0) { + int64_t tmp = src-y2 - (int64_t) diff * vscale; + src-y2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + + return drm_rect_intersect(dst, clip); +} +EXPORT_SYMBOL(drm_rect_clip_scaled); diff --git a/include/drm/drm_rect.h b/include/drm/drm_rect.h new file mode 100644 index 000..40b09a4 --- /dev/null +++ b/include/drm/drm_rect.h @@ -0,0 +1,132 @@ [snip] +/** + * drm_rect - two dimensional rect + * @x1: horizontal starting coordinate (inclusive) + * @x2: horizontal ending coordinate (exclusive) + * @y1: vertical starting coordinate (inclusive) + * @y2: vertical ending coordinate (exclusive) What's the rationale for making x2 and y2 exclusive ? + */ +struct drm_rect { + int x1, y1, x2, y2; +}; + +/** + * drm_rect_adjust_size - adjust the size of the rect + * @r: rect to be adjusted + * @x: horizontal adjustment + * @y: vertical adjustment What about renaming x and y to dx and dy ? It would make it more explicit that the adjustements are incremental and not absolute values. + * Change the size of rect @r by @x in the horizontal direction, + * and by @y in the vertical direction, while keeping the center + * of @r stationary. + * + * Positive @x and @y increase the size, negative values decrease it. + */ +static inline void drm_rect_adjust_size(struct drm_rect *r, int x, int y) +{ + r-x1 -= x 1; + r-y1 -= y 1; + r-x2 += (x + 1) 1; + r-y2 += (y + 1) 1; +} + +/** + * drm_rect_translate - translate the rect + * @r: rect to be
Re: [Intel-gfx] [PATCH] drm/i915: revert eDP bpp clamping code changes
Hi 2013/4/4 Daniel Vetter daniel.vet...@ffwll.ch The behaviour around handling the eDP bpp value from vbt has been slightly changed in commit 3600836585e3fdef0a1410d63fe5ce4015007aac Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:59 2013 +0100 drm/i915: convert DP autodither code to new infrastructure The old behaviour was that we used the plane's bpp (usually 24bpp) for computing the dp link bw, but set up the pipe with the bpp value from vbt if available. The takes the vbt bpp override into account even for s/The/This/ ? the dp link bw configuration. On Paulo's hsw machine this resulted in a slower link clock and a black screen - but the mode actually /should/ fit even with the lower clock. Until we've cleared up simply stay bug-for-bug compatible with the old code. While at it, also restore a debug message lost in: commit 4e53c2e010e531b4a014692199e978482d471c7e Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:58 2013 +0100 drm/i915: precompute pipe bpp before touching the hw Cc: Paulo Zanoni przan...@gmail.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_dp.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index c5cfec3..658d071 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -741,9 +741,6 @@ intel_dp_compute_config(struct intel_encoder *encoder, /* Walk through all bpp values. Luckily they're all nicely spaced with 2 * bpc in between. */ bpp = min_t(int, 8*3, pipe_config-pipe_bpp); - if (is_edp(intel_dp) dev_priv-edp.bpp) - bpp = min_t(int, bpp, dev_priv-edp.bpp); - for (; bpp = 6*3; bpp -= 2*3) { mode_rate = intel_dp_link_required(target_clock, bpp); @@ -781,7 +778,6 @@ found: intel_dp-link_bw = bws[clock]; intel_dp-lane_count = lane_count; adjusted_mode-clock = drm_dp_bw_code_to_link_rate(intel_dp-link_bw); - pipe_config-pipe_bpp = bpp; pipe_config-pixel_target_clock = target_clock; DRM_DEBUG_KMS(DP link bw %02x lane count %d clock %d bpp %d\n, @@ -796,6 +792,20 @@ found: intel_dp_set_clock(encoder, pipe_config, intel_dp-link_bw); + /* +* XXX: We have a strange regression where using the vbt edp bpp value +* for the link bw computation results in black screens, the panel only +* works when we do the computation at the usual 24bpp (but still +* requires us to use 18bpp. Until that's fully debugged, stay We're missing a ')' character. Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com Tested-by: Paulo Zanoni paulo.r.zan...@intel.com +* bug-for-bug compatible with the old code. +*/ + if (is_edp(intel_dp) dev_priv-edp.bpp) { + DRM_DEBUG_KMS(clamping display bpc (was %d) to eDP (%d)\n, + bpp, dev_priv-edp.bpp); + bpp = min_t(int, bpp, dev_priv-edp.bpp); + } + pipe_config-pipe_bpp = bpp; + return true; } -- 1.7.10.4 -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: revert eDP bpp clamping code changes
On Thu, Apr 04, 2013 at 03:44:01PM -0300, Paulo Zanoni wrote: Hi 2013/4/4 Daniel Vetter daniel.vet...@ffwll.ch The behaviour around handling the eDP bpp value from vbt has been slightly changed in commit 3600836585e3fdef0a1410d63fe5ce4015007aac Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:59 2013 +0100 drm/i915: convert DP autodither code to new infrastructure The old behaviour was that we used the plane's bpp (usually 24bpp) for computing the dp link bw, but set up the pipe with the bpp value from vbt if available. The takes the vbt bpp override into account even for s/The/This/ ? the dp link bw configuration. On Paulo's hsw machine this resulted in a slower link clock and a black screen - but the mode actually /should/ fit even with the lower clock. Until we've cleared up simply stay bug-for-bug compatible with the old code. While at it, also restore a debug message lost in: commit 4e53c2e010e531b4a014692199e978482d471c7e Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Mar 27 00:44:58 2013 +0100 drm/i915: precompute pipe bpp before touching the hw Cc: Paulo Zanoni przan...@gmail.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_dp.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index c5cfec3..658d071 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -741,9 +741,6 @@ intel_dp_compute_config(struct intel_encoder *encoder, /* Walk through all bpp values. Luckily they're all nicely spaced with 2 * bpc in between. */ bpp = min_t(int, 8*3, pipe_config-pipe_bpp); - if (is_edp(intel_dp) dev_priv-edp.bpp) - bpp = min_t(int, bpp, dev_priv-edp.bpp); - for (; bpp = 6*3; bpp -= 2*3) { mode_rate = intel_dp_link_required(target_clock, bpp); @@ -781,7 +778,6 @@ found: intel_dp-link_bw = bws[clock]; intel_dp-lane_count = lane_count; adjusted_mode-clock = drm_dp_bw_code_to_link_rate(intel_dp-link_bw); - pipe_config-pipe_bpp = bpp; pipe_config-pixel_target_clock = target_clock; DRM_DEBUG_KMS(DP link bw %02x lane count %d clock %d bpp %d\n, @@ -796,6 +792,20 @@ found: intel_dp_set_clock(encoder, pipe_config, intel_dp-link_bw); + /* +* XXX: We have a strange regression where using the vbt edp bpp value +* for the link bw computation results in black screens, the panel only +* works when we do the computation at the usual 24bpp (but still +* requires us to use 18bpp. Until that's fully debugged, stay We're missing a ')' character. Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com Tested-by: Paulo Zanoni paulo.r.zan...@intel.com Fixed up and merged to dinq, thanks for digging into this an reviewing the patch. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/4] drm: Add struct drm_rect and assorted utility functions
On Thu, Apr 04, 2013 at 06:38:15PM +0200, Laurent Pinchart wrote: Hi Ville, Thanks for the patch. On Wednesday 27 March 2013 17:46:22 ville.syrj...@linux.intel.com wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com struct drm_rect represents a simple rectangle. The utility functions are there to help driver writers. v2: Moved the region stuff into its own file, made the smaller funcs static inline, used 64bit maths in the scaled clipping function to avoid overflows (instead it will saturate to INT_MIN or INT_MAX). v3: Renamed drm_region to drm_rect, drm_region_clip to drm_rect_intersect, and drm_region_subsample to drm_rect_downscale. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/Makefile | 3 +- drivers/gpu/drm/drm_rect.c | 96 + include/drm/drm_rect.h | 132 ++ 3 files changed, 230 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_rect.c create mode 100644 include/drm/drm_rect.h [snip] diff --git a/drivers/gpu/drm/drm_rect.c b/drivers/gpu/drm/drm_rect.c new file mode 100644 index 000..1ad4f5e --- /dev/null +++ b/drivers/gpu/drm/drm_rect.c @@ -0,0 +1,96 @@ [snip] +#include linux/errno.h +#include linux/export.h +#include linux/kernel.h +#include drm/drm_rect.h + +/** + * drm_rect_intersect - intersect two rectangles + * @r1: first rectangle + * @r2: second rectangle + * + * Calculate the intersection of rectangles @r1 and @r2. + * @r1 will be overwritten with the intersection. + * + * RETURNS: + * @true if rectangle @r1 is still visible after the operation, + * @false otherwise. Isn't @ used for function parameters only ? Not sure. It's been a while since I wrote these, and I guess I thought that the @ was there just for higlighting purposes. Looks like the documentation for the documentation system isn't that great :) so I guess I'll need to try building the docs and see what happens. + */ +bool drm_rect_intersect(struct drm_rect *r1, const struct drm_rect *r2) +{ + r1-x1 = max(r1-x1, r2-x1); + r1-y1 = max(r1-y1, r2-y1); + r1-x2 = min(r1-x2, r2-x2); + r1-y2 = min(r1-y2, r2-y2); + + return drm_rect_visible(r1); Do callers always need that information, or should they instead call drm_rect_visible() explicitly when they need it ? I suppose someone might call it w/o checking the visibility right away. In my current use case I do use the return value, so it saves one line of code :) But I don't mind changing it if you think that would be better w/o the implicit drm_rect_visible() call. +} +EXPORT_SYMBOL(drm_rect_intersect); + +/** + * drm_rect_clip_scaled - perform a scaled clip operation + * @src: source window rectangle + * @dst: destination window rectangle + * @clip: clip rectangle + * @hscale: horizontal scaling factor + * @vscale: vertical scaling factor + * + * Clip rectangle @dst by rectangle @clip. Clip rectangle @src by the + * same amounts multiplied by @hscale and @vscale. + * + * RETUTRNS: + * @true if rectangle @dst is still visible after being clipped, + * @false otherwise + */ +bool drm_rect_clip_scaled(struct drm_rect *src, struct drm_rect *dst, + const struct drm_rect *clip, + int hscale, int vscale) +{ + int diff; + + diff = clip-x1 - dst-x1; + if (diff 0) { + int64_t tmp = src-x1 + (int64_t) diff * hscale; + src-x1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = clip-y1 - dst-y1; + if (diff 0) { + int64_t tmp = src-y1 + (int64_t) diff * vscale; + src-y1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-x2 - clip-x2; + if (diff 0) { + int64_t tmp = src-x2 - (int64_t) diff * hscale; + src-x2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-y2 - clip-y2; + if (diff 0) { + int64_t tmp = src-y2 - (int64_t) diff * vscale; + src-y2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + + return drm_rect_intersect(dst, clip); +} +EXPORT_SYMBOL(drm_rect_clip_scaled); diff --git a/include/drm/drm_rect.h b/include/drm/drm_rect.h new file mode 100644 index 000..40b09a4 --- /dev/null +++ b/include/drm/drm_rect.h @@ -0,0 +1,132 @@ [snip] +/** + * drm_rect - two dimensional rect + * @x1: horizontal starting coordinate (inclusive) + * @x2: horizontal ending coordinate (exclusive) + * @y1: vertical starting coordinate (inclusive) + * @y2: vertical ending coordinate (exclusive) What's the rationale for making x2 and y2 exclusive ? I think exclusive makes things easier. You don't have to add/subtract 1 when going between x1/x2 and x/w representations. Based on some experience, it's surprisingly easy to
[Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
In order to fully serialize access to the fenced region and the update to the fence register we need to take extreme measures on SNB+, and write the fence from each cpu taking care to serialise memory accesses on each. The usual mb(), or even a mb() on each CPU is not enough to ensure that access to the fenced region is coherent across the change in fence register. Fixes i-g-t/gem_fence_thrash v2: Bring a bigger gun v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk CC: Jon Bloomfield jon.bloomfi...@intel.com Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem.c | 42 +++ 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fa4ea1a..13e42f6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2689,17 +2689,51 @@ static inline int fence_number(struct drm_i915_private *dev_priv, return fence - dev_priv-fence_regs; } +struct write_fence { + struct drm_device *dev; + struct drm_i915_gem_object *obj; + int fence; +}; + +static void i915_gem_write_fence__ipi(void *data) +{ + struct write_fence *args = data; + + wbinvd(); + + i915_gem_write_fence(args-dev, args-fence, args-obj); +} + static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, struct drm_i915_fence_reg *fence, bool enable) { struct drm_i915_private *dev_priv = obj-base.dev-dev_private; - int reg = fence_number(dev_priv, fence); - - i915_gem_write_fence(obj-base.dev, reg, enable ? obj : NULL); + struct write_fence args = { + .dev = obj-base.dev, + .fence = fence_number(dev_priv, fence), + .obj = enable ? obj : NULL, + }; + + /* In order to fully serialize access to the fenced region and +* the update to the fence register we need to take extreme +* measures on SNB+, and write the fence from each cpu taking +* care to serialise memory accesses on each. The usual mb(), +* or even a mb() on each CPU is not enough to ensure that access +* to the fenced region is coherent across the change in fence +* register. +* +* As it turns out for IVB, I need slightly heavier bullets and +* need to do a wbinvd() per-processor to serialise memory acceses +* with the fence update. +*/ + if (HAS_LLC(obj-base.dev)) + on_each_cpu(i915_gem_write_fence__ipi, args, 1); + else + i915_gem_write_fence__ipi(args); if (enable) { - obj-fence_reg = reg; + obj-fence_reg = args.fence; fence-obj = obj; list_move_tail(fence-lru_list, dev_priv-mm.fence_list); } else { -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
In order to fully serialize access to the fenced region and the update to the fence register we need to take extreme measures on SNB+, and write the fence from each cpu taking care to serialise memory accesses on each. The usual mb(), or even a mb() on each CPU is not enough to ensure that access to the fenced region is coherent across the change in fence register. Fixes i-g-t/gem_fence_thrash v2: Bring a bigger gun v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) v4: Remove changes for working generations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk CC: Jon Bloomfield jon.bloomfi...@intel.com Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem.c | 42 +++ 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fa4ea1a..02b3a61 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2689,17 +2689,51 @@ static inline int fence_number(struct drm_i915_private *dev_priv, return fence - dev_priv-fence_regs; } +struct write_fence { + struct drm_device *dev; + struct drm_i915_gem_object *obj; + int fence; +}; + +static void i915_gem_write_fence__ipi(void *data) +{ + struct write_fence *args = data; + + wbinvd(); + + i915_gem_write_fence(args-dev, args-fence, args-obj); +} + static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, struct drm_i915_fence_reg *fence, bool enable) { struct drm_i915_private *dev_priv = obj-base.dev-dev_private; - int reg = fence_number(dev_priv, fence); - - i915_gem_write_fence(obj-base.dev, reg, enable ? obj : NULL); + struct write_fence args = { + .dev = obj-base.dev, + .fence = fence_number(dev_priv, fence), + .obj = enable ? obj : NULL, + }; + + /* In order to fully serialize access to the fenced region and +* the update to the fence register we need to take extreme +* measures on SNB+, and write the fence from each cpu taking +* care to serialise memory accesses on each. The usual mb(), +* or even a mb() on each CPU is not enough to ensure that access +* to the fenced region is coherent across the change in fence +* register. +* +* As it turns out for IVB, I need slightly heavier bullets and +* need to do a wbinvd() per-processor to serialise memory acceses +* with the fence update. +*/ + if (HAS_LLC(obj-base.dev)) + on_each_cpu(i915_gem_write_fence__ipi, args, 1); + else + i915_gem_write_fence(args.dev, args.fence, args.obj); if (enable) { - obj-fence_reg = reg; + obj-fence_reg = args.fence; fence-obj = obj; list_move_tail(fence-lru_list, dev_priv-mm.fence_list); } else { -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/3] drm/i915: fix lost FP_CB_TUNE setting for pch plls
commit de13a2e3f88a4da8e85063b6de37096795079e41 Author: Paulo Zanoni paulo.r.zan...@intel.com Date: Thu Sep 20 18:36:05 2012 -0300 drm/i915: extract compute_dpll from ironlake_crtc_mode_set missed the subtle adjustment of the FP1 register. Fix this up by passing a pointer around instead of the value. Cc: Paulo Zanoni paulo.r.zan...@intel.com Cc: Rodrigo Vivi rodrigo.v...@gmail.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_display.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 1b2c744..f623f97 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5374,7 +5374,7 @@ static void ironlake_fdi_set_m_n(struct drm_crtc *crtc) } static uint32_t ironlake_compute_dpll(struct intel_crtc *intel_crtc, - intel_clock_t *clock, u32 fp) + intel_clock_t *clock, u32 *fp) { struct drm_crtc *crtc = intel_crtc-base; struct drm_device *dev = crtc-dev; @@ -5414,7 +5414,7 @@ static uint32_t ironlake_compute_dpll(struct intel_crtc *intel_crtc, factor = 20; if (clock-m factor * clock-n) - fp |= FP_CB_TUNE; + *fp |= FP_CB_TUNE; dpll = 0; @@ -5531,7 +5531,7 @@ static int ironlake_crtc_mode_set(struct drm_crtc *crtc, fp2 = reduced_clock.n 16 | reduced_clock.m1 8 | reduced_clock.m2; - dpll = ironlake_compute_dpll(intel_crtc, clock, fp); + dpll = ironlake_compute_dpll(intel_crtc, clock, fp); DRM_DEBUG_KMS(Mode for pipe %d:\n, pipe); drm_mode_debug_printmodeline(mode); -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] drm/i915: fix FP CB tuning limits for lvds
Only on IBX should we set the limiting factor to 25 unconditionally for dual-channel mode, on CPT/PPT 25 only applies when the lvds refclock is 100MHz. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_display.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index f623f97..cb3e23f 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5408,7 +5408,7 @@ static uint32_t ironlake_compute_dpll(struct intel_crtc *intel_crtc, if (is_lvds) { if ((intel_panel_use_ssc(dev_priv) dev_priv-lvds_ssc_freq == 100) || - intel_is_dual_link_lvds(dev)) + (HAS_PCH_IBX(dev) intel_is_dual_link_lvds(dev))) factor = 25; } else if (is_sdvo is_tv) factor = 20; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/3] drm/i915: set CB tuning also for the reduce clock
Since the ratio is different, we also need to pass in the parameters for the reduced clock. Might or might not reduce flicker for the auto-downclocking on lvds/eDP. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/intel_display.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index cb3e23f..c7c135c 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5374,7 +5374,8 @@ static void ironlake_fdi_set_m_n(struct drm_crtc *crtc) } static uint32_t ironlake_compute_dpll(struct intel_crtc *intel_crtc, - intel_clock_t *clock, u32 *fp) + intel_clock_t *clock, u32 *fp, + intel_clock_t *reduced_clock, u32 *fp2) { struct drm_crtc *crtc = intel_crtc-base; struct drm_device *dev = crtc-dev; @@ -5416,6 +5417,9 @@ static uint32_t ironlake_compute_dpll(struct intel_crtc *intel_crtc, if (clock-m factor * clock-n) *fp |= FP_CB_TUNE; + if (fp2 (reduced_clock-m factor * reduced_clock-n)) + *fp2 |= FP_CB_TUNE; + dpll = 0; if (is_lvds) @@ -5531,7 +5535,8 @@ static int ironlake_crtc_mode_set(struct drm_crtc *crtc, fp2 = reduced_clock.n 16 | reduced_clock.m1 8 | reduced_clock.m2; - dpll = ironlake_compute_dpll(intel_crtc, clock, fp); + dpll = ironlake_compute_dpll(intel_crtc, clock, fp, reduced_clock, +has_reduced_clock ? fp2 : NULL); DRM_DEBUG_KMS(Mode for pipe %d:\n, pipe); drm_mode_debug_printmodeline(mode); -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Fix SDVO connector and encoder get_hw_state functions
From: Egbert Eich e...@suse.de The connector associated with the encoder is considered active when the output associtated with this connector is active on the encoder. The encoder itself is considered active when either there is an active output on it or the respective SDVO channel is active. Having active outputs when the SDVO channel is inactive seems to be inconsistent: such states can be found when intel_modeset_setup_hw_state() collects the hardware state set by the BIOS. This inconsistency will be fixed in intel_sanitize_crtc() (when intel_crtc_update_dpms() is called), this however only happens when the encoder is associated with a crtc. This patch also reverts: commit bd6946e87a98fea11907b2a47368e13044458a35 Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Tue Apr 2 21:30:34 2013 +0200 drm/i915: Fix sdvo connector get_hw_state function Signed-off-by: Egbert Eich e...@suse.de Suggested-By: Daniel Vetter daniel.vet...@ffwll.ch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63031 --- drivers/gpu/drm/i915/intel_sdvo.c |9 +++-- 1 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_sdvo.c b/drivers/gpu/drm/i915/intel_sdvo.c index 298dc85..f6a9f4a 100644 --- a/drivers/gpu/drm/i915/intel_sdvo.c +++ b/drivers/gpu/drm/i915/intel_sdvo.c @@ -1231,12 +1231,8 @@ static bool intel_sdvo_connector_get_hw_state(struct intel_connector *connector) struct intel_sdvo_connector *intel_sdvo_connector = to_intel_sdvo_connector(connector-base); struct intel_sdvo *intel_sdvo = intel_attached_sdvo(connector-base); - struct drm_i915_private *dev_priv = intel_sdvo-base.base.dev-dev_private; u16 active_outputs; - if (!(I915_READ(intel_sdvo-sdvo_reg) SDVO_ENABLE)) - return false; - intel_sdvo_get_active_outputs(intel_sdvo, active_outputs); if (active_outputs intel_sdvo_connector-output_flag) @@ -1251,11 +1247,13 @@ static bool intel_sdvo_get_hw_state(struct intel_encoder *encoder, struct drm_device *dev = encoder-base.dev; struct drm_i915_private *dev_priv = dev-dev_private; struct intel_sdvo *intel_sdvo = to_intel_sdvo(encoder-base); + u16 active_outputs; u32 tmp; tmp = I915_READ(intel_sdvo-sdvo_reg); + intel_sdvo_get_active_outputs(intel_sdvo, active_outputs); - if (!(tmp SDVO_ENABLE)) + if (!(tmp SDVO_ENABLE) (active_outputs == 0)) return false; if (HAS_PCH_CPT(dev)) @@ -2746,7 +2744,6 @@ bool intel_sdvo_init(struct drm_device *dev, uint32_t sdvo_reg, bool is_sdvob) struct intel_sdvo *intel_sdvo; u32 hotplug_mask; int i; - intel_sdvo = kzalloc(sizeof(struct intel_sdvo), GFP_KERNEL); if (!intel_sdvo) return false; -- 1.7.7 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
In order to fully serialize access to the fenced region and the update to the fence register we need to take extreme measures on SNB+, and write the fence from each cpu taking care to serialise memory accesses on each. The usual mb(), or even a mb() on each CPU is not enough to ensure that access to the fenced region is coherent across the change in fence register - however a full blown write-back invalidate (wbinvd) per processor is sufficient. Fixes i-g-t/gem_fence_thrash v2: Bring a bigger gun v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) v4: Remove changes for working generations. v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Jon Bloomfield jon.bloomfi...@intel.com Tested-by: Jon Bloomfield jon.bloomfi...@intel.com (v2) Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem.c | 26 +- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fa4ea1a..632a050 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2689,17 +2689,33 @@ static inline int fence_number(struct drm_i915_private *dev_priv, return fence - dev_priv-fence_regs; } +static void i915_gem_write_fence__ipi(void *data) +{ + wbinvd(); +} + static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, struct drm_i915_fence_reg *fence, bool enable) { - struct drm_i915_private *dev_priv = obj-base.dev-dev_private; - int reg = fence_number(dev_priv, fence); - - i915_gem_write_fence(obj-base.dev, reg, enable ? obj : NULL); + struct drm_device *dev = obj-base.dev; + struct drm_i915_private *dev_priv = dev-dev_private; + int fence_reg = fence_number(dev_priv, fence); + + /* In order to fully serialize access to the fenced region and +* the update to the fence register we need to take extreme +* measures on SNB+, and write the fence from each cpu taking +* care to serialise memory accesses on each. The usual mb(), +* or even a mb() on each CPU is not enough to ensure that access +* to the fenced region is coherent across the change in fence +* register, but a wbinvd() per processor is sufficient. +*/ + if (HAS_LLC(obj-base.dev)) + on_each_cpu(i915_gem_write_fence__ipi, NULL, 1); + i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL); if (enable) { - obj-fence_reg = reg; + obj-fence_reg = fence_reg; fence-obj = obj; list_move_tail(fence-lru_list, dev_priv-mm.fence_list); } else { -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/7] drm/i915: Mark context switch likely
Only the very first switch doesn't take the path. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_context.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 94d873a..aa080ea 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -390,7 +390,7 @@ static int do_switch(struct i915_hw_context *to) * is a bit suboptimal because the retiring can occur simply after the * MI_SET_CONTEXT instead of when the next seqno has completed. */ - if (from_obj != NULL) { + if (likely(from_obj)) { from_obj-base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; i915_gem_object_move_to_active(from_obj, ring); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/7] drm/i915: Move context special case to get()
This allows us to make upcoming refcounting code a bit simpler, and cleaner. In addition, I think it makes the interface a bit nicer if the caller doesn't need to figure out default contexts and such. The interface works very similarly to the gem object ref counting, and I believe it makes sense to do so as we'll use it in a very similar way to objects (we currently use objects as a somewhat hackish way to manage context lifecycles). Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_context.c | 34 +++-- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index aa080ea..6211637 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -95,8 +95,6 @@ */ #define CONTEXT_ALIGN (6410) -static struct i915_hw_context * -i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id); static int do_switch(struct i915_hw_context *to); static int get_context_size(struct drm_device *dev) @@ -291,9 +289,18 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file) } static struct i915_hw_context * -i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) +i915_gem_context_get(struct intel_ring_buffer *ring, +struct drm_i915_file_private *file_priv, u32 id) { - return (struct i915_hw_context *)idr_find(file_priv-context_idr, id); + struct i915_hw_context *ctx; + + if (id == DEFAULT_CONTEXT_ID) + ctx = ring-default_context; + else + ctx = (struct i915_hw_context *) + idr_find(file_priv-context_idr, id); + + return ctx; } static inline int @@ -440,16 +447,12 @@ int i915_switch_context(struct intel_ring_buffer *ring, if (ring != dev_priv-ring[RCS]) return 0; - if (to_id == DEFAULT_CONTEXT_ID) { - to = ring-default_context; - } else { - if (file == NULL) - return -EINVAL; + if (file == NULL) + return -EINVAL; - to = i915_gem_context_get(file-driver_priv, to_id); - if (to == NULL) - return -ENOENT; - } + to = i915_gem_context_get(ring, file-driver_priv, to_id); + if (to == NULL) + return -ENOENT; return do_switch(to); } @@ -495,11 +498,14 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, if (!(dev-driver-driver_features DRIVER_GEM)) return -ENODEV; + if (args-ctx_id == DEFAULT_CONTEXT_ID) + return -ENOENT; + ret = i915_mutex_lock_interruptible(dev); if (ret) return ret; - ctx = i915_gem_context_get(file_priv, args-ctx_id); + ctx = i915_gem_context_get(NULL, file_priv, args-ctx_id); if (!ctx) { mutex_unlock(dev-struct_mutex); return -ENOENT; -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/7] drm/i915: Better context messages
Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_context.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8e218ad..8b2e73a 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -213,7 +213,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) if (ret) goto err_unpin; - DRM_DEBUG_DRIVER(Default HW context loaded\n); + DRM_DEBUG_DRIVER(Default HW context loaded (%p)\n, ctx); return 0; err_unpin: @@ -275,6 +275,7 @@ static int context_idr_cleanup(int id, void *p, void *data) BUG_ON(id == DEFAULT_CONTEXT_ID); + DRM_DEBUG_DRIVER(Context %d closed before destroy.\n, ctx-id); do_destroy(ctx); return 0; @@ -453,8 +454,11 @@ int i915_switch_context(struct intel_ring_buffer *ring, return -EINVAL; to = i915_gem_context_get(ring, file-driver_priv, to_id); - if (to == NULL) + if (unlikely(!to)) { + BUG_ON(to_id == DEFAULT_CONTEXT_ID); + DRM_DEBUG_DRIVER(Couldn't find context %d\n, to_id); return -ENOENT; + } return do_switch(to); } @@ -517,6 +521,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, mutex_unlock(dev-struct_mutex); - DRM_DEBUG_DRIVER(HW context %d destroyed\n, args-ctx_id); + DRM_DEBUG_DRIVER(User destroyed context %d\n, args-ctx_id); return 0; } -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/7] drm/i915: A bit better messaging for contexts
Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_context.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8e218ad..5ac93f4 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -97,6 +97,11 @@ static int do_switch(struct i915_hw_context *to); +static inline bool is_default_context(struct i915_hw_context *ctx) +{ + return (ctx == ctx-ring-default_context); +} + static int get_context_size(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -124,8 +129,10 @@ static int get_context_size(struct drm_device *dev) static void do_destroy(struct i915_hw_context *ctx) { - if (ctx-file_priv) + if (ctx-file_priv) { + WARN_ON(!is_default_context(ctx)); idr_remove(ctx-file_priv-context_idr, ctx-id); + } drm_gem_object_unreference(ctx-obj-base); kfree(ctx); @@ -177,11 +184,6 @@ err_out: return ERR_PTR(ret); } -static inline bool is_default_context(struct i915_hw_context *ctx) -{ - return (ctx == ctx-ring-default_context); -} - /** * The default context needs to exist per ring that uses contexts. It stores the * context state of the GPU for applications that don't utilize HW contexts, as @@ -213,7 +215,7 @@ static int create_default_context(struct drm_i915_private *dev_priv) if (ret) goto err_unpin; - DRM_DEBUG_DRIVER(Default HW context loaded\n); + DRM_DEBUG_DRIVER(Default HW context loaded (%p)\n, ctx); return 0; err_unpin: @@ -275,6 +277,7 @@ static int context_idr_cleanup(int id, void *p, void *data) BUG_ON(id == DEFAULT_CONTEXT_ID); + DRM_DEBUG_DRIVER(Context %d closed before destroy.\n, ctx-id); do_destroy(ctx); return 0; @@ -453,8 +456,11 @@ int i915_switch_context(struct intel_ring_buffer *ring, return -EINVAL; to = i915_gem_context_get(ring, file-driver_priv, to_id); - if (to == NULL) + if (unlikely(!to)) { + BUG_ON(to_id == DEFAULT_CONTEXT_ID); + DRM_DEBUG_DRIVER(Couldn't find context %d\n, to_id); return -ENOENT; + } return do_switch(to); } @@ -517,6 +523,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, mutex_unlock(dev-struct_mutex); - DRM_DEBUG_DRIVER(HW context %d destroyed\n, args-ctx_id); + DRM_DEBUG_DRIVER(User destroyed context %d\n, args-ctx_id); return 0; } -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/7] drm/i915: Make object aware that it backs a context
It's really simple to keep track of objects which back the context. Doing so allows will be really helpful in properly refcounting contexts Daniel: please see the last patch in the series before commenting on this patch. I'm open to other ideas, but this seems like the simplest way to do it, and storing the context info in the object has never been offensive to me. Cc: Daniel Vetter daniel.vet...@ffwll.ch Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_debugfs.c | 11 ++- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_context.c | 2 ++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 7df8351..375c36a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -90,6 +90,14 @@ static const char *get_tiling_flag(struct drm_i915_gem_object *obj) } } +static const char *get_context_flag(struct drm_i915_gem_object *obj) +{ + if (obj-ctx) + return c; + else + return ; +} + static const char *cache_level_str(int type) { switch (type) { @@ -103,10 +111,11 @@ static const char *cache_level_str(int type) static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { - seq_printf(m, %p: %s%s %8zdKiB %02x %02x %d %d %d%s%s%s, + seq_printf(m, %p: %s%s%s %8zdKiB %02x %02x %d %d %d%s%s%s, obj-base, get_pin_flag(obj), get_tiling_flag(obj), + get_context_flag(obj), obj-base.size / 1024, obj-base.read_domains, obj-base.write_domain, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1657d873..da071a0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1214,6 +1214,8 @@ struct drm_i915_gem_object { /** for phy allocated objects */ struct drm_i915_gem_phys_object *phys_obj; + + struct i915_hw_context *ctx; }; #define to_gem_object(obj) (((struct drm_i915_gem_object *)(obj))-base) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 6211637..8e218ad 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -150,6 +150,8 @@ create_hw_context(struct drm_device *dev, return ERR_PTR(-ENOMEM); } + ctx-obj-ctx = ctx; + /* The ring associated with the context object is handled by the normal * object tracking code. We give an initial ring value simple to pass an * assertion in the context switch code. -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 7/7] drm/i915: Print all contexts in debugfs
Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_debugfs.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 375c36a..eef2575 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1471,11 +1471,23 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data) return 0; } +static int context_show(int id, void *p, void *data) +{ + struct i915_hw_context *ctx = p; + struct seq_file *m = data; + + seq_printf(m, context = %d\n, id); + describe_obj(m, ctx-obj); + seq_printf(m, \n); + return 0; +} + static int i915_context_status(struct seq_file *m, void *unused) { struct drm_info_node *node = (struct drm_info_node *) m-private; struct drm_device *dev = node-minor-dev; drm_i915_private_t *dev_priv = dev-dev_private; + struct drm_file *file; struct intel_ring_buffer *ring; int ret, i; @@ -1503,6 +1515,13 @@ static int i915_context_status(struct seq_file *m, void *unused) } } + list_for_each_entry(file, dev-filelist, lhead) { + struct drm_i915_file_private *file_priv = file-driver_priv; + seq_printf(m, File = %p , file); + idr_for_each(file_priv-context_idr, context_show, m); + seq_printf(m, \n); + } + mutex_unlock(dev-mode_config.mutex); return 0; -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/7] drm/i915: Track context status
Instead of using just an is_initialized flag, it will be helpful to have a bit more information about the context's actual status primarily for the case when the file is closed before the context has actually be destroyed. Could also be useful for debugging. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_drv.h | 4 +++- drivers/gpu/drm/i915/i915_gem_context.c | 4 ++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index da071a0..25cdade 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -439,7 +439,9 @@ struct i915_hw_ppgtt { #define DEFAULT_CONTEXT_ID 0 struct i915_hw_context { int id; - bool is_initialized; + enum { + I915_CTX_INITIALIZED=1, + } status; struct drm_i915_file_private *file_priv; struct intel_ring_buffer *ring; struct drm_i915_gem_object *obj; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8b2e73a..41be2a5 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -383,7 +383,7 @@ static int do_switch(struct i915_hw_context *to) if (!to-obj-has_global_gtt_mapping) i915_gem_gtt_bind_object(to-obj, to-obj-cache_level); - if (!to-is_initialized || is_default_context(to)) + if (!to-status || is_default_context(to)) hw_flags |= MI_RESTORE_INHIBIT; else if (WARN_ON_ONCE(from_obj == to-obj)) /* not yet expected */ hw_flags |= MI_FORCE_RESTORE; @@ -419,7 +419,7 @@ static int do_switch(struct i915_hw_context *to) drm_gem_object_reference(to-obj-base); ring-last_context_obj = to-obj; - to-is_initialized = true; + to-status = I915_CTX_INITIALIZED; return 0; } -- 1.8.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/7] drm/i915: Store last context instead of the obj
Storing the last context requires refcounting. Mika recently submitted some refcounting patches which leverages our request mechanism. This is insufficient for my needs because we want to know the last context even if the request has ended, ie. doing the kref_put when a request is finished isn't okay (unless we switch back to the default context, and wait for the switch) Context lifecycle works identically to the way the context object lifecycle works. As of now, I'm not making use of the context_status field. It seems like it will be useful at least for debugging, but we can drop it if desired. Cc: Mika Kuoppala mika.kuopp...@linux.intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_drv.h | 4 ++ drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_gem_context.c | 83 - drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +- 4 files changed, 67 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 25cdade..3025b65 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -438,9 +438,12 @@ struct i915_hw_ppgtt { /* This must match up with the value previously used for execbuf2.rsvd1. */ #define DEFAULT_CONTEXT_ID 0 struct i915_hw_context { + struct kref ref; int id; enum { I915_CTX_INITIALIZED=1, + I915_CTX_FILE_CLOSED, + I915_CTX_DESTROY_IOCTL, } status; struct drm_i915_file_private *file_priv; struct intel_ring_buffer *ring; @@ -1666,6 +1669,7 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev, struct drm_gem_object *gem_obj, int flags); /* i915_gem_context.c */ +void ctx_release(struct kref *kref); void i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8a2cbee..4097173 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1924,6 +1924,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj) obj-fenced_gpu_access = false; obj-active = 0; + if (obj-ctx) + kref_put(obj-ctx-ref, ctx_release); drm_gem_object_unreference(obj-base); WARN_ON(i915_verify_lists(dev)); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 41be2a5..f57c91a 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -124,13 +124,25 @@ static int get_context_size(struct drm_device *dev) static void do_destroy(struct i915_hw_context *ctx) { - if (ctx-file_priv) + /* Need to remove the idr if close/destroy was called while the context +* was active +*/ + if (ctx-status == I915_CTX_DESTROY_IOCTL) idr_remove(ctx-file_priv-context_idr, ctx-id); drm_gem_object_unreference(ctx-obj-base); + if (WARN_ON(kref_get_unless_zero(ctx-ref))) + kref_put(ctx-ref, ctx_release); kfree(ctx); } +void ctx_release(struct kref *kref) +{ + struct i915_hw_context *ctx = container_of(kref, + struct i915_hw_context, ref); + do_destroy(ctx); +} + static struct i915_hw_context * create_hw_context(struct drm_device *dev, struct drm_i915_file_private *file_priv) @@ -159,8 +171,10 @@ create_hw_context(struct drm_device *dev, ctx-ring = dev_priv-ring[RCS]; /* Default context will never have a file_priv */ - if (file_priv == NULL) + if (file_priv == NULL) { + kref_init(ctx-ref); return ctx; + } ctx-file_priv = file_priv; @@ -169,6 +183,7 @@ create_hw_context(struct drm_device *dev, if (ret 0) goto err_out; ctx-id = ret; + kref_init(ctx-ref); return ctx; @@ -209,6 +224,8 @@ static int create_default_context(struct drm_i915_private *dev_priv) if (ret) goto err_destroy; + kref_get(ctx-ref); + ret = do_switch(ctx); if (ret) goto err_unpin; @@ -266,7 +283,8 @@ void i915_gem_context_fini(struct drm_device *dev) i915_gem_object_unpin(dev_priv-ring[RCS].default_context-obj); - do_destroy(dev_priv-ring[RCS].default_context); + while (!kref_put(dev_priv-ring[RCS].default_context-ref, +ctx_release)); } static int context_idr_cleanup(int id, void *p, void *data) @@ -275,8 +293,11 @@ static int context_idr_cleanup(int id, void *p, void *data) BUG_ON(id == DEFAULT_CONTEXT_ID); - DRM_DEBUG_DRIVER(Context %d closed before destroy.\n,
Re: [Intel-gfx] [PATCH 0/7] [RFC] Context reference counting
On Thu, Apr 04, 2013 at 04:41:46PM -0700, Ben Widawsky wrote: These patches implement full context reference counting. In the patch that actually adds the reference counting, I explain why I think Mika's reference counting isn't sufficient for me. Please see/respond to that patch if you disagree. Almost all of the actual work occurs in: drm/i915: Store last context instead of the obj This work is preliminary work for what I'm actually doing at the moment which is proper PPGTT support. The patches split off quite nicely here, so I'm submitting it for review separately. I believe these patches provide a superset of the functionality needed by Mika for ARB_Robustness. The primary reason I want to track context destruction is for PPGTT support we'd like to teardown ppgtt state when a context goes away, but can only do so when we're absolutely certain those PTEs are no longer needed. In my design, I've made a 1:1 relationship with context-ppgtt, and so refcounting the context makes sense there. The crux of the solution here is to store the context pointer in the object that backs it. I could probably use that alone to solve my problem, but I've gone a bit further and also stored the last context in the ring, instead of the last context object. I can't show code yet, but I believe there are a couple of other niceties to having the last context, and not an object. I sent the wrong version of this text. What I meant is, the crux of the solution is reference counting + storing the context pointer in the object. I believe I am not /required/ to store the last_context as I've done. There is at least one sticking point in this patch series, which is the aforementioned storing of the context in the object backs the context. That patch has been rejected before. I'm open to other ways to handle this. At the very least, I require being able to teardown PPGTT when a context dies. For reference, here is the last time the patch was shot down: http://lists.freedesktop.org/archives/intel-gfx/2012-June/thread.html#18000 Ben Widawsky (7): drm/i915: Mark context switch likely drm/i915: Move context special case to get() drm/i915: Make object aware that it backs a context drm/i915: Better context messages drm/i915: Track context status drm/i915: Store last context instead of the obj drm/i915: Print all contexts in debugfs drivers/gpu/drm/i915/i915_debugfs.c | 30 +++- drivers/gpu/drm/i915/i915_drv.h | 10 ++- drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_gem_context.c | 127 ++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +- 5 files changed, 129 insertions(+), 42 deletions(-) -- 1.8.2 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/4] drm: Add struct drm_rect and assorted utility functions
Hi Ville, On Thursday 04 April 2013 22:52:37 Ville Syrjälä wrote: On Thu, Apr 04, 2013 at 06:38:15PM +0200, Laurent Pinchart wrote: On Wednesday 27 March 2013 17:46:22 ville.syrj...@linux.intel.com wrote: From: Ville Syrjälä ville.syrj...@linux.intel.com struct drm_rect represents a simple rectangle. The utility functions are there to help driver writers. v2: Moved the region stuff into its own file, made the smaller funcs static inline, used 64bit maths in the scaled clipping function to avoid overflows (instead it will saturate to INT_MIN or INT_MAX). v3: Renamed drm_region to drm_rect, drm_region_clip to drm_rect_intersect, and drm_region_subsample to drm_rect_downscale. Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com --- drivers/gpu/drm/Makefile | 3 +- drivers/gpu/drm/drm_rect.c | 96 + include/drm/drm_rect.h | 132 ++ 3 files changed, 230 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/drm_rect.c create mode 100644 include/drm/drm_rect.h [snip] diff --git a/drivers/gpu/drm/drm_rect.c b/drivers/gpu/drm/drm_rect.c new file mode 100644 index 000..1ad4f5e --- /dev/null +++ b/drivers/gpu/drm/drm_rect.c @@ -0,0 +1,96 @@ [snip] +#include linux/errno.h +#include linux/export.h +#include linux/kernel.h +#include drm/drm_rect.h + +/** + * drm_rect_intersect - intersect two rectangles + * @r1: first rectangle + * @r2: second rectangle + * + * Calculate the intersection of rectangles @r1 and @r2. + * @r1 will be overwritten with the intersection. + * + * RETURNS: + * @true if rectangle @r1 is still visible after the operation, + * @false otherwise. Isn't @ used for function parameters only ? Not sure. It's been a while since I wrote these, and I guess I thought that the @ was there just for higlighting purposes. Looks like the documentation for the documentation system isn't that great :) so I guess I'll need to try building the docs and see what happens. + */ +bool drm_rect_intersect(struct drm_rect *r1, const struct drm_rect *r2) +{ + r1-x1 = max(r1-x1, r2-x1); + r1-y1 = max(r1-y1, r2-y1); + r1-x2 = min(r1-x2, r2-x2); + r1-y2 = min(r1-y2, r2-y2); + + return drm_rect_visible(r1); Do callers always need that information, or should they instead call drm_rect_visible() explicitly when they need it ? I suppose someone might call it w/o checking the visibility right away. In my current use case I do use the return value, so it saves one line of code :) But I don't mind changing it if you think that would be better w/o the implicit drm_rect_visible() call. I'm fine with whichever you think will be better. I just wanted to raise this point to make sure it has been thought about. +} +EXPORT_SYMBOL(drm_rect_intersect); + +/** + * drm_rect_clip_scaled - perform a scaled clip operation + * @src: source window rectangle + * @dst: destination window rectangle + * @clip: clip rectangle + * @hscale: horizontal scaling factor + * @vscale: vertical scaling factor + * + * Clip rectangle @dst by rectangle @clip. Clip rectangle @src by the + * same amounts multiplied by @hscale and @vscale. + * + * RETUTRNS: + * @true if rectangle @dst is still visible after being clipped, + * @false otherwise + */ +bool drm_rect_clip_scaled(struct drm_rect *src, struct drm_rect *dst, + const struct drm_rect *clip, + int hscale, int vscale) +{ + int diff; + + diff = clip-x1 - dst-x1; + if (diff 0) { + int64_t tmp = src-x1 + (int64_t) diff * hscale; + src-x1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = clip-y1 - dst-y1; + if (diff 0) { + int64_t tmp = src-y1 + (int64_t) diff * vscale; + src-y1 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-x2 - clip-x2; + if (diff 0) { + int64_t tmp = src-x2 - (int64_t) diff * hscale; + src-x2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + diff = dst-y2 - clip-y2; + if (diff 0) { + int64_t tmp = src-y2 - (int64_t) diff * vscale; + src-y2 = clamp_t(int64_t, tmp, INT_MIN, INT_MAX); + } + + return drm_rect_intersect(dst, clip); +} +EXPORT_SYMBOL(drm_rect_clip_scaled); diff --git a/include/drm/drm_rect.h b/include/drm/drm_rect.h new file mode 100644 index 000..40b09a4 --- /dev/null +++ b/include/drm/drm_rect.h @@ -0,0 +1,132 @@ [snip] +/** + * drm_rect - two dimensional rect + * @x1: horizontal starting coordinate (inclusive) + * @x2: horizontal ending coordinate (exclusive) + * @y1: vertical starting coordinate (inclusive) + * @y2: vertical ending coordinate
Re: [Intel-gfx] [PATCH 0/7] [RFC] Context reference counting
On Thu, Apr 04, 2013 at 04:41:58PM -0700, Ben Widawsky wrote: On Thu, Apr 04, 2013 at 04:41:46PM -0700, Ben Widawsky wrote: These patches implement full context reference counting. In the patch that actually adds the reference counting, I explain why I think Mika's reference counting isn't sufficient for me. Please see/respond to that patch if you disagree. Almost all of the actual work occurs in: drm/i915: Store last context instead of the obj This work is preliminary work for what I'm actually doing at the moment which is proper PPGTT support. The patches split off quite nicely here, so I'm submitting it for review separately. I believe these patches provide a superset of the functionality needed by Mika for ARB_Robustness. The primary reason I want to track context destruction is for PPGTT support we'd like to teardown ppgtt state when a context goes away, but can only do so when we're absolutely certain those PTEs are no longer needed. In my design, I've made a 1:1 relationship with context-ppgtt, and so refcounting the context makes sense there. The crux of the solution here is to store the context pointer in the object that backs it. I could probably use that alone to solve my problem, but I've gone a bit further and also stored the last context in the ring, instead of the last context object. I can't show code yet, but I believe there are a couple of other niceties to having the last context, and not an object. I sent the wrong version of this text. What I meant is, the crux of the solution is reference counting + storing the context pointer in the object. I believe I am not /required/ to store the last_context as I've done. I've managed to convince myself storing the last context vs. the last context object really isn't important. Just the refcounting matters. Since we have refcounting though, I feel storing last_context makes a bit more sense, but I don't really care about that aspect. For PPGTT (at least for current gens) we have to store the current address space globally as opposed to per ring. There is at least one sticking point in this patch series, which is the aforementioned storing of the context in the object backs the context. That patch has been rejected before. I'm open to other ways to handle this. At the very least, I require being able to teardown PPGTT when a context dies. For reference, here is the last time the patch was shot down: http://lists.freedesktop.org/archives/intel-gfx/2012-June/thread.html#18000 Ben Widawsky (7): drm/i915: Mark context switch likely drm/i915: Move context special case to get() drm/i915: Make object aware that it backs a context drm/i915: Better context messages drm/i915: Track context status drm/i915: Store last context instead of the obj drm/i915: Print all contexts in debugfs drivers/gpu/drm/i915/i915_debugfs.c | 30 +++- drivers/gpu/drm/i915/i915_drv.h | 10 ++- drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_gem_context.c | 127 ++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +- 5 files changed, 129 insertions(+), 42 deletions(-) -- 1.8.2 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx