Re: [Intel-gfx] [PATCH 04/12] drm/i915: Don't unconditionally try to deref aliasing ppgtt
On Mon, Nov 25, 2013 at 09:08:50PM +0200, Ville Syrjälä wrote: On Mon, Nov 25, 2013 at 10:10:28AM -0800, Ben Widawsky wrote: On Mon, Nov 25, 2013 at 06:06:18PM +, Chris Wilson wrote: On Mon, Nov 25, 2013 at 09:54:35AM -0800, Ben Widawsky wrote: Since the beginning, the functions which try to properly reference the aliasing PPGTT have deferences a potentially null aliasing_ppgtt member. Since the accessors are meant to be global, this will not do. Introduced originally in: commit a70a3148b0c61cb7c588ea650db785b261b378a3 Author: Ben Widawsky b...@bwidawsk.net Date: Wed Jul 31 16:59:56 2013 -0700 drm/i915: Make proper functions for VMs Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 40d9dcf..bc5c865 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4971,7 +4971,8 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o, struct drm_i915_private *dev_priv = o-base.dev-dev_private; struct i915_vma *vma; - if (vm == dev_priv-mm.aliasing_ppgtt-base) + if (!dev_priv-mm.aliasing_ppgtt || + vm == dev_priv-mm.aliasing_ppgtt-base) Where's the dereference? gcc is smarter than your average bear. -Chris I had assumed: dev_priv-mm.aliasing_ppgtt-base in cases when aliasing_ppgtt was NULL. Given that I never actually hit this, I agree GCC must be doing something. Sounds like another discussion on the implementation of offsetof(). Is such behavior documented somewhere? (forgive the lazy) IIRC the C standard does say that for *foo it's as if both and * weren't there (and IIRC the same for foo[x]), but doesn't really say that kind of thing for foo-bar. I guess it's a gray area, and just happens work that way on certain compilers. In this particular case I think there's one slight issue. If aliasing_pggtt == NULL, and someone passes in vm == NULL by accident, then it'll take the branch and use ggtt because aliasing_ppgtt-base will be NULL too (due to base being at offset 0 in the struct). Now, I don't know if a NULL vm could survive this far into the code, but it it can, then it might make debugging a bit more fun. If you pass in a NULL vm then you get to keep both pieces. And wrt portability atm you can pick any compiler for the linux kernel as long as it's gcc compatible. So imo nothing to really worry about. Aside: We really should start to ditch all these (vm, obj) pair lookups and move to using vmas everywhere ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 18/19] drm/i915: save some time when waiting the eDP timings
On Mon, Nov 25, 2013 at 06:38:53PM -0800, Ben Widawsky wrote: On Mon, Nov 25, 2013 at 11:25:10PM +, Chris Wilson wrote: On Mon, Nov 25, 2013 at 02:17:55PM -0800, Ben Widawsky wrote: On Thu, Nov 21, 2013 at 04:00:17PM +, Chris Wilson wrote: On Thu, Nov 21, 2013 at 01:47:32PM -0200, Paulo Zanoni wrote: From: Paulo Zanoni paulo.r.zan...@intel.com The eDP spec defines some points where after you do action A, you have to wait some time before action B. The thing is that in our driver action B does not happen exactly after action A, but we still use msleep() calls directly. What this patch happens is that we record the timestamp of when action A happened, then, just before action B, we look at how much time has passed and only sleep the remaining amount needed. With this change, I am able to save about 5-20ms (out of the total 200ms) of the backlight_off delay and completely skip the 1ms backlight_on delay. The 600ms vdd_off delay doesn't happen during normal usage anymore due to a previous patch. Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/intel_dp.c | 38 +++--- drivers/gpu/drm/i915/intel_drv.h | 3 +++ 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index b438e76..3a1ca80 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -1051,12 +1051,41 @@ static void ironlake_wait_panel_off(struct intel_dp *intel_dp) ironlake_wait_panel_status(intel_dp, IDLE_OFF_MASK, IDLE_OFF_VALUE); } +static void ironlake_wait_jiffies_delay(unsigned long timestamp, + int to_wait_ms) This is not hw specific, so just intel_wait_until_after(timestamp_jiffies, to_wait_ms) Can't we do this with our existing wait_for, and get all the other junk we've crammed in there? wait_for(false, timestamp + to_wait_ms) Or do I have this all wrong? It would be wait_for(false, jiffies_to_ms(max(ms_to_jiffies(timestamp+to_wait_ms) - jiffies, 0)) or something pretty similar. Definitely macro abuse as you would be hoping that the compiler turned while (!time_after(jiffies, timeout)) msleep(1); into msleep(to_ms(timeout-jiffies)); So perhaps clearer would be: intel_wait_until_after(unsigned long timestamp_jiffies, int to_wait_ms) { timestamp_jiffies += msec_to_jiffes(to_wait_ms) + 1; if (time_after(timestamp_jiffies, jiffies) { timestamp_jiffies -= jiffies; while (timestamp_jiffies) timestamp_jiffies = schedule_timeout(timestamp_jiffies); } } Oops, I meant timestamp + to_wait_ms - jiffies_to_msecs(jiffies) I agree it does get a bit messy, but I think there is likely a cleaner way that what you propose if we store things as jiffies. I hate dealing with jiffies, and I feel like even your function has a race with jiffies though. Sure, I was contemplating using an extra variable to only access jiffies once, but I was lazy. Not mixing units here would make it less likely to misuse. However, I think the point stands that we can write a simple function that is clearer than abusing wait_for(). We will wait and see what Paulo creates. :) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 10/12] drm/i915: Allow ggtt lookups to not WARN
On Mon, Nov 25, 2013 at 09:54:41AM -0800, Ben Widawsky wrote: To be able to effectively use the GGTT object lookup function, we don't want to warn when there is no GGTT mapping. Let the caller deal with it instead. Originally, I had intended to have this behavior, and has not introduced the WARN. It was introduced during review with the addition of the follow commit commit 5c2abbeab798154166d42fce4f71790caa6dd9bc Author: Ben Widawsky benjamin.widaw...@intel.com Date: Tue Sep 24 09:57:57 2013 -0700 drm/i915: Provide a cheap ggtt vma lookup Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index bc5c865..6388706 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5069,7 +5069,7 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj) return NULL; vma = list_first_entry(obj-vma_list, typeof(*vma), vma_link); - if (WARN_ON(vma-vm != obj_to_ggtt(obj))) + if (vma-vm != obj_to_ggtt(obj)) Makes sense, but then please add a must_check annotation to this function. That will point you at the user in the context code which imo deserves a BUG_ON or similar. I guess we should also convert over the little ggtt helpers to use this function. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 11/12] drm/i915: Takedown drm_mm on failed gtt setup
On Mon, Nov 25, 2013 at 09:54:42AM -0800, Ben Widawsky wrote: This was found by code inspection. If the GTT setup fails then we are left without properly tearing down the drm_mm. Hopefully this never happens. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 6388706..d85bc3e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4504,6 +4504,7 @@ int i915_gem_init(struct drm_device *dev) mutex_unlock(dev-struct_mutex); if (ret) { i915_gem_cleanup_aliasing_ppgtt(dev); + drm_mm_takedown(dev_priv-gtt.base.mm); Sprinkling stuff all over the place feels bad, I guess our gem init code is ripe for an overhaul. I've pulled this in anyway. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] VPG [BDW] drm/i915/bdw: DP aux channel control timeout change
On Tue, Nov 26, 2013 at 01:41:57PM +0530, kirankumar wrote: changing DP AUX control timeout to 600 instead of 400 for BDW. DP spec says 400 us, so a notch more context in the commit message would be appreciated. E.g. if this is used to work around dp aux failures, then the commit message _must_ mention this. I need this to asses in which branch to put the patch. Also we still have occasional dp aux woes on older platforms. Do we need a similar timeout extension for those, too? Remember people: The commit message must explain _why_ something needs to change. If you feel like you need to explain the what of your change, your code probably needs to be rewritten to be esier to understand. Cheers, Daniel Change-Id: Id507ef35490e12bbd896492c9da24d0db367bf1a Signed-off-by: kirankumar kiran.s.ku...@intel.com --- drivers/gpu/drm/i915/intel_dp.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index ae2a057..953a3f6 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -351,7 +351,7 @@ intel_dp_aux_ch(struct intel_dp *intel_dp, I915_WRITE(ch_ctl, DP_AUX_CH_CTL_SEND_BUSY | (has_aux_irq ? DP_AUX_CH_CTL_INTERRUPT : 0) | -DP_AUX_CH_CTL_TIME_OUT_400us | +DP_AUX_CH_CTL_TIME_OUT_600us | (send_bytes DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT) | (precharge DP_AUX_CH_CTL_PRECHARGE_2US_SHIFT) | (aux_clock_divider DP_AUX_CH_CTL_BIT_CLOCK_2X_SHIFT) | -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Drop forcewake w/a for missed interrupts/seqno on Sandybridge
On Fri, Nov 22, 2013 at 12:46:40PM -0800, Jesse Barnes wrote: On Fri, 22 Nov 2013 20:35:24 + Chris Wilson ch...@chris-wilson.co.uk wrote: I believe, and an evening of i-g-t, that our original workaround for the missed interrupts on Sandybridge, that of holding forcewake whilst we wait for an interrupts, is no longer required. This leaves us dependent on the second workaround of forcing an UC read of the ACTHD before reading back the seqno from the snooped HWS. Dropping the forcewake should allow us to conserve a little power, not much as the GPU is meant to be busy whilst we wait for it! Cc: Jesse Barnes jbar...@virtuousgeek.org Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_ringbuffer.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 80893af4062b..6b121ba9d3f7 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1030,11 +1030,6 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring) if (!dev-irq_enabled) return false; - /* It looks like we need to prevent the gt from suspending while waiting -* for an notifiy irq, otherwise irqs seem to get lost on at least the -* blt/bsd rings on ivb. */ - gen6_gt_force_wake_get(dev_priv); - spin_lock_irqsave(dev_priv-irq_lock, flags); if (ring-irq_refcount++ == 0) { if (HAS_L3_DPF(dev) ring-id == RCS) @@ -1066,8 +1061,6 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring) ilk_disable_gt_irq(dev_priv, ring-irq_enable_mask); } spin_unlock_irqrestore(dev_priv-irq_lock, flags); - - gen6_gt_force_wake_put(dev_priv); } static bool Yay! This is just what I was testing on BYT today, and it works there too (and gives me tons more RC6 residency). Risky imo, since I see missed interrupts sporadically on all my gen6+ machines, and we don't have any solid testcase in igt that readily hits them. Expect a swift strike with the revert hammer if this causes trouble ;-) Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org Queued for -next, thanks for the patch. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/3] drm/i915/vlv: Valleyview support for forcewake Individual power wells.
Hi Chris, We are using single write fifo for the multiple power wells. Since Valleyview as only supports only one write fifo. Thanks Deepak -Original Message- From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] Sent: Monday, November 25, 2013 8:18 PM To: S, Deepak Cc: intel-gfx@lists.freedesktop.org Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/vlv: Valleyview support for forcewake Individual power wells. On Sat, Nov 23, 2013 at 02:55:43PM +0530, deepa...@intel.com wrote: From: Deepak S deepa...@intel.com Split vlv force wake routines to help individually control Media/Render well based on the register access. Do you mind clarifying if there if a single write fifo for the multiple power wells? Just something that worried me reading through the code. Otherwise, lgtm. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Enabling DebugFS for valleyview forcewake counts
Yes Jesse, this is a warning fix. -Original Message- From: Jesse Barnes [mailto:jbar...@virtuousgeek.org] Sent: Monday, November 25, 2013 10:09 PM To: S, Deepak Cc: intel-gfx@lists.freedesktop.org Subject: Re: [Intel-gfx] [PATCH 3/3] drm/i915: Enabling DebugFS for valleyview forcewake counts On Sat, 23 Nov 2013 14:55:44 +0530 deepa...@intel.com wrote: @@ -2349,7 +2357,7 @@ static int pipe_crc_set_source(struct drm_device *dev, enum pipe pipe, { struct drm_i915_private *dev_priv = dev-dev_private; struct intel_pipe_crc *pipe_crc = dev_priv-pipe_crc[pipe]; - u32 val; + u32 val = 0; int ret; if (pipe_crc-source == source) Spurious warning fix? Otherwise looks fine. Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer
As the execbuffer dispatch grows ever more complex and involves multiple stages of moving objects into the aperture, we need to take greater care that we do not evict our execbuffer objects prior to dispatch. This is relatively simple as we can just keep the objects pinned for not just the relocation but until we are finished. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky benjamin.widaw...@intel.com Cc: Daniel Vetter dan...@ffwll.ch Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 -- 1 file changed, 32 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 885d595e0e02..b7e787fb4649 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -33,6 +33,9 @@ #include intel_drv.h #include linux/dma_remapping.h +#define __EXEC_OBJECT_HAS_PIN (131) +#define __EXEC_OBJECT_HAS_FENCE (130) + struct eb_vmas { struct list_head vmas; int and; @@ -187,7 +190,28 @@ static struct i915_vma *eb_get_vma(struct eb_vmas *eb, unsigned long handle) } } -static void eb_destroy(struct eb_vmas *eb) { +static void +i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma) +{ + struct drm_i915_gem_exec_object2 *entry; + struct drm_i915_gem_object *obj = vma-obj; + + if (!drm_mm_node_allocated(vma-node)) + return; + + entry = vma-exec_entry; + + if (entry-flags __EXEC_OBJECT_HAS_FENCE) + i915_gem_object_unpin_fence(obj); + + if (entry-flags __EXEC_OBJECT_HAS_PIN) + i915_gem_object_unpin(obj); + + entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN); +} + +static void eb_destroy(struct eb_vmas *eb) +{ while (!list_empty(eb-vmas)) { struct i915_vma *vma; @@ -195,6 +219,7 @@ static void eb_destroy(struct eb_vmas *eb) { struct i915_vma, exec_list); list_del_init(vma-exec_list); + i915_gem_execbuffer_unreserve_vma(vma); drm_gem_object_unreference(vma-obj-base); } kfree(eb); @@ -478,9 +503,6 @@ i915_gem_execbuffer_relocate(struct eb_vmas *eb, return ret; } -#define __EXEC_OBJECT_HAS_PIN (131) -#define __EXEC_OBJECT_HAS_FENCE (130) - static int need_reloc_mappable(struct i915_vma *vma) { @@ -552,26 +574,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, return 0; } -static void -i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma) -{ - struct drm_i915_gem_exec_object2 *entry; - struct drm_i915_gem_object *obj = vma-obj; - - if (!drm_mm_node_allocated(vma-node)) - return; - - entry = vma-exec_entry; - - if (entry-flags __EXEC_OBJECT_HAS_FENCE) - i915_gem_object_unpin_fence(obj); - - if (entry-flags __EXEC_OBJECT_HAS_PIN) - i915_gem_object_unpin(obj); - - entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN); -} - static int i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, struct list_head *vmas, @@ -670,13 +672,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, goto err; } -err: /* Decrement pin count for bound objects */ - list_for_each_entry(vma, vmas, exec_list) - i915_gem_execbuffer_unreserve_vma(vma); - +err: if (ret != -ENOSPC || retry++) return ret; + /* Decrement pin count for bound objects */ + list_for_each_entry(vma, vmas, exec_list) + i915_gem_execbuffer_unreserve_vma(vma); + ret = i915_gem_evict_vm(vm, true); if (ret) return ret; @@ -708,6 +711,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, while (!list_empty(eb-vmas)) { vma = list_first_entry(eb-vmas, struct i915_vma, exec_list); list_del_init(vma-exec_list); + i915_gem_execbuffer_unreserve_vma(vma); drm_gem_object_unreference(vma-obj-base); } -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer
On Tue, Nov 26, 2013 at 11:23:15AM +, Chris Wilson wrote: As the execbuffer dispatch grows ever more complex and involves multiple stages of moving objects into the aperture, we need to take greater care that we do not evict our execbuffer objects prior to dispatch. This is relatively simple as we can just keep the objects pinned for not just the relocation but until we are finished. One such example is the possibility of the context switch causing an eviction or hitting the shrinker in order to fit its object into the aperture. Link: http://lists.freedesktop.org/archives/intel-gfx/2013-November/036166.html Reported-by: Siluvery, Arun arun.siluv...@intel.com Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky benjamin.widaw...@intel.com Cc: Daniel Vetter dan...@ffwll.ch Cc: sta...@vger.kernel.org -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2
On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote: And move it up in the file for earlier usage. v2: add pre-gen4 support as well (Chris) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4
On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote: Read out the current plane configuration at init time into a new plane_config structure. This allows us to track any existing framebuffers attached to the plane and potentially re-use them in our fbdev code for a smooth handoff. v2: update for new pitch_for_width function (Jesse) comment how get_plane_config works with shared fbs (Jesse) v3: s/ARGB/XRGB (Ville) use pipesrc width/height (Ville) fix fourcc comment (Bob) use drm_format_plane_cpp (Ville) v4: use fb for tracking fb data object (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- @@ -10879,6 +10992,18 @@ void intel_modeset_init(struct drm_device *dev) /* Just in case the BIOS is doing something questionable. */ intel_disable_fbc(dev); + + intel_modeset_setup_hw_state(dev, false); + + list_for_each_entry(crtc, dev-mode_config.crtc_list, + base.head) { + if (!crtc-active) + continue; + + if (dev_priv-display.get_plane_config) + dev_priv-display.get_plane_config(crtc, +crtc-plane_config); The trick is that here if we do not retreive the current config, including the preallocated fb, we *must* disable the output. In this case, it would be a step in intel_sanitize_crtc() to disable the CRTC if it is enabled but we have no preserved fb. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7
On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote: Retrieve current framebuffer config info from the regs and create an fb object for the buffer the BIOS or boot loader left us. This should allow for smooth transitions to userspace apps once we finish the initial configuration construction. v2: check for non-native modes and adjust (Jesse) fixup aperture and cmap frees (Imre) use unlocked unref if init_bios fails (Jesse) fix curly brace around DSPADDR check (Imre) comment failure path for pin_and_fence (Imre) v3: fixup fixup of aperture frees (Chris) v4: update to current bits (locking pin_and_fence hack) (Jesse) v5: move fb config fetch to display code (Jesse) re-order hw state readout on initial load to suit fb inherit (Jesse) re-add pin_and_fence in fbdev code to make sure we refcount properly (Je v6: rename to plane_config (Daniel) check for valid object when initializing BIOS fb (Jesse) split from plane_config readout and other display changes (Jesse) drop use_bios_fb option (Chris) update comments (Jesse) rework fbdev_init_bios for clarity (Jesse) drop fb obj ref under lock (Chris) v7: use fb object from plane_config instead (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding the kzalloc(intel_fbdev) and the clarity of intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb lifetime of plane_config-fb is extremely ugly though (the theft could be made a little more obvious for instance) and still leaked upon failure? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: don't memset the fb buffer if preallocated
On Mon, Nov 25, 2013 at 03:51:19PM -0800, Jesse Barnes wrote: We want to preserve the BIOS/bootloader contents for later. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Decided this was better than my bikeshed, Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: check context reset stats before relocations
Doing it early prevents moving and relocating objects in vain for contexts that won't get any GPU time. Reported-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 38 ++-- 1 file changed, 25 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 885d595..61c716d 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -885,6 +885,24 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, return 0; } +static int +i915_gem_validate_context(struct drm_device *dev, struct drm_file *file, + const u32 ctx_id) +{ + struct i915_ctx_hang_stats *hs; + + hs = i915_gem_context_get_hang_stats(dev, file, ctx_id); + if (IS_ERR(hs)) + return PTR_ERR(hs); + + if (hs-banned) { + DRM_DEBUG(Context %u tried to submit while banned\n, ctx_id); + return -EIO; + } + + return 0; +} + static void i915_gem_execbuffer_move_to_active(struct list_head *vmas, struct intel_ring_buffer *ring) @@ -964,8 +982,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_i915_gem_object *batch_obj; struct drm_clip_rect *cliprects = NULL; struct intel_ring_buffer *ring; - struct i915_ctx_hang_stats *hs; - u32 ctx_id = i915_execbuffer2_get_context_id(*args); + const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 exec_start, exec_len; u32 mask, flags; int ret, mode, i; @@ -1102,6 +1119,12 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto pre_mutex_err; } + ret = i915_gem_validate_context(dev, file, ctx_id); + if (ret) { + mutex_unlock(dev-struct_mutex); + goto pre_mutex_err; + } + eb = eb_create(args, vm); if (eb == NULL) { mutex_unlock(dev-struct_mutex); @@ -1154,17 +1177,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) goto err; - hs = i915_gem_context_get_hang_stats(dev, file, ctx_id); - if (IS_ERR(hs)) { - ret = PTR_ERR(hs); - goto err; - } - - if (hs-banned) { - ret = -EIO; - goto err; - } - ret = i915_switch_context(ring, file, ctx_id); if (ret) goto err; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: check context reset stats before relocations
On Tue, Nov 26, 2013 at 04:14:33PM +0200, Mika Kuoppala wrote: Doing it early prevents moving and relocating objects in vain for contexts that won't get any GPU time. Reported-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 18/19] drm/i915: save some time when waiting the eDP timings
2013/11/26 Chris Wilson ch...@chris-wilson.co.uk: On Mon, Nov 25, 2013 at 06:38:53PM -0800, Ben Widawsky wrote: On Mon, Nov 25, 2013 at 11:25:10PM +, Chris Wilson wrote: On Mon, Nov 25, 2013 at 02:17:55PM -0800, Ben Widawsky wrote: On Thu, Nov 21, 2013 at 04:00:17PM +, Chris Wilson wrote: On Thu, Nov 21, 2013 at 01:47:32PM -0200, Paulo Zanoni wrote: From: Paulo Zanoni paulo.r.zan...@intel.com The eDP spec defines some points where after you do action A, you have to wait some time before action B. The thing is that in our driver action B does not happen exactly after action A, but we still use msleep() calls directly. What this patch happens is that we record the timestamp of when action A happened, then, just before action B, we look at how much time has passed and only sleep the remaining amount needed. With this change, I am able to save about 5-20ms (out of the total 200ms) of the backlight_off delay and completely skip the 1ms backlight_on delay. The 600ms vdd_off delay doesn't happen during normal usage anymore due to a previous patch. Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com --- drivers/gpu/drm/i915/intel_dp.c | 38 +++--- drivers/gpu/drm/i915/intel_drv.h | 3 +++ 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index b438e76..3a1ca80 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -1051,12 +1051,41 @@ static void ironlake_wait_panel_off(struct intel_dp *intel_dp) ironlake_wait_panel_status(intel_dp, IDLE_OFF_MASK, IDLE_OFF_VALUE); } +static void ironlake_wait_jiffies_delay(unsigned long timestamp, + int to_wait_ms) This is not hw specific, so just intel_wait_until_after(timestamp_jiffies, to_wait_ms) Can't we do this with our existing wait_for, and get all the other junk we've crammed in there? wait_for(false, timestamp + to_wait_ms) Or do I have this all wrong? It would be wait_for(false, jiffies_to_ms(max(ms_to_jiffies(timestamp+to_wait_ms) - jiffies, 0)) or something pretty similar. Definitely macro abuse as you would be hoping that the compiler turned while (!time_after(jiffies, timeout)) msleep(1); into msleep(to_ms(timeout-jiffies)); So perhaps clearer would be: intel_wait_until_after(unsigned long timestamp_jiffies, int to_wait_ms) { timestamp_jiffies += msec_to_jiffes(to_wait_ms) + 1; if (time_after(timestamp_jiffies, jiffies) { timestamp_jiffies -= jiffies; while (timestamp_jiffies) timestamp_jiffies = schedule_timeout(timestamp_jiffies); } } Oops, I meant timestamp + to_wait_ms - jiffies_to_msecs(jiffies) I agree it does get a bit messy, but I think there is likely a cleaner way that what you propose if we store things as jiffies. I hate dealing with jiffies, and I feel like even your function has a race with jiffies though. Sure, I was contemplating using an extra variable to only access jiffies once, but I was lazy. Not mixing units here would make it less likely to misuse. However, I think the point stands that we can write a simple function that is clearer than abusing wait_for(). We will wait and see what Paulo creates. :) I really don't think abusing wait_for is a good idea. I think wait_for has a different purpose, and trying to invent clever ways to use it on a non-standard use case will just make the code hard to read, without any benefits. And with the risk that future changes to wait_for will break our code. Also, I think Chris' solution is very similar to my solution, but with an open-coded msleep() instead of just the simple msleep() call. So I'd really vote to apply the KISS design pattern and stick to the original implementation + Chris' first suggestions (rename the function, fix the msleep call), unless we can find another bug on it. But since applying whatever bikesheds we get is always the fastest (only?) way to gets patches merged, if you insist I will just do whatever is requested. Thanks, Paulo -Chris -- Chris Wilson, Intel Open Source Technology Centre -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4
On Tue, 26 Nov 2013 13:57:10 + Chris Wilson ch...@chris-wilson.co.uk wrote: On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote: Read out the current plane configuration at init time into a new plane_config structure. This allows us to track any existing framebuffers attached to the plane and potentially re-use them in our fbdev code for a smooth handoff. v2: update for new pitch_for_width function (Jesse) comment how get_plane_config works with shared fbs (Jesse) v3: s/ARGB/XRGB (Ville) use pipesrc width/height (Ville) fix fourcc comment (Bob) use drm_format_plane_cpp (Ville) v4: use fb for tracking fb data object (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- @@ -10879,6 +10992,18 @@ void intel_modeset_init(struct drm_device *dev) /* Just in case the BIOS is doing something questionable. */ intel_disable_fbc(dev); + + intel_modeset_setup_hw_state(dev, false); + + list_for_each_entry(crtc, dev-mode_config.crtc_list, + base.head) { + if (!crtc-active) + continue; + + if (dev_priv-display.get_plane_config) + dev_priv-display.get_plane_config(crtc, + crtc-plane_config); The trick is that here if we do not retreive the current config, including the preallocated fb, we *must* disable the output. In this case, it would be a step in intel_sanitize_crtc() to disable the CRTC if it is enabled but we have no preserved fb. Yeah, but I thought since that's been broken for awhile, it should come in as a separate bug fix. I can do a patch on top of this if the rest looks ok. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7
On Tue, 26 Nov 2013 14:09:48 + Chris Wilson ch...@chris-wilson.co.uk wrote: On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote: Retrieve current framebuffer config info from the regs and create an fb object for the buffer the BIOS or boot loader left us. This should allow for smooth transitions to userspace apps once we finish the initial configuration construction. v2: check for non-native modes and adjust (Jesse) fixup aperture and cmap frees (Imre) use unlocked unref if init_bios fails (Jesse) fix curly brace around DSPADDR check (Imre) comment failure path for pin_and_fence (Imre) v3: fixup fixup of aperture frees (Chris) v4: update to current bits (locking pin_and_fence hack) (Jesse) v5: move fb config fetch to display code (Jesse) re-order hw state readout on initial load to suit fb inherit (Jesse) re-add pin_and_fence in fbdev code to make sure we refcount properly (Je v6: rename to plane_config (Daniel) check for valid object when initializing BIOS fb (Jesse) split from plane_config readout and other display changes (Jesse) drop use_bios_fb option (Chris) update comments (Jesse) rework fbdev_init_bios for clarity (Jesse) drop fb obj ref under lock (Chris) v7: use fb object from plane_config instead (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding the kzalloc(intel_fbdev) and the clarity of intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb lifetime of plane_config-fb is extremely ugly though (the theft could be made a little more obvious for instance) and still leaked upon failure? I free it on failure in fb_init_bios, but maybe I missed an error path? I agree on the lifetime handling, but it was either this or an ugly copy of the fb object in the bios function. And even then we'd have to find a place to free it that made sense (there is no such place today). I did comment the field in the struct at least... -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 01/22] drm/i915: Add data structures for command parser
From: Brad Volkin bradley.d.vol...@intel.com The command parser needs to know a few things about certain commands in order to process them correctly. Add structures for storing that information. OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 51 + 1 file changed, 51 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 14f250a..ff1e201 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1731,6 +1731,57 @@ struct drm_i915_file_private { atomic_t rps_wait_boost; }; +/** + * A command that requires special handling by the command parser. + */ +struct drm_i915_cmd_descriptor { + /** +* Flags describing how the command parser processes the command. +* +* CMD_DESC_FIXED: The command has a fixed length if this is set, +* a length mask if not set +* CMD_DESC_SKIP: The command is allowed but does not follow the +*standard length encoding for the opcode range in +*which it falls +*/ + u32 flags; +#define CMD_DESC_FIXED (10) +#define CMD_DESC_SKIP (11) + + /** +* The command's unique identification bits and the bitmask to get them. +* This isn't strictly the opcode field as defined in the spec and may +* also include type, subtype, and/or subop fields. +*/ + struct { + u32 value; + u32 mask; + } cmd; + + /** +* The command's length. The command is either fixed length (i.e. does +* not include a length field) or has a length field mask. The flag +* CMD_DESC_FIXED indicates a fixed length. Otherwise, the command has +* a length mask. All command entries in a command table must include +* length information. +*/ + union { + u32 fixed; + u32 mask; + } length; +}; + +/** + * A table of commands requiring special handling by the command parser. + * + * Each ring has an array of tables. Each table consists of an array of command + * descriptors, which must be sorted with command opcodes in ascending order. + */ +struct drm_i915_cmd_table { + const struct drm_i915_cmd_descriptor *table; + int count; +}; + #define INTEL_INFO(dev)(to_i915(dev)-info) #define IS_I830(dev) ((dev)-pdev-device == 0x3577) -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 04/22] drm/i915: Add per-ring command length decode functions
From: Brad Volkin bradley.d.vol...@intel.com For commands that aren't in the parser's tables, we get the length based on standard per-ring command encodings for specific opcode ranges. These functions just return the bitmask and the parser will extract the actual length value. OTC-Tracker: AXIA-4631 Change-Id: I2729d4483931cb4aea9403fd43710c4d4e8e5e89 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 62 + drivers/gpu/drm/i915/intel_ringbuffer.h | 12 +++ 2 files changed, 74 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 014e661..247d530 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -137,6 +137,62 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = { { blt_cmds, ARRAY_SIZE(blt_cmds) }, }; +#define CLIENT_MASK 0xE000 +#define SUBCLIENT_MASK 0x1800 +#define MI_CLIENT0x +#define RC_CLIENT0x6000 +#define BC_CLIENT0x4000 +#define MEDIA_SUBCLIENT 0x1000 + +static u32 gen7_render_get_cmd_length_mask(u32 cmd_header) +{ + u32 client = cmd_header CLIENT_MASK; + u32 subclient = cmd_header SUBCLIENT_MASK; + + if (client == MI_CLIENT) + return 0x3F; + else if (client == RC_CLIENT) { + if (subclient == MEDIA_SUBCLIENT) + return 0x; + else + return 0xFF; + } + + DRM_DEBUG_DRIVER(CMD: Abnormal rcs cmd length! 0x%08X\n, cmd_header); + return 0; +} + +static u32 gen7_bsd_get_cmd_length_mask(u32 cmd_header) +{ + u32 client = cmd_header CLIENT_MASK; + u32 subclient = cmd_header SUBCLIENT_MASK; + + if (client == MI_CLIENT) + return 0x3F; + else if (client == RC_CLIENT) { + if (subclient == MEDIA_SUBCLIENT) + return 0xFFF; + else + return 0xFF; + } + + DRM_DEBUG_DRIVER(CMD: Abnormal bsd cmd length! 0x%08X\n, cmd_header); + return 0; +} + +static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header) +{ + u32 client = cmd_header CLIENT_MASK; + + if (client == MI_CLIENT) + return 0x3F; + else if (client == BC_CLIENT) + return 0xFF; + + DRM_DEBUG_DRIVER(CMD: Abnormal blt cmd length! 0x%08X\n, cmd_header); + return 0; +} + void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) { if (!IS_GEN7(ring-dev)) @@ -152,18 +208,24 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) ring-cmd_tables = gen7_render_cmds; ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds); } + + ring-get_cmd_length_mask = gen7_render_get_cmd_length_mask; break; case VCS: ring-cmd_tables = gen7_video_cmds; ring-cmd_table_count = ARRAY_SIZE(gen7_video_cmds); + ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; break; case BCS: ring-cmd_tables = gen7_blt_cmds; ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); + ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask; break; case VECS: ring-cmd_tables = hsw_vebox_cmds; ring-cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds); + /* VECS can use the same length_mask function as VCS */ + ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; break; default: DRM_DEBUG(CMD: cmd_parser_init with unknown ring: %d\n, diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 67305d3..8e71b59 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -169,6 +169,18 @@ struct intel_ring_buffer { */ const struct drm_i915_cmd_table *cmd_tables; int cmd_table_count; + + /** +* Returns the bitmask for the length field of the specified command. +* Return 0 for an unrecognized/invalid command. +* +* If the command parser finds an entry for a command in the ring's +* cmd_tables, it gets the command's length based on the table entry. +* If not, it calls this function to determine the per-ring length field +* encoding for the command (i.e. certain opcode ranges use certain bits +* to encode the command length in the header). +*/ + u32 (*get_cmd_length_mask)(u32 cmd_header); }; static inline bool -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 15/22] drm/i915: Reject commands that would store to global HWS page
From: Brad Volkin bradley.d.vol...@intel.com PIPE_CONTROL and MI_FLUSH_DW have bits that would write to the hardware status page. There are no users of this today and it seems unsafe. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++ drivers/gpu/drm/i915/i915_reg.h| 1 + 2 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 7b30a03..f32dc69 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -131,7 +131,8 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { }, { .offset = 1, - .mask = PIPE_CONTROL_GLOBAL_GTT_IVB, + .mask = (PIPE_CONTROL_GLOBAL_GTT_IVB | +PIPE_CONTROL_STORE_DATA_INDEX), .expected = 0, .condition_offset = 1, .condition_mask = PIPE_CONTROL_POST_SYNC_OP_MASK @@ -167,8 +168,15 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { .expected = 0, .condition_offset = 0, .condition_mask = MI_FLUSH_DW_OP_MASK + }, + { + .offset = 0, + .mask = MI_FLUSH_DW_STORE_INDEX, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 2 ), + .bits_count = 3 ), CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, .bits = {{ .offset = 0, @@ -192,8 +200,15 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = { .expected = 0, .condition_offset = 0, .condition_mask = MI_FLUSH_DW_OP_MASK + }, + { + .offset = 0, + .mask = MI_FLUSH_DW_STORE_INDEX, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 2 ), + .bits_count = 3 ), CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, .bits = {{ .offset = 0, @@ -217,8 +232,15 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { .expected = 0, .condition_offset = 0, .condition_mask = MI_FLUSH_DW_OP_MASK + }, + { + .offset = 0, + .mask = MI_FLUSH_DW_STORE_INDEX, + .expected = 0, + .condition_offset = 0, + .condition_mask = MI_FLUSH_DW_OP_MASK }}, - .bits_count = 2 ), + .bits_count = 3 ), CMD( COLOR_BLT,S2D, !F, 0x3F, S ), CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), }; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 3f64d41..919d1a6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -323,6 +323,7 @@ #define GFX_OP_PIPE_CONTROL(len) ((0x329)|(0x327)|(0x224)|(len-2)) #define PIPE_CONTROL_GLOBAL_GTT_IVB (124) /* gen7+ */ #define PIPE_CONTROL_MMIO_WRITE (123) +#define PIPE_CONTROL_STORE_DATA_INDEX(121) #define PIPE_CONTROL_CS_STALL(120) #define PIPE_CONTROL_TLB_INVALIDATE (118) #define PIPE_CONTROL_QW_WRITE(114) -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 18/22] drm/i915: Reject MI_ARB_ON_OFF on VECS
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index c8426af..5593740 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -197,6 +197,7 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { }; static const struct drm_i915_cmd_descriptor vecs_cmds[] = { + CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 06/22] drm/i915: Add a HAS_CMD_PARSER getparam
From: Brad Volkin bradley.d.vol...@intel.com So userspace can query the kernel for command parser support. OTC-Tracker: AXIA-4631 Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ include/uapi/drm/i915_drm.h | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 5aeb103..f0a4638 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_EXEC_HANDLE_LUT: value = 1; break; + case I915_PARAM_HAS_CMD_PARSER: + value = 1; + break; default: DRM_DEBUG(Unknown parameter %d\n, param-param); return -EINVAL; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 52aed89..48cc277 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -337,6 +337,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_EXEC_NO_RELOC25 #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 #define I915_PARAM_HAS_WT 27 +#define I915_PARAM_HAS_CMD_PARSER 28 typedef struct drm_i915_getparam { int param; -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 09/22] drm/i915: Add support for rejecting commands via bitmasks
From: Brad Volkin bradley.d.vol...@intel.com A variety of checks we want to do amount to verifying that a given bit or bits are set/clear in a given dword of a command. For now, allow a small but arbitrary number of bitmasks for each command. OTC-Tracker: AXIA-4631 Change-Id: Icc77316c243b6e218774c15e2c090cc470d59317 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 22 ++ drivers/gpu/drm/i915/i915_drv.h| 16 2 files changed, 38 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 2dbca01..99d15f3 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -400,6 +400,28 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, } } + if (desc-flags CMD_DESC_BITMASK) { + int i; + + for (i = 0; i desc-bits_count; i++) { + u32 dword = cmd[desc-bits[i].offset] + desc-bits[i].mask; + + if (dword != desc-bits[i].expected) { + DRM_DEBUG_DRIVER(CMD: Rejected command 0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n, +*cmd, +desc-bits[i].mask, +desc-bits[i].expected, +dword, ring-id); + ret = -EINVAL; + break; + } + } + + if (ret) + break; + } + cmd += length; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 83b6031..f31fc68 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1752,6 +1752,7 @@ struct drm_i915_cmd_descriptor { #define CMD_DESC_SKIP (11) #define CMD_DESC_REJECT (12) #define CMD_DESC_REGISTER (13) +#define CMD_DESC_BITMASK (14) /** * The command's unique identification bits and the bitmask to get them. @@ -1784,6 +1785,21 @@ struct drm_i915_cmd_descriptor { u32 offset; u32 mask; } reg; + +#define MAX_CMD_DESC_BITMASKS 3 + /** +* Describes command checks where a particular dword is masked and +* compared against an expected value. If the command does not match +* the expected value, the parser rejects it. Only valid if flags has +* the CMD_DESC_BITMASK bit set. +*/ + struct { + u32 offset; + u32 mask; + u32 expected; + } bits[MAX_CMD_DESC_BITMASKS]; + /** Number of valid entries in the bits array */ + int bits_count; }; /** -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 14/22] drm/i915: Enable PPGTT command parser checks
From: Brad Volkin bradley.d.vol...@intel.com Various commands that access memory have a bit to determine whether the graphics address specified in the command should use the GGTT or PPGTT for translation. These checks ensure that the bit indicates PPGTT translation. Most of these checks use the existing bit-checking infrastructure. The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function commands. The GGTT/PPGTT bit is only relevant for certain uses of the command. As such, this change also extends the bit-checking code to include a condition mask and offset. If the condition mask is non-zero then the parser only performs the bit check when the bits specified by the condition mask/offset are also non-zero. NOTE: At this point in the series PPGTT must be enabled for the parser to work correctly. If it's not enabled, userspace will not be setting the PPGTT bits the way the parser requires. There's a WARN_ON to detect this case. OTC-Tracker: AXIA-4631 Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 110 ++--- drivers/gpu/drm/i915/i915_drv.h| 6 ++ drivers/gpu/drm/i915/i915_reg.h| 5 ++ 3 files changed, 111 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index b881d39..7b30a03 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -61,15 +61,33 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_REPORT_HEAD, SMI,F, 1, S ), CMD( MI_SUSPEND_FLUSH, SMI,F, 1, S ), CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, R ), - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, .reg = { .offset = 1, .mask = 0x007C } ), CMD( MI_UPDATE_GTT,SMI, !F, 0xFF, R ), - CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W, - .reg = { .offset = 1, .mask = 0x007C } ), - CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W, - .reg = { .offset = 1, .mask = 0x007C } ), + CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W | B, + .reg = { .offset = 1, .mask = 0x007C }, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), + CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W | B, + .reg = { .offset = 1, .mask = 0x007C }, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_BATCH_BUFFER_START,SMI, !F, 0xFF, S ), }; @@ -79,7 +97,20 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( MI_PREDICATE, SMI,F, 1, S ), CMD( MI_TOPOLOGY_FILTER, SMI,F, 1, S ), - CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), + CMD( MI_CLFLUSH, SMI, !F, 0x3FF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), + CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( GFX_OP_3DSTATE_VF_STATISTICS, S3D,F, 1, S ), CMD( PIPELINE_SELECT, S3D,F, 1, S ), CMD( MEDIA_VFE_STATE, S3D, !F, 0x, B, @@ -97,8 +128,15 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { .offset = 1,
[Intel-gfx] [RFC 07/22] drm/i915: Add support for rejecting commands during parsing
From: Brad Volkin bradley.d.vol...@intel.com Certain commands are always disallowed from userspace. This adds the ability for the command parser to detect such commands and reject batch buffers containing them. OTC-Tracker: AXIA-4631 Change-Id: I000b0df4d441ec80b607a50d35e83418cdfd38b3 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 6 ++ drivers/gpu/drm/i915/i915_drv.h| 6 -- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index b01628e..c64f640 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -368,6 +368,12 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, break; } + if (desc-flags CMD_DESC_REJECT) { + DRM_DEBUG_DRIVER(CMD: Rejected command: 0x%08X\n, *cmd); + ret = -EINVAL; + break; + } + cmd += length; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 81ef047..6ace856 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1743,10 +1743,12 @@ struct drm_i915_cmd_descriptor { * CMD_DESC_SKIP: The command is allowed but does not follow the *standard length encoding for the opcode range in *which it falls +* CMD_DESC_REJECT: The command is never allowed */ u32 flags; -#define CMD_DESC_FIXED (10) -#define CMD_DESC_SKIP (11) +#define CMD_DESC_FIXED (10) +#define CMD_DESC_SKIP (11) +#define CMD_DESC_REJECT (12) /** * The command's unique identification bits and the bitmask to get them. -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 10/22] drm/i915: Reject unsafe commands
From: Brad Volkin bradley.d.vol...@intel.com These commands allow userspace to affect global state. OTC-Tracker: AXIA-4631 Change-Id: I80a22c9cd83181790d2a9064e70ea09326691b66 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 99d15f3..8ee4cda 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -47,6 +47,7 @@ #define SMFX STD_MFX_OPCODE_MASK #define F CMD_DESC_FIXED #define S CMD_DESC_SKIP +#define R CMD_DESC_REJECT /*Command Mask Fixed Len Action -- */ @@ -57,10 +58,11 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_ARB_CHECK, SMI,F, 1, S ), CMD( MI_REPORT_HEAD, SMI,F, 1, S ), CMD( MI_SUSPEND_FLUSH, SMI,F, 1, S ), - CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, S ), + CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, R ), CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), - CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, S ), + CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, S ), + CMD( MI_UPDATE_GTT,SMI, !F, 0xFF, R ), CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, S ), CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, S ), CMD( MI_BATCH_BUFFER_START,SMI, !F, 0xFF, S ), @@ -68,8 +70,8 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { static const struct drm_i915_cmd_descriptor render_cmds[] = { CMD( MI_FLUSH, SMI,F, 1, S ), - CMD( MI_ARB_ON_OFF,SMI,F, 1, S ), - CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, S ), + CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), + CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( MI_PREDICATE, SMI,F, 1, S ), CMD( MI_TOPOLOGY_FILTER, SMI,F, 1, S ), CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), @@ -94,12 +96,12 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { }; static const struct drm_i915_cmd_descriptor video_cmds[] = { - CMD( MI_ARB_ON_OFF,SMI,F, 1, S ), + CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), CMD( MFX_WAIT, SMFX, !F, 0x3F, S ), }; static const struct drm_i915_cmd_descriptor blt_cmds[] = { - CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, S ), + CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( COLOR_BLT,S2D, !F, 0x3F, S ), CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), }; @@ -111,6 +113,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { #undef SMFX #undef F #undef S +#undef R static const struct drm_i915_cmd_table gen7_render_cmds[] = { { common_cmds, ARRAY_SIZE(common_cmds) }, -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 22/22] drm/i915: Enable command parsing by default
From: Brad Volkin bradley.d.vol...@intel.com OTC-Tracker: AXIA-4631 Change-Id: I6747457e1fe7494bd42787af51198fcba398ad78 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_drv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 90d7db0..8c0d91b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -154,10 +154,10 @@ module_param_named(prefault_disable, i915_prefault_disable, bool, 0600); MODULE_PARM_DESC(prefault_disable, Disable page prefaulting for pread/pwrite/reloc (default:false). For developers only.); -int i915_enable_cmd_parser __read_mostly = 0; +int i915_enable_cmd_parser __read_mostly = 1; module_param_named(enable_cmd_parser, i915_enable_cmd_parser, int, 0600); MODULE_PARM_DESC(enable_cmd_parser, - Enable command parsing (default: false)); + Enable command parsing (default: true)); static struct drm_driver driver; -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 21/22] drm/i915: Clean up command parser enable decision
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 30 +++--- drivers/gpu/drm/i915/i915_drv.h| 1 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- 3 files changed, 21 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 8481ef0..b3525ce 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -581,6 +581,25 @@ finish: return (u32*)addr; } +int i915_needs_cmd_parser(struct intel_ring_buffer *ring) +{ + drm_i915_private_t *dev_priv = + (drm_i915_private_t *)ring-dev-dev_private; + + /* No command tables indicates a platform without parsing */ + if (!ring-cmd_tables) + return 0; + + /* XXX: VLV is Gen7 and therefore has cmd_tables, but has PPGTT +* disabled. That will cause all of the parser's PPGTT checks to +* fail. For now, disable parsing when PPGTT is off. +*/ + if(!dev_priv-mm.aliasing_ppgtt) + return 0; + + return i915_enable_cmd_parser; +} + #define LENGTH_BIAS 2 int i915_parse_cmds(struct intel_ring_buffer *ring, @@ -590,17 +609,6 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, int ret = 0; u32 *cmd, *batch_base, *batch_end; struct drm_i915_cmd_descriptor default_desc = { 0 }; - drm_i915_private_t *dev_priv = - (drm_i915_private_t *)ring-dev-dev_private; - - /* XXX: this breaks VLV, which is Gen7, but no PPGTT -* Replace with better checks for when to call i915_parse_cmds? -*/ - WARN_ON(!dev_priv-mm.aliasing_ppgtt); - - /* No command tables currently indicates a platform without parsing */ - if (!ring-cmd_tables) - return 0; batch_base = vmap_batch(batch_obj); if (!batch_base) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 161d9cd..e7fc31c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2412,6 +2412,7 @@ const char *i915_cache_level_str(int type); /* i915_cmd_parser.c */ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring); +int i915_needs_cmd_parser(struct intel_ring_buffer *ring); int i915_parse_cmds(struct intel_ring_buffer *ring, struct drm_i915_gem_object *batch_obj, u32 batch_start_offset); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 06975c7..7b1453e 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1143,7 +1143,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, } batch_obj-base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND; - if (i915_enable_cmd_parser !(flags I915_DISPATCH_SECURE)) { + if (i915_needs_cmd_parser(ring) !(flags I915_DISPATCH_SECURE)) { ret = i915_parse_cmds(ring, batch_obj, args-batch_start_offset); -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 11/22] drm/i915: Add register whitelists for mesa
From: Brad Volkin bradley.d.vol...@intel.com These registers are currently used by mesa for blitting and for transform feedback extensions. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 33 + drivers/gpu/drm/i915/i915_reg.h| 9 + 2 files changed, 42 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 8ee4cda..1decff9 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -140,6 +140,34 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = { { blt_cmds, ARRAY_SIZE(blt_cmds) }, }; +/* Register whitelists, sorted by increasing register offset. + * + * Some registers that userspace accesses are 64 bits. The register + * access commands only allow 32-bit accesses. Hence, we have to include + * entries for both halves of the 64-bit registers. + */ + +static const u32 gen7_render_regs[] = { + CL_INVOCATION_COUNT, + CL_INVOCATION_COUNT + sizeof(u32), + GEN7_SO_NUM_PRIMS_WRITTEN(0), + GEN7_SO_NUM_PRIMS_WRITTEN(0) + sizeof(u32), + GEN7_SO_NUM_PRIMS_WRITTEN(1), + GEN7_SO_NUM_PRIMS_WRITTEN(1) + sizeof(u32), + GEN7_SO_NUM_PRIMS_WRITTEN(2), + GEN7_SO_NUM_PRIMS_WRITTEN(2) + sizeof(u32), + GEN7_SO_NUM_PRIMS_WRITTEN(3), + GEN7_SO_NUM_PRIMS_WRITTEN(3) + sizeof(u32), + GEN7_SO_WRITE_OFFSET(0), + GEN7_SO_WRITE_OFFSET(1), + GEN7_SO_WRITE_OFFSET(2), + GEN7_SO_WRITE_OFFSET(3), +}; + +static const u32 gen7_blt_regs[] = { + BCS_SWCTRL, +}; + #define CLIENT_MASK 0xE000 #define SUBCLIENT_MASK 0x1800 #define MI_CLIENT0x @@ -212,6 +240,9 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds); } + ring-reg_table = gen7_render_regs; + ring-reg_count = ARRAY_SIZE(gen7_render_regs); + ring-get_cmd_length_mask = gen7_render_get_cmd_length_mask; break; case VCS: @@ -222,6 +253,8 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) case BCS: ring-cmd_tables = gen7_blt_cmds; ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); + ring-reg_table = gen7_blt_regs; + ring-reg_count = ARRAY_SIZE(gen7_blt_regs); ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask; break; case VECS: diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 5d52569..aa43624 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -371,6 +371,15 @@ #define SRC_COPY_BLT ((0x229)|(0x4322)) /* + * Registers used only by the command parser + */ +#define BCS_SWCTRL 0x22200 + +#define CL_INVOCATION_COUNT 0x2338 +/* There are the 4 64-bit counter registers, one for each stream output */ +#define GEN7_SO_NUM_PRIMS_WRITTEN(n) (0x5200 + (n) * 8) + +/* * Reset registers */ #define DEBUG_RESET_I830 0x6070 -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 16/22] drm/i915: Reject additional commands
From: Brad Volkin bradley.d.vol...@intel.com MI_USER_INTERRUPT: We're rejecting other interrupt-generating mechanisms MI_LOAD_SCAN_LINES_*: The DDX is the only user of these and protects them behind the I915_EXEC_SECURE flag, so we probably shouldn't let others use these. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 26 +++--- drivers/gpu/drm/i915/i915_reg.h| 29 +++-- 2 files changed, 38 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index f32dc69..3ad2a1e 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -55,7 +55,7 @@ -- */ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_NOOP, SMI,F, 1, S ), - CMD( MI_USER_INTERRUPT,SMI,F, 1, S ), + CMD( MI_USER_INTERRUPT,SMI,F, 1, R ), CMD( MI_WAIT_FOR_EVENT,SMI,F, 1, S ), CMD( MI_ARB_CHECK, SMI,F, 1, S ), CMD( MI_REPORT_HEAD, SMI,F, 1, S ), @@ -145,6 +145,8 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { CMD( MI_RS_CONTROL,SMI,F, 1, S ), CMD( MI_URB_ATOMIC_ALLOC, SMI,F, 1, S ), CMD( MI_RS_CONTEXT,SMI,F, 1, S ), + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), + CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), CMD( MI_MATH, SMI, !F, 0x3F, S ), CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W, .reg = { .offset = 1, .mask = 0x007C } ), @@ -245,6 +247,11 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), }; +static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = { + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), + CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), +}; + #undef CMD #undef SMI #undef S3D @@ -282,6 +289,12 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = { { blt_cmds, ARRAY_SIZE(blt_cmds) }, }; +static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = { + { common_cmds, ARRAY_SIZE(common_cmds) }, + { blt_cmds, ARRAY_SIZE(blt_cmds) }, + { hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) }, +}; + /* Register whitelists, sorted by increasing register offset. * * Some registers that userspace accesses are 64 bits. The register @@ -393,10 +406,17 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; break; case BCS: - ring-cmd_tables = gen7_blt_cmds; - ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); + if (IS_HASWELL(ring-dev)) { + ring-cmd_tables = hsw_blt_ring_cmds; + ring-cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds); + } else { + ring-cmd_tables = gen7_blt_cmds; + ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); + } + ring-reg_table = gen7_blt_regs; ring-reg_count = ARRAY_SIZE(gen7_blt_regs); + ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask; break; case VECS: diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 919d1a6..232ad0c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -345,20 +345,21 @@ /* * Commands used only by the command parser */ -#define MI_SET_PREDICATE MI_INSTR(0x01, 0) -#define MI_ARB_CHECK MI_INSTR(0x05, 0) -#define MI_RS_CONTROL MI_INSTR(0x06, 0) -#define MI_URB_ATOMIC_ALLOCMI_INSTR(0x09, 0) -#define MI_PREDICATE MI_INSTR(0x0C, 0) -#define MI_RS_CONTEXT MI_INSTR(0x0F, 0) -#define MI_TOPOLOGY_FILTER MI_INSTR(0x0D, 0) -#define MI_MATHMI_INSTR(0x1A, 0) -#define MI_UPDATE_GTT MI_INSTR(0x23, 0) -#define MI_CLFLUSH MI_INSTR(0x27, 0) -#define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 0) -#define MI_LOAD_REGISTER_REG MI_INSTR(0x2A, 0) -#define MI_LOAD_URB_MEMMI_INSTR(0x2C, 0) -#define MI_STORE_URB_MEM MI_INSTR(0x2D, 0) +#define MI_SET_PREDICATEMI_INSTR(0x01, 0) +#define MI_ARB_CHECKMI_INSTR(0x05, 0) +#define MI_RS_CONTROL MI_INSTR(0x06, 0) +#define MI_URB_ATOMIC_ALLOC MI_INSTR(0x09, 0) +#define
[Intel-gfx] [RFC 12/22] drm/i915: Enable register whitelist checks
From: Brad Volkin bradley.d.vol...@intel.com MI_STORE_REGISTER_MEM and MI_LOAD_REGISTER_* commands allow userspace access to registers. Only certain registers should be allowed for such access, so enable checking for those commands. OTC-Tracker: AXIA-4631 Change-Id: Ie614a2f0eb2e5917de809e5a17957175d24cc44f Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 1decff9..df5424b 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -48,6 +48,7 @@ #define F CMD_DESC_FIXED #define S CMD_DESC_SKIP #define R CMD_DESC_REJECT +#define W CMD_DESC_REGISTER /*Command Mask Fixed Len Action -- */ @@ -61,10 +62,13 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, R ), CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), - CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, S ), + CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007C } ), CMD( MI_UPDATE_GTT,SMI, !F, 0xFF, R ), - CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, S ), - CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, S ), + CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007C } ), + CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007C } ), CMD( MI_BATCH_BUFFER_START,SMI, !F, 0xFF, S ), }; @@ -88,7 +92,8 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { CMD( MI_URB_ATOMIC_ALLOC, SMI,F, 1, S ), CMD( MI_RS_CONTEXT,SMI,F, 1, S ), CMD( MI_MATH, SMI, !F, 0x3F, S ), - CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, S ), + CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W, + .reg = { .offset = 1, .mask = 0x007C } ), CMD( MI_LOAD_URB_MEM, SMI, !F, 0xFF, S ), CMD( MI_STORE_URB_MEM, SMI, !F, 0xFF, S ), CMD( GFX_OP_3DSTATE_DX9_CONSTANTF_VS, S3D, !F, 0x7FF, S ), @@ -114,6 +119,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { #undef F #undef S #undef R +#undef W static const struct drm_i915_cmd_table gen7_render_cmds[] = { { common_cmds, ARRAY_SIZE(common_cmds) }, -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. But we'll certainly want to run relevant benchmarks beyond what I've done. At this point there are a couple of known issues and potential improvements. 1) VLV. The parser is currently disabled for VLV. One type of check performed by the parser is that commands which access memory do so via PPGTT. VLV does not have PPGTT enabled at this time. I chose to implement the PPGTT checks via generic bit checking infrastructure in the parser, so they are not easily disabled for VLV. For now, I'm disabling parsing altogether in the hope that PPGTT can be enabled for VLV in the near future. 2) Coherency. I've found two types of coherency issues when reading the batch buffer from the CPU during execbuffer2. Looking for help with both issues. i. First, the i-g-t test gem_cpu_reloc blits to a batch buffer and the parser isn't properly waiting for the operation to complete before parsing. I tried adding i915_gem_object_sync(batch_obj, [ring|NULL]) but that actually caused more failures. ii. Second, on VLV, I've seen cache coherency issues when userspace writes the batch via pwrite fast path before calling execbuffer2. The parser reads stale data. This works fine on IVB and HSW, so I believe it's an LLC vs. non-LLC issue. I'm just unclear on what the correct flushing or synchronization is for this scenario. 3) 2nd-level batches. The parser currently allows MI_BATCH_BUFFER_START commands in userspace batches without parsing them. The media driver uses 2nd-level batches, so a solution is required. I have some ideas but don't want to delay the review process for what I have so far. It may be that the 2nd-level parsing is only needed for VCS and the current code (plus rejecting BBS) would be sufficient for RCS. 4) Command buffer copy. To avoid CPU modifications to buffers after parsing, and to avoid GPU modifications to buffers via EUs or commands in the batch, we should copy the userspace batch buffer to memory that userspace does not have access to, map it into GGTT, and execute that batch buffer. I have a sense of how to do this for 1st-level batches, but it would need changes to tie in with the 2nd-level batch parsing I think, so I've again held off. Brad Volkin (22): drm/i915: Add data structures for command parser drm/i915: Initial command parser table definitions drm/i915: Hook command parser tables up to rings drm/i915: Add per-ring command length decode functions drm/i915: Implement command parsing drm/i915: Add a HAS_CMD_PARSER getparam drm/i915: Add support for
[Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing
From: Brad Volkin bradley.d.vol...@intel.com At this point the parser just implements the mechanics of finding commands in the batch buffer and looking them up in the parser tables. It rejects poorly formed batches (e.g. no MI_BATCH_BUFFER_END command, containing unrecognized commands, etc) but does no other checks. Optionally enabled by a module parameter, currently defaulting to off. We'll enable by default at the end of the series. The code to look up commands in the per-ring tables implements a linear search. The tables are small so in practice this does not appear to cause a performance issue. However, the tables are sorted by command opcode such that we can move to something like binary search if this code becomes a bottleneck in the future. OTC-Tracker: AXIA-4631 Change-Id: If71f50d28578d27325c3438abf4b229c7cf695cd Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 148 + drivers/gpu/drm/i915/i915_drv.c| 5 + drivers/gpu/drm/i915/i915_drv.h| 4 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++ 4 files changed, 172 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 247d530..b01628e 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -232,3 +232,151 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) ring-id); } } + +static const struct drm_i915_cmd_descriptor* +find_cmd_in_table(const struct drm_i915_cmd_table *table, + u32 cmd_header) +{ + int i; + + for (i = 0; i table-count; i++) { + const struct drm_i915_cmd_descriptor *desc = table-table[i]; + u32 masked_cmd = desc-cmd.mask cmd_header; + u32 masked_value = desc-cmd.value desc-cmd.mask; + + if (masked_cmd == masked_value) + return desc; + } + + return NULL; +} + +/* Returns a pointer to a descriptor for the command specified by cmd_header. + * + * The caller must supply space for a default descriptor via the default_desc + * parameter. If no descriptor for the specified command exists in the ring's + * command parser tables, this function fills in default_desc based on the + * ring's default length encoding and returns default_desc. + */ +static const struct drm_i915_cmd_descriptor* +find_cmd(struct intel_ring_buffer *ring, +u32 cmd_header, +struct drm_i915_cmd_descriptor *default_desc) +{ + u32 mask; + int i; + + for (i = 0; i ring-cmd_table_count; i++) { + const struct drm_i915_cmd_descriptor *desc; + + desc = find_cmd_in_table(ring-cmd_tables[i], cmd_header); + if (desc) + return desc; + } + + mask = ring-get_cmd_length_mask(cmd_header); + if (!mask) + return NULL; + + BUG_ON(!default_desc); + default_desc-flags = CMD_DESC_SKIP; + default_desc-length.mask = mask; + + return default_desc; +} + +static u32 *vmap_batch(struct drm_i915_gem_object *obj) +{ + int i; + void *addr = NULL; + struct sg_page_iter sg_iter; + struct page **pages; + + pages = drm_malloc_ab(obj-base.size PAGE_SHIFT, sizeof(*pages)); + if (pages == NULL) { + DRM_DEBUG(Failed to get space for pages\n); + goto finish; + } + + i = 0; + for_each_sg_page(obj-pages-sgl, sg_iter, obj-pages-nents, 0) { + pages[i] = sg_page_iter_page(sg_iter); + i++; + } + + addr = vmap(pages, i, 0, PAGE_KERNEL); + if (addr == NULL) { + DRM_DEBUG(Failed to vmap pages\n); + goto finish; + } + +finish: + if (pages) + drm_free_large(pages); + return (u32*)addr; +} + +#define LENGTH_BIAS 2 + +int i915_parse_cmds(struct intel_ring_buffer *ring, + struct drm_i915_gem_object *batch_obj, + u32 batch_start_offset) +{ + int ret = 0; + u32 *cmd, *batch_base, *batch_end; + struct drm_i915_cmd_descriptor default_desc = { 0 }; + + /* No command tables currently indicates a platform without parsing */ + if (!ring-cmd_tables) + return 0; + + batch_base = vmap_batch(batch_obj); + if (!batch_base) { + DRM_DEBUG_DRIVER(CMD: Failed to vmap batch\n); + return -ENOMEM; + } + + cmd = batch_base + (batch_start_offset / sizeof(*cmd)); + batch_end = cmd + (batch_obj-base.size / sizeof(*batch_end)); + + while (cmd batch_end) { + const struct drm_i915_cmd_descriptor *desc; + u32 length; + + if (*cmd == MI_BATCH_BUFFER_END) + break; + + desc = find_cmd(ring, *cmd,
[Intel-gfx] [RFC 02/22] drm/i915: Initial command parser table definitions
From: Brad Volkin bradley.d.vol...@intel.com Add command tables defining irregular length commands for each ring. This requires a few new command opcode definitions. OTC-Tracker: AXIA-4631 Change-Id: I064bceb457e15f46928058352afe76d918c58ef5 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/i915_cmd_parser.c | 138 + drivers/gpu/drm/i915/i915_reg.h| 34 3 files changed, 174 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 41838ea..7b7450b 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -47,7 +47,8 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o \ dvo_tfp410.o \ dvo_sil164.o \ dvo_ns2501.o \ - i915_gem_dmabuf.o + i915_gem_dmabuf.o \ + i915_cmd_parser.o i915-$(CONFIG_COMPAT) += i915_ioc32.o diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c new file mode 100644 index 000..deb77c8c --- /dev/null +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -0,0 +1,138 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Brad Volkin bradley.d.vol...@intel.com + * + */ + +#include i915_drv.h + +#define STD_MI_OPCODE_MASK 0xFF80 +#define STD_3D_OPCODE_MASK 0x +#define STD_2D_OPCODE_MASK 0xFFC0 +#define STD_MFX_OPCODE_MASK 0x + +#define CMD(op, opm, f, lm, fl, ...) \ + { \ + .flags = (fl) | (f),\ + .cmd = { (op), (opm) }, \ + .length = { (lm) }, \ + __VA_ARGS__ \ + } + +/* Convenience macros to compress the tables */ +#define SMI STD_MI_OPCODE_MASK +#define S3D STD_3D_OPCODE_MASK +#define S2D STD_2D_OPCODE_MASK +#define SMFX STD_MFX_OPCODE_MASK +#define F CMD_DESC_FIXED +#define S CMD_DESC_SKIP + +/*Command Mask Fixed Len Action + -- */ +static const struct drm_i915_cmd_descriptor common_cmds[] = { + CMD( MI_NOOP, SMI,F, 1, S ), + CMD( MI_USER_INTERRUPT,SMI,F, 1, S ), + CMD( MI_WAIT_FOR_EVENT,SMI,F, 1, S ), + CMD( MI_ARB_CHECK, SMI,F, 1, S ), + CMD( MI_REPORT_HEAD, SMI,F, 1, S ), + CMD( MI_SUSPEND_FLUSH, SMI,F, 1, S ), + CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), + CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, S ), + CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, S ), + CMD( MI_STORE_REGISTER_MEM(1), SMI, !F, 0xFF, S ), + CMD( MI_LOAD_REGISTER_MEM, SMI, !F, 0xFF, S ), + CMD( MI_BATCH_BUFFER_START,SMI, !F, 0xFF, S ), +}; + +static const struct drm_i915_cmd_descriptor render_cmds[] = { + CMD( MI_FLUSH, SMI,F, 1, S ), + CMD( MI_ARB_ON_OFF,SMI,F, 1, S ), + CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, S ), + CMD( MI_PREDICATE, SMI,F, 1, S ), + CMD( MI_TOPOLOGY_FILTER, SMI,F, 1, S ), + CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), + CMD( GFX_OP_3DSTATE_VF_STATISTICS, S3D,F, 1, S ), +
[Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion
From: Brad Volkin bradley.d.vol...@intel.com The length mask is different for each ring and the size can vary, so we should duplicate the definition with the correct encoding for each ring. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 35 +++--- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index adc7d94..8481ef0 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -61,13 +61,6 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = { CMD( MI_REPORT_HEAD, SMI,F, 1, S ), CMD( MI_SUSPEND_FLUSH, SMI,F, 1, S ), CMD( MI_SEMAPHORE_MBOX,SMI, !F, 0xFF, R ), - CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, - .bits = {{ - .offset = 0, - .mask = MI_GLOBAL_GTT, - .expected = 0 - }}, - .bits_count = 1 ), CMD( MI_STORE_DWORD_INDEX, SMI, !F, 0xFF, R ), CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, .reg = { .offset = 1, .mask = 0x007C } ), @@ -97,6 +90,13 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), CMD( MI_PREDICATE, SMI,F, 1, S ), CMD( MI_TOPOLOGY_FILTER, SMI,F, 1, S ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_CLFLUSH, SMI, !F, 0x3FF, B, .bits = {{ .offset = 0, @@ -165,6 +165,13 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { static const struct drm_i915_cmd_descriptor video_cmds[] = { CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, @@ -202,6 +209,13 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { static const struct drm_i915_cmd_descriptor vecs_cmds[] = { CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, @@ -234,6 +248,13 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = { static const struct drm_i915_cmd_descriptor blt_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x1FF, B, + .bits = {{ + .offset = 0, + .mask = MI_GLOBAL_GTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, .bits = {{ .offset = 0, -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 17/22] drm/i915: Add parser data for perf monitoring GL extensions
From: Brad Volkin bradley.d.vol...@intel.com These registers and commands will be used by mesa for the GL_AMD_performance_monitor extension. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 27 +++ drivers/gpu/drm/i915/i915_reg.h| 13 + 2 files changed, 40 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 3ad2a1e..c8426af 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -104,6 +104,13 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { .expected = 0 }}, .bits_count = 1 ), + CMD( MI_REPORT_PERF_COUNT, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 1, + .mask = MI_REPORT_PERF_COUNT_GGTT, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MI_CONDITIONAL_BATCH_BUFFER_END, SMI, !F, 0xFF, B, .bits = {{ .offset = 0, @@ -303,8 +310,28 @@ static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = { */ static const u32 gen7_render_regs[] = { + HS_INVOCATION_COUNT, + HS_INVOCATION_COUNT + sizeof(u32), + DS_INVOCATION_COUNT, + DS_INVOCATION_COUNT + sizeof(u32), + IA_VERTICES_COUNT, + IA_VERTICES_COUNT + sizeof(u32), + IA_PRIMITIVES_COUNT, + IA_PRIMITIVES_COUNT + sizeof(u32), + VS_INVOCATION_COUNT, + VS_INVOCATION_COUNT + sizeof(u32), + GS_INVOCATION_COUNT, + GS_INVOCATION_COUNT + sizeof(u32), + GS_PRIMITIVES_COUNT, + GS_PRIMITIVES_COUNT + sizeof(u32), CL_INVOCATION_COUNT, CL_INVOCATION_COUNT + sizeof(u32), + CL_PRIMITIVES_COUNT, + CL_PRIMITIVES_COUNT + sizeof(u32), + PS_INVOCATION_COUNT, + PS_INVOCATION_COUNT + sizeof(u32), + PS_DEPTH_COUNT, + PS_DEPTH_COUNT + sizeof(u32), GEN7_SO_NUM_PRIMS_WRITTEN(0), GEN7_SO_NUM_PRIMS_WRITTEN(0) + sizeof(u32), GEN7_SO_NUM_PRIMS_WRITTEN(1), diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 232ad0c..4dd5541 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -356,6 +356,8 @@ #define MI_MATH MI_INSTR(0x1A, 0) #define MI_UPDATE_GTT MI_INSTR(0x23, 0) #define MI_CLFLUSH MI_INSTR(0x27, 0) +#define MI_REPORT_PERF_COUNTMI_INSTR(0x28, 0) +#define MI_REPORT_PERF_COUNT_GGTT (10) #define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 0) #define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 0) #define MI_LOAD_URB_MEM MI_INSTR(0x2C, 0) @@ -385,7 +387,18 @@ */ #define BCS_SWCTRL 0x22200 +#define HS_INVOCATION_COUNT 0x2300 +#define DS_INVOCATION_COUNT 0x2308 +#define IA_VERTICES_COUNT 0x2310 +#define IA_PRIMITIVES_COUNT 0x2318 +#define VS_INVOCATION_COUNT 0x2320 +#define GS_INVOCATION_COUNT 0x2328 +#define GS_PRIMITIVES_COUNT 0x2330 #define CL_INVOCATION_COUNT 0x2338 +#define CL_PRIMITIVES_COUNT 0x2340 +#define PS_INVOCATION_COUNT 0x2348 +#define PS_DEPTH_COUNT 0x2350 + /* There are the 4 64-bit counter registers, one for each stream output */ #define GEN7_SO_NUM_PRIMS_WRITTEN(n) (0x5200 + (n) * 8) -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 08/22] drm/i915: Add support for checking register accesses
From: Brad Volkin bradley.d.vol...@intel.com Some OpenGL/media features require userspace to perform register accesses from batch buffers on Gen hardware. To enable this, each ring gets a whitelist of registers that userspace may access from a batch buffer. With this patch, no whitelists are defined, so no access is allowed. OTC-Tracker: AXIA-4631 Change-Id: Ibf607ec3b1df0076f6acbd9aee34a2ee48cfac6b Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 26 ++ drivers/gpu/drm/i915/i915_drv.h | 19 --- drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++ 3 files changed, 48 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index c64f640..2dbca01 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -285,6 +285,20 @@ find_cmd(struct intel_ring_buffer *ring, return default_desc; } +static int valid_reg(const u32 *table, int count, u32 addr) +{ + if (table count != 0) { + int i; + + for (i = 0; i count; i++) { + if (table[i] == addr) + return 1; + } + } + + return 0; +} + static u32 *vmap_batch(struct drm_i915_gem_object *obj) { int i; @@ -374,6 +388,18 @@ int i915_parse_cmds(struct intel_ring_buffer *ring, break; } + if (desc-flags CMD_DESC_REGISTER) { + u32 reg_addr = cmd[desc-reg.offset] desc-reg.mask; + + if (!valid_reg(ring-reg_table, + ring-reg_count, reg_addr)) { + DRM_DEBUG_DRIVER(CMD: Rejected register 0x%08X in command: 0x%08X (ring=%d)\n, +reg_addr, *cmd, ring-id); + ret = -EINVAL; + break; + } + } + cmd += length; } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6ace856..83b6031 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1744,11 +1744,14 @@ struct drm_i915_cmd_descriptor { *standard length encoding for the opcode range in *which it falls * CMD_DESC_REJECT: The command is never allowed +* CMD_DESC_REGISTER: The command should be checked against the +*register whitelist for the appropriate ring */ u32 flags; -#define CMD_DESC_FIXED (10) -#define CMD_DESC_SKIP (11) -#define CMD_DESC_REJECT (12) +#define CMD_DESC_FIXED(10) +#define CMD_DESC_SKIP (11) +#define CMD_DESC_REJECT (12) +#define CMD_DESC_REGISTER (13) /** * The command's unique identification bits and the bitmask to get them. @@ -1771,6 +1774,16 @@ struct drm_i915_cmd_descriptor { u32 fixed; u32 mask; } length; + + /** +* Describes where to find a register address in the command to check +* against the ring's register whitelist. Only valid if flags has the +* CMD_DESC_REGISTER bit set. +*/ + struct { + u32 offset; + u32 mask; + } reg; }; /** diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 8e71b59..b898105a 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -171,6 +171,12 @@ struct intel_ring_buffer { int cmd_table_count; /** +* Table of registers allowed in commands that read/write registers. +*/ + const u32 *reg_table; + int reg_count; + + /** * Returns the bitmask for the length field of the specified command. * Return 0 for an unrecognized/invalid command. * -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 13/22] drm/i915: Enable bit checking for some commands
From: Brad Volkin bradley.d.vol...@intel.com These checks prevent userspace from using certain commands to access registers or generate interrupts. OTC-Tracker: AXIA-4631 Change-Id: Ic6367ae98272495ba874c22abd4824fbced0abca Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 41 ++ drivers/gpu/drm/i915/i915_reg.h| 3 +++ 2 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index df5424b..b881d39 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -49,6 +49,7 @@ #define S CMD_DESC_SKIP #define R CMD_DESC_REJECT #define W CMD_DESC_REGISTER +#define B CMD_DESC_BITMASK /*Command Mask Fixed Len Action -- */ @@ -81,9 +82,23 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = { CMD( MI_CLFLUSH, SMI, !F, 0x3FF, S ), CMD( GFX_OP_3DSTATE_VF_STATISTICS, S3D,F, 1, S ), CMD( PIPELINE_SELECT, S3D,F, 1, S ), + CMD( MEDIA_VFE_STATE, S3D, !F, 0x, B, + .bits = {{ + .offset = 2, + .mask = MEDIA_VFE_STATE_MMIO_ACCESS_MASK, + .expected = 0 + }}, + .bits_count = 1 ), CMD( GPGPU_OBJECT, S3D, !F, 0xFF, S ), CMD( GPGPU_WALKER, S3D, !F, 0xFF, S ), CMD( GFX_OP_3DSTATE_SO_DECL_LIST, S3D, !F, 0x1FF, S ), + CMD( GFX_OP_PIPE_CONTROL(5), S3D, !F, 0xFF, B, + .bits = {{ + .offset = 1, + .mask = (PIPE_CONTROL_MMIO_WRITE | PIPE_CONTROL_NOTIFY), + .expected = 0 + }}, + .bits_count = 1 ), }; static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { @@ -102,11 +117,35 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = { static const struct drm_i915_cmd_descriptor video_cmds[] = { CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), + CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 0, + .mask = MI_FLUSH_DW_NOTIFY, + .expected = 0 + }}, + .bits_count = 1 ), CMD( MFX_WAIT, SMFX, !F, 0x3F, S ), }; +static const struct drm_i915_cmd_descriptor vecs_cmds[] = { + CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 0, + .mask = MI_FLUSH_DW_NOTIFY, + .expected = 0 + }}, + .bits_count = 1 ), +}; + static const struct drm_i915_cmd_descriptor blt_cmds[] = { CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), + CMD( MI_FLUSH_DW, SMI, !F, 0x3F, B, + .bits = {{ + .offset = 0, + .mask = MI_FLUSH_DW_NOTIFY, + .expected = 0 + }}, + .bits_count = 1 ), CMD( COLOR_BLT,S2D, !F, 0x3F, S ), CMD( SRC_COPY_BLT, S2D, !F, 0x3F, S ), }; @@ -120,6 +159,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = { #undef S #undef R #undef W +#undef B static const struct drm_i915_cmd_table gen7_render_cmds[] = { { common_cmds, ARRAY_SIZE(common_cmds) }, @@ -139,6 +179,7 @@ static const struct drm_i915_cmd_table gen7_video_cmds[] = { static const struct drm_i915_cmd_table hsw_vebox_cmds[] = { { common_cmds, ARRAY_SIZE(common_cmds) }, + { vecs_cmds, ARRAY_SIZE(vecs_cmds) }, }; static const struct drm_i915_cmd_table gen7_blt_cmds[] = { diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index aa43624..0e504b9 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -240,6 +240,7 @@ #define MI_FLUSH_DW_STORE_INDEX (121) #define MI_INVALIDATE_TLB(118) #define MI_FLUSH_DW_OP_STOREDW (114) +#define MI_FLUSH_DW_NOTIFY (18) #define MI_INVALIDATE_BSD(17) #define MI_FLUSH_DW_USE_GTT (12) #define MI_FLUSH_DW_USE_PPGTT(02) @@ -318,6 +319,7 @@ #define DISPLAY_PLANE_B (120) #define GFX_OP_PIPE_CONTROL(len)
[Intel-gfx] [RFC 03/22] drm/i915: Hook command parser tables up to rings
From: Brad Volkin bradley.d.vol...@intel.com OTC-Tracker: AXIA-4631 Change-Id: Id178f67338d00c23ca4e8fc9499313dba93d2c5c Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 34 + drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +++ 4 files changed, 46 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index deb77c8c..014e661 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -136,3 +136,37 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = { { common_cmds, ARRAY_SIZE(common_cmds) }, { blt_cmds, ARRAY_SIZE(blt_cmds) }, }; + +void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring) +{ + if (!IS_GEN7(ring-dev)) + return; + + switch (ring-id) { + case RCS: + if (IS_HASWELL(ring-dev)) { + ring-cmd_tables = hsw_render_ring_cmds; + ring-cmd_table_count = + ARRAY_SIZE(hsw_render_ring_cmds); + } else { + ring-cmd_tables = gen7_render_cmds; + ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds); + } + break; + case VCS: + ring-cmd_tables = gen7_video_cmds; + ring-cmd_table_count = ARRAY_SIZE(gen7_video_cmds); + break; + case BCS: + ring-cmd_tables = gen7_blt_cmds; + ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); + break; + case VECS: + ring-cmd_tables = hsw_vebox_cmds; + ring-cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds); + break; + default: + DRM_DEBUG(CMD: cmd_parser_init with unknown ring: %d\n, + ring-id); + } +} diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ff1e201..7de4e59 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2372,6 +2372,9 @@ void i915_destroy_error_state(struct drm_device *dev); void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone); const char *i915_cache_level_str(int type); +/* i915_cmd_parser.c */ +void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring); + /* i915_suspend.c */ extern int i915_save_state(struct drm_device *dev); extern int i915_restore_state(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 69589e4..553b889 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1382,6 +1382,8 @@ static int intel_init_ring_buffer(struct drm_device *dev, if (IS_I830(ring-dev) || IS_845G(ring-dev)) ring-effective_size -= 128; + i915_cmd_parser_init_ring(ring); + return 0; err_unmap: diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 71a73f4..67305d3 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -162,6 +162,13 @@ struct intel_ring_buffer { u32 gtt_offset; volatile u32 *cpu_page; } scratch; + + /** +* Tables of commands the command parser needs to know about +* for this ring. +*/ + const struct drm_i915_cmd_table *cmd_tables; + int cmd_table_count; }; static inline bool -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Add HAS_CMD_PARSER parameter definition
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- include/drm/i915_drm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h index c1914d6..525dc27 100644 --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -306,6 +306,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_ALIASING_PPGTT 18 #define I915_PARAM_HAS_WAIT_TIMEOUT 19 #define I915_PARAM_HAS_VEBOX22 +#define I915_PARAM_HAS_CMD_PARSER 28 typedef struct drm_i915_getparam { int param; -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/4] tests: Add a test for the command parser
From: Brad Volkin bradley.d.vol...@intel.com Start with a simple testcase that should pass. Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- tests/.gitignore | 1 + tests/Makefile.sources | 1 + tests/gem_exec_parse.c | 140 + 3 files changed, 142 insertions(+) create mode 100644 tests/gem_exec_parse.c diff --git a/tests/.gitignore b/tests/.gitignore index e835a5a..ec7f526 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -35,6 +35,7 @@ gem_exec_blt gem_exec_faulting_reloc gem_exec_lut_handle gem_exec_nop +gem_exec_parse gem_fenced_exec_thrash gem_fence_thrash gem_flink diff --git a/tests/Makefile.sources b/tests/Makefile.sources index a02b93d..c90e5aa 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -25,6 +25,7 @@ TESTS_progs_M = \ gem_evict_everything \ gem_exec_bad_domains \ gem_exec_nop \ + gem_exec_parse \ gem_fenced_exec_thrash \ gem_fence_thrash \ gem_flink \ diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c new file mode 100644 index 000..a17929f --- /dev/null +++ b/tests/gem_exec_parse.c @@ -0,0 +1,140 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + */ + +#include stdlib.h +#include stdint.h +#include stdio.h +#include drm.h +#include i915_drm.h +#include drmtest.h + +#ifndef I915_PARAM_HAS_CMD_PARSER +#define I915_PARAM_HAS_CMD_PARSER 28 +#endif + +static int exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds, + int size, int patch_offset, uint64_t expected_value) +{ + struct drm_i915_gem_execbuffer2 execbuf; + struct drm_i915_gem_exec_object2 objs[2]; + struct drm_i915_gem_relocation_entry reloc[1]; + + uint32_t target_bo = gem_create(fd, 4096); + uint64_t actual_value = 0; + + gem_write(fd, cmd_bo, 0, cmds, size); + + reloc[0].offset = patch_offset; + reloc[0].delta = 0; + reloc[0].target_handle = target_bo; + reloc[0].read_domains = I915_GEM_DOMAIN_RENDER; + reloc[0].write_domain = I915_GEM_DOMAIN_RENDER; + reloc[0].presumed_offset = 0; + + objs[0].handle = target_bo; + objs[0].relocation_count = 0; + objs[0].relocs_ptr = 0; + objs[0].alignment = 0; + objs[0].offset = 0; + objs[0].flags = 0; + objs[0].rsvd1 = 0; + objs[0].rsvd2 = 0; + + objs[1].handle = cmd_bo; + objs[1].relocation_count = 1; + objs[1].relocs_ptr = (uintptr_t)reloc; + objs[1].alignment = 0; + objs[1].offset = 0; + objs[1].flags = 0; + objs[1].rsvd1 = 0; + objs[1].rsvd2 = 0; + + execbuf.buffers_ptr = (uintptr_t)objs; + execbuf.buffer_count = 2; + execbuf.batch_start_offset = 0; + execbuf.batch_len = size; + execbuf.cliprects_ptr = 0; + execbuf.num_cliprects = 0; + execbuf.DR1 = 0; + execbuf.DR4 = 0; + execbuf.flags = I915_EXEC_RENDER; + i915_execbuffer2_set_context_id(execbuf, 0); + execbuf.rsvd2 = 0; + + gem_execbuf(fd, execbuf); + gem_sync(fd, cmd_bo); + + gem_read(fd,target_bo, 0, actual_value, sizeof(actual_value)); + igt_assert(expected_value == actual_value); + + gem_close(fd, target_bo); + + return 1; +} + +uint32_t handle; +int fd; + +#define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2) +#define PIPE_CONTROL_QW_WRITE(114) + +igt_main +{ + igt_fixture { + int has_secparser = 0; +drm_i915_getparam_t gp; + int rc; + + fd = drm_open_any(); + + gp.param = I915_PARAM_HAS_CMD_PARSER; + gp.value = has_secparser; + rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, gp); + igt_require(!rc has_secparser); + +
[Intel-gfx] [PATCH 2/4] tests/gem_exec_parse: Add tests for rejected commands
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- tests/gem_exec_parse.c | 81 ++ 1 file changed, 81 insertions(+) diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c index a17929f..b34fe1b 100644 --- a/tests/gem_exec_parse.c +++ b/tests/gem_exec_parse.c @@ -93,9 +93,55 @@ static int exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds, return 1; } +static int exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds, + int size, int ring, int expected_ret) +{ + struct drm_i915_gem_execbuffer2 execbuf; + struct drm_i915_gem_exec_object2 objs[1]; + int ret; + + gem_write(fd, cmd_bo, 0, cmds, size); + + objs[0].handle = cmd_bo; + objs[0].relocation_count = 0; + objs[0].relocs_ptr = 0; + objs[0].alignment = 0; + objs[0].offset = 0; + objs[0].flags = 0; + objs[0].rsvd1 = 0; + objs[0].rsvd2 = 0; + + execbuf.buffers_ptr = (uintptr_t)objs; + execbuf.buffer_count = 1; + execbuf.batch_start_offset = 0; + execbuf.batch_len = size; + execbuf.cliprects_ptr = 0; + execbuf.num_cliprects = 0; + execbuf.DR1 = 0; + execbuf.DR4 = 0; + execbuf.flags = ring; + i915_execbuffer2_set_context_id(execbuf, 0); + execbuf.rsvd2 = 0; + + ret = drmIoctl(fd, + DRM_IOCTL_I915_GEM_EXECBUFFER2, + execbuf); + if (ret == 0) + igt_assert(expected_ret == 0); + else + igt_assert(-errno == expected_ret); + + gem_sync(fd, cmd_bo); + + return 1; +} + uint32_t handle; int fd; +#define MI_ARB_ON_OFF (0x8 23) +#define MI_DISPLAY_FLIP ((0x14 23) | 1) + #define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2) #define PIPE_CONTROL_QW_WRITE(114) @@ -132,6 +178,41 @@ igt_main 0x1200)); } + igt_subtest(basic-rejected) { + uint32_t arb_on_off[] = { + MI_ARB_ON_OFF, + MI_BATCH_BUFFER_END, + }; + uint32_t display_flip[] = { + MI_DISPLAY_FLIP, + 0, 0, 0, + MI_BATCH_BUFFER_END, + 0 + }; + igt_assert( + exec_batch(fd, handle, + arb_on_off, sizeof(arb_on_off), + I915_EXEC_RENDER, + -EINVAL)); + igt_assert( + exec_batch(fd, handle, + arb_on_off, sizeof(arb_on_off), + I915_EXEC_BSD, + -EINVAL)); + if (gem_has_vebox(fd)) { + igt_assert( + exec_batch(fd, handle, + arb_on_off, sizeof(arb_on_off), + I915_EXEC_VEBOX, + -EINVAL)); + } + igt_assert( + exec_batch(fd, handle, + display_flip, sizeof(display_flip), + I915_EXEC_BLT, + -EINVAL)); + } + igt_fixture { gem_close(fd, handle); -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/4] tests/gem_exec_parse: Add tests for bitmask checks
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- tests/gem_exec_parse.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c index d4136db..0eb328c 100644 --- a/tests/gem_exec_parse.c +++ b/tests/gem_exec_parse.c @@ -145,6 +145,7 @@ int fd; #define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2) #define PIPE_CONTROL_QW_WRITE(114) +#define PIPE_CONTROL_LRI_POST_OP (123) igt_main { @@ -239,6 +240,23 @@ igt_main 0)); } + igt_subtest(bitmasks) { + uint32_t pc[] = { + GFX_OP_PIPE_CONTROL, + (PIPE_CONTROL_QW_WRITE | +PIPE_CONTROL_LRI_POST_OP), + 0, // To be patched + 0x1200, + 0, + MI_BATCH_BUFFER_END, + }; + igt_assert( + exec_batch(fd, handle, + pc, sizeof(pc), + I915_EXEC_RENDER, + -EINVAL)); + } + igt_fixture { gem_close(fd, handle); -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 19/22] drm/i915: Fix length handling for MFX_WAIT
From: Brad Volkin bradley.d.vol...@intel.com Signed-off-by: Brad Volkin bradley.d.vol...@intel.com --- drivers/gpu/drm/i915/i915_cmd_parser.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 5593740..adc7d94 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -193,7 +193,11 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = { .expected = 0 }}, .bits_count = 1 ), - CMD( MFX_WAIT, SMFX, !F, 0x3F, S ), + /* MFX_WAIT doesn't fit the way we handle length for most commands. +* It has a length field but it uses a non-standard length bias. +* It is always 1 dword though, so just treat it as fixed length. +*/ + CMD( MFX_WAIT, SMFX, F, 1, S ), }; static const struct drm_i915_cmd_descriptor vecs_cmds[] = { -- 1.8.4.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: take mode config lock around crtc disable at suspend
This is just a theoretical issue, but we need to do this to prevent the WARN in pipe_from_connector at suspend time. https://bugs.freedesktop.org/show_bug.cgi?id=71978 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 13076db..0ec0fb3 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -535,8 +535,10 @@ static int i915_drm_freeze(struct drm_device *dev) * Disable CRTCs directly since we want to preserve sw state * for _thaw. */ + mutex_lock(dev-mode_config.mutex); list_for_each_entry(crtc, dev-mode_config.crtc_list, head) dev_priv-display.crtc_disable(crtc); + mutex_unlock(dev-mode_config.mutex); intel_modeset_suspend_hw(dev); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2
On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote: And move it up in the file for earlier usage. v2: add pre-gen4 support as well (Chris) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_display.c | 53 ++-- 1 file changed, 38 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index e85d838..321d751 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc, pipe_config-port_clock = clock.dot / 5; } +static u32 +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int width, + int bpp, bool tiled) +{ + u32 pitch = DIV_ROUND_UP(width * bpp, 8); + int align; + + if (tiled) { + if (INTEL_INFO(dev_priv-dev)-gen 4) { + /* Pre-965 needs power of two tile width */ + for (align = 512; align pitch; align = 1) + ; + } else { + align = 512; + } Gen2 tiles are 128 bytes wide, not 512 bytes. So maybe something like this: if (IS_GEN2()) { return roundup_power_of_two(max(pitch, 128)); else if (IS_GEN3()) return roundup_power_of_two(max(pitch, 512)); else return ALIGN(pitch, 512); + } else { + align = 64; Also I just noticed in the docs that gen2/3 only seem to require 32byte alignment for linear buffers. But relaxing that would require a change to intel_framebuffer_init() as well, so it looks like material for another patch. Also would need to be tested on real hardware. + } + + return ALIGN(pitch, align); +} + static bool i9xx_get_pipe_config(struct intel_crtc *crtc, struct intel_crtc_config *pipe_config) { @@ -7652,16 +7674,12 @@ err: } static u32 -intel_framebuffer_pitch_for_width(int width, int bpp) -{ - u32 pitch = DIV_ROUND_UP(width * bpp, 8); - return ALIGN(pitch, 64); -} - -static u32 -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp) +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv, + struct drm_display_mode *mode, int bpp) { - u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp); + u32 pitch = intel_framebuffer_pitch_for_width(dev_priv, + mode-hdisplay, bpp, + false); return ALIGN(pitch * mode-vdisplay, PAGE_SIZE); This should also align the fb height to tile height, otherwise intel_framebuffer_init() might reject it. } @@ -7670,18 +7688,21 @@ intel_framebuffer_create_for_mode(struct drm_device *dev, struct drm_display_mode *mode, int depth, int bpp) { + struct drm_i915_private *dev_priv = dev-dev_private; struct drm_i915_gem_object *obj; struct drm_mode_fb_cmd2 mode_cmd = { 0 }; + int size; - obj = i915_gem_alloc_object(dev, - intel_framebuffer_size_for_mode(mode, bpp)); + size = intel_framebuffer_size_for_mode(dev_priv, mode, bpp); + obj = i915_gem_alloc_object(dev, size); if (obj == NULL) return ERR_PTR(-ENOMEM); mode_cmd.width = mode-hdisplay; mode_cmd.height = mode-vdisplay; - mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(mode_cmd.width, - bpp); + mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(dev_priv, + mode_cmd.width, + bpp, false); mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth); return intel_framebuffer_create(dev, mode_cmd, obj); @@ -7704,8 +7725,10 @@ mode_fits_in_fbdev(struct drm_device *dev, return NULL; fb = dev_priv-fbdev-ifb.base; - if (fb-pitches[0] intel_framebuffer_pitch_for_width(mode-hdisplay, - fb-bits_per_pixel)) + if (fb-pitches[0] intel_framebuffer_pitch_for_width(dev_priv, +mode-hdisplay, + fb-bits_per_pixel, +false)) return NULL; if (obj-base.size mode-vdisplay * fb-pitches[0]) -- 1.8.4.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx --
Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing
On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote: +static const struct drm_i915_cmd_descriptor* +find_cmd_in_table(const struct drm_i915_cmd_table *table, + u32 cmd_header) +{ + int i; + + for (i = 0; i table-count; i++) { + const struct drm_i915_cmd_descriptor *desc = table-table[i]; + u32 masked_cmd = desc-cmd.mask cmd_header; + u32 masked_value = desc-cmd.value desc-cmd.mask; + + if (masked_cmd == masked_value) + return desc; Maybe pre-sort the cmd table and use bsearch? And a runtime test on module load to check that the table is sorted correctly. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing
On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote: On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote: +static const struct drm_i915_cmd_descriptor* +find_cmd_in_table(const struct drm_i915_cmd_table *table, + u32 cmd_header) +{ + int i; + + for (i = 0; i table-count; i++) { + const struct drm_i915_cmd_descriptor *desc = table-table[i]; + u32 masked_cmd = desc-cmd.mask cmd_header; + u32 masked_value = desc-cmd.value desc-cmd.mask; + + if (masked_cmd == masked_value) + return desc; Maybe pre-sort the cmd table and use bsearch? And a runtime test on module load to check that the table is sorted correctly. So far it doesn't look like the search is a bottleneck, but I've tried to keep the tables sorted so that we could easily switch to bsearch if needed. Would you prefer to just use bsearch from the start? I agree that the module load check makes sense if we do switch. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2
On Tue, 26 Nov 2013 19:28:27 +0200 Ville Syrjälä ville.syrj...@linux.intel.com wrote: On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote: And move it up in the file for earlier usage. v2: add pre-gen4 support as well (Chris) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_display.c | 53 ++-- 1 file changed, 38 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index e85d838..321d751 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc, pipe_config-port_clock = clock.dot / 5; } +static u32 +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int width, + int bpp, bool tiled) +{ + u32 pitch = DIV_ROUND_UP(width * bpp, 8); + int align; + + if (tiled) { + if (INTEL_INFO(dev_priv-dev)-gen 4) { + /* Pre-965 needs power of two tile width */ + for (align = 512; align pitch; align = 1) + ; + } else { + align = 512; + } Gen2 tiles are 128 bytes wide, not 512 bytes. So maybe something like this: if (IS_GEN2()) { return roundup_power_of_two(max(pitch, 128)); else if (IS_GEN3()) return roundup_power_of_two(max(pitch, 512)); else return ALIGN(pitch, 512); + } else { + align = 64; Ah good old gen2. Pretty sure we've just treated it as 512 wide in the past? But I'm not going to go digging around the DDX to see... :) I was looking for that POT helper but didn't see it, that makes things a little nicer. Also I just noticed in the docs that gen2/3 only seem to require 32byte alignment for linear buffers. But relaxing that would require a change to intel_framebuffer_init() as well, so it looks like material for another patch. Also would need to be tested on real hardware. Yeah separate patch. It'll save us gobs of memory. :) -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp) +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv, + struct drm_display_mode *mode, int bpp) { - u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp); + u32 pitch = intel_framebuffer_pitch_for_width(dev_priv, + mode-hdisplay, bpp, + false); return ALIGN(pitch * mode-vdisplay, PAGE_SIZE); This should also align the fb height to tile height, otherwise intel_framebuffer_init() might reject it. Hm another existing bug. Separate patch. Thanks, -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4
On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote: Read out the current plane configuration at init time into a new plane_config structure. This allows us to track any existing framebuffers attached to the plane and potentially re-use them in our fbdev code for a smooth handoff. v2: update for new pitch_for_width function (Jesse) comment how get_plane_config works with shared fbs (Jesse) v3: s/ARGB/XRGB (Ville) use pipesrc width/height (Ville) fix fourcc comment (Bob) use drm_format_plane_cpp (Ville) v4: use fb for tracking fb data object (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/intel_display.c | 127 ++- drivers/gpu/drm/i915/intel_drv.h | 8 +++ 3 files changed, 136 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 14f250a..6569e93 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -365,6 +365,7 @@ struct drm_i915_error_state { struct intel_connector; struct intel_crtc_config; +struct intel_plane_config; struct intel_crtc; struct intel_limit; struct dpll; @@ -403,6 +404,8 @@ struct drm_i915_display_funcs { * fills out the pipe-config with the hw state. */ bool (*get_pipe_config)(struct intel_crtc *, struct intel_crtc_config *); + void (*get_plane_config)(struct intel_crtc *, + struct intel_plane_config *); int (*crtc_mode_set)(struct drm_crtc *crtc, int x, int y, struct drm_framebuffer *old_fb); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 321d751..089128b 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -2002,6 +2002,27 @@ unsigned long intel_gen4_compute_page_offset(int *x, int *y, } } +int intel_format_to_fourcc(int format) +{ + switch (format) { + case DISPPLANE_8BPP: + return DRM_FORMAT_C8; + case DISPPLANE_BGRX555: + return DRM_FORMAT_XRGB1555; + case DISPPLANE_BGRX565: + return DRM_FORMAT_RGB565; + default: + case DISPPLANE_BGRX888: + return DRM_FORMAT_XRGB; + case DISPPLANE_RGBX888: + return DRM_FORMAT_XBGR; + case DISPPLANE_BGRX101010: + return DRM_FORMAT_XRGB2101010; + case DISPPLANE_RGBX101010: + return DRM_FORMAT_XBGR2101010; + } +} + static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb, int x, int y) { @@ -5474,6 +5495,95 @@ intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int width, return ALIGN(pitch, align); } +static void i9xx_get_plane_config(struct intel_crtc *crtc, + struct intel_plane_config *plane_config) +{ + struct drm_device *dev = crtc-base.dev; + struct drm_i915_private *dev_priv = dev-dev_private; + struct drm_i915_gem_object *obj = NULL; + struct drm_mode_fb_cmd2 mode_cmd = { 0 }; + u32 val, base, offset; + int pipe = crtc-pipe, plane = crtc-plane; + int fourcc, pixel_format; + + plane_config-fb = kzalloc(sizeof(*plane_config-fb), GFP_KERNEL); + if (!plane_config-fb) { + DRM_DEBUG_KMS(failed to alloc fb\n); + return; + } + + val = I915_READ(DSPCNTR(plane)); + + if (INTEL_INFO(dev)-gen = 4) + if (val DISPPLANE_TILED) + plane_config-tiled = true; + + pixel_format = val DISPPLANE_PIXFORMAT_MASK; + fourcc = intel_format_to_fourcc(pixel_format); + plane_config-fb-base.pixel_format = fourcc; + plane_config-fb-base.bits_per_pixel = + drm_format_plane_cpp(fourcc, 0) * 8; + + if (INTEL_INFO(dev)-gen = 4) { + if (plane_config-tiled) + offset = I915_READ(DSPTILEOFF(plane)); + else + offset = I915_READ(DSPLINOFF(plane)); + base = I915_READ(DSPSURF(plane)) 0xf000; + } else { + base = I915_READ(DSPADDR(plane)); + } + + val = I915_READ(PIPESRC(pipe)); + plane_config-fb-base.width = ((val 16) 0xfff) + 1; + plane_config-fb-base.height = ((val 0) 0xfff) + 1; + + plane_config-fb-base.pitches[0] = + intel_framebuffer_pitch_for_width(dev_priv, + plane_config-fb-base.width, + plane_config-fb-base.bits_per_pixel, + plane_config-tiled); Shouldn't we read out the stride from the respective hw register, too? + +
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7
On Tue, Nov 26, 2013 at 02:09:48PM +, Chris Wilson wrote: On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote: Retrieve current framebuffer config info from the regs and create an fb object for the buffer the BIOS or boot loader left us. This should allow for smooth transitions to userspace apps once we finish the initial configuration construction. v2: check for non-native modes and adjust (Jesse) fixup aperture and cmap frees (Imre) use unlocked unref if init_bios fails (Jesse) fix curly brace around DSPADDR check (Imre) comment failure path for pin_and_fence (Imre) v3: fixup fixup of aperture frees (Chris) v4: update to current bits (locking pin_and_fence hack) (Jesse) v5: move fb config fetch to display code (Jesse) re-order hw state readout on initial load to suit fb inherit (Jesse) re-add pin_and_fence in fbdev code to make sure we refcount properly (Je v6: rename to plane_config (Daniel) check for valid object when initializing BIOS fb (Jesse) split from plane_config readout and other display changes (Jesse) drop use_bios_fb option (Chris) update comments (Jesse) rework fbdev_init_bios for clarity (Jesse) drop fb obj ref under lock (Chris) v7: use fb object from plane_config instead (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding the kzalloc(intel_fbdev) and the clarity of intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb lifetime of plane_config-fb is extremely ugly though (the theft could be made a little more obvious for instance) and still leaked upon failure? I think the lifetime stuff for the stolen fb is actually ok. But there's other stuff that will probably gives us some good fireworks: - intel_crtc-plane_config seems to be left hanging in the air. Imo duplicating the crtc-fb pointer isnt' really all that good. I'd much prefer when we just shovel the invented fb into the crtc-fb pointer. Of course that requires us to properly adjust the refcount. - fb-obj has a very high chance to cause utter havoc on multi-pipe configs, since the bios loves to set up shared framebuffers. I guess this is untested? For now it's probably simplest to just not bother with the 2nd/3rd pipe. - We need to clean up the fbs we've created somehow - intel_framebuffer_init will at least register the framebuffer with the drm core. But since it's a driver-private fb and since we hold a reference already it'll never disappear I think. - Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing
On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote: On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote: On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote: +static const struct drm_i915_cmd_descriptor* +find_cmd_in_table(const struct drm_i915_cmd_table *table, + u32 cmd_header) +{ + int i; + + for (i = 0; i table-count; i++) { + const struct drm_i915_cmd_descriptor *desc = table-table[i]; + u32 masked_cmd = desc-cmd.mask cmd_header; + u32 masked_value = desc-cmd.value desc-cmd.mask; + + if (masked_cmd == masked_value) + return desc; Maybe pre-sort the cmd table and use bsearch? And a runtime test on module load to check that the table is sorted correctly. So far it doesn't look like the search is a bottleneck, but I've tried to keep the tables sorted so that we could easily switch to bsearch if needed. Would you prefer to just use bsearch from the start? Yes. I think it will be easier (especially if the codes does the runtime check) to keep the table sorted as commands are incremently added in the future, rather than having to retrofit the bsearch when it becomes a significant problem. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2
On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote: And move it up in the file for earlier usage. v2: add pre-gen4 support as well (Chris) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_display.c | 53 ++-- 1 file changed, 38 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index e85d838..321d751 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc, pipe_config-port_clock = clock.dot / 5; } +static u32 +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int width, + int bpp, bool tiled) +{ + u32 pitch = DIV_ROUND_UP(width * bpp, 8); + int align; + + if (tiled) { + if (INTEL_INFO(dev_priv-dev)-gen 4) { + /* Pre-965 needs power of two tile width */ + for (align = 512; align pitch; align = 1) + ; + } else { + align = 512; + } Imo we should just read this out from the hardware registers, that avoids us to have another copypasted version of some tile size limit code hanging around. -Daniel + } else { + align = 64; + } + + return ALIGN(pitch, align); +} + static bool i9xx_get_pipe_config(struct intel_crtc *crtc, struct intel_crtc_config *pipe_config) { @@ -7652,16 +7674,12 @@ err: } static u32 -intel_framebuffer_pitch_for_width(int width, int bpp) -{ - u32 pitch = DIV_ROUND_UP(width * bpp, 8); - return ALIGN(pitch, 64); -} - -static u32 -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp) +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv, + struct drm_display_mode *mode, int bpp) { - u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp); + u32 pitch = intel_framebuffer_pitch_for_width(dev_priv, + mode-hdisplay, bpp, + false); return ALIGN(pitch * mode-vdisplay, PAGE_SIZE); } @@ -7670,18 +7688,21 @@ intel_framebuffer_create_for_mode(struct drm_device *dev, struct drm_display_mode *mode, int depth, int bpp) { + struct drm_i915_private *dev_priv = dev-dev_private; struct drm_i915_gem_object *obj; struct drm_mode_fb_cmd2 mode_cmd = { 0 }; + int size; - obj = i915_gem_alloc_object(dev, - intel_framebuffer_size_for_mode(mode, bpp)); + size = intel_framebuffer_size_for_mode(dev_priv, mode, bpp); + obj = i915_gem_alloc_object(dev, size); if (obj == NULL) return ERR_PTR(-ENOMEM); mode_cmd.width = mode-hdisplay; mode_cmd.height = mode-vdisplay; - mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(mode_cmd.width, - bpp); + mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(dev_priv, + mode_cmd.width, + bpp, false); mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth); return intel_framebuffer_create(dev, mode_cmd, obj); @@ -7704,8 +7725,10 @@ mode_fits_in_fbdev(struct drm_device *dev, return NULL; fb = dev_priv-fbdev-ifb.base; - if (fb-pitches[0] intel_framebuffer_pitch_for_width(mode-hdisplay, - fb-bits_per_pixel)) + if (fb-pitches[0] intel_framebuffer_pitch_for_width(dev_priv, +mode-hdisplay, + fb-bits_per_pixel, +false)) return NULL; if (obj-base.size mode-vdisplay * fb-pitches[0]) -- 1.8.4.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: take mode config lock around crtc disable at suspend
On Tue, Nov 26, 2013 at 09:13:41AM -0800, Jesse Barnes wrote: This is just a theoretical issue, but we need to do this to prevent the WARN in pipe_from_connector at suspend time. https://bugs.freedesktop.org/show_bug.cgi?id=71978 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Picked up for -fixes, thanks for the patch. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion
On Tue, Nov 26, 2013 at 08:51:37AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com The length mask is different for each ring and the size can vary, so we should duplicate the definition with the correct encoding for each ring. Jumping in here since this highlights the most confusing aspect of this series - the meta patching. Please implement the command parsing infrastructure upfront and in a very small number of patches (most preferably one) that avoids having to add fixes late in the series. I think using s/S/ALLOW/ s/R/REJECT/ s/B/BLACKLIST/ s/W/WHITELIST/ makes the action much more clear, and would rather that every unsafe command starts off as REJECT. (With the whitelist/blacklisting being added as separate patches with justification as they are now.) Since we do disable the security, I would also reject all unknown/unmatched commands and make ALLOW explicit. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4
On Tue, 26 Nov 2013 18:43:53 +0100 Daniel Vetter dan...@ffwll.ch wrote: + plane_config-fb-base.pitches[0] = + intel_framebuffer_pitch_for_width(dev_priv, + plane_config-fb-base.width, + plane_config-fb-base.bits_per_pixel, + plane_config-tiled); Shouldn't we read out the stride from the respective hw register, too? Hm that's quite an idea. + + plane_config-size = ALIGN(plane_config-fb-base.pitches[0] * + plane_config-fb-base.height, PAGE_SIZE); If you bother with tiling I think you also need to bother with correct size alignement. Yeah just fixed up the framebuffer size function per Ville's comments. I should be able to just use that. @@ -10562,6 +10672,7 @@ static void intel_init_display(struct drm_device *dev) dev_priv-display.update_plane = ironlake_update_plane; Sad Haswell is sad it seems. Yeah for two reasons: I'm not testing HSW at all, and I'm still fairly ignorant of HSW display. Well that and I don't want to pile on more work for this patchset which has already seen a bunch of churn... + if (dev_priv-display.get_plane_config) + dev_priv-display.get_plane_config(crtc, + crtc-plane_config); + } If we go with Chris' suggestion to disable the crtc if we can't get at a sane framebuffer for it then this would make much more sense in the hardware state readout code. Especially since always having framebuffers around would allow us to ditch a bit of special-casing code all over the place. I don't think so; reading out and allocating an fb on every plane config readout would be overkill. We'd need to poke around in the struct for cross checking, then free it somewhere. Plus it won't always be preallocated, and creating a duplicate fb when the object still exists would fail. This is actually another argument for simply duplicating a few fields from the fb struct into the plane config struct. That makes cross checking and readout simple, and allows us to allocate if possible in the BIOS functions. But damnit, then I'd have to drop back to an earlier version of the patch and lose some changes... ugg -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/6] Enabling DRRS support in the kernel
On Tue, Nov 19, 2013 at 11:36:58AM +0530, Vandana Kannan wrote: Dynamic Refresh Rate Switching (DRRS) is a power conservation feature which enables switching between low and high refresh rates based on the usage scenario. This feature is applicable for internal eDP panel. Indication that the panel can support DRRS is given by the panel EDID, which would list multiple refresh rates for one resolution. DRRS is of 2 types - Static DRRS involves execution of the entire mode set sequence to switch between refresh rate. seamless DRRS involves refresh rate switching during runtime without any blanking effect/mode set. The vendor fills in a VBT field indicating static/seamless DRRS based on the panel spec. This information is needed to enable seamless DRRS in kernel. The patch series supports idleness detection in display i915 driver and switch to low refresh rate. It also provides set_property API for user space to request for different refresh rates for active use cases like video playback at 48/50 Hz. Pradeep Bhat (5): drm/i915: Adding VBT fields to support eDP DRRS feature drm/i915: Parse EDID probed modes for DRRS support drm/i915: Add support for DRRS set property to switch RR drm/i915: Support to read DMRRS field from VBT structure drm/i915: Adding support for DMRRS for media playback Vandana Kannan (1): drm/i915: Idleness detection for DRRS My apologies for delaying my high-level maintainer review for so long - there's been a bit a firedrill around here so it took a while to write it all down. Overall a nice patch series, but I think we need to shuffle a few things around to better align with some of the longer-term driver architecture reworks and goals: - You add another copypaste of the code to deduce the reduced clock from the edid to intel_dp.c. Imo it's better to add that to intel_panel.c to struct intel_panel - we already track the panel fixed mode in there, so this would naturally fit. - Imo we should track the reduced clock in the pipe config and also cross-check it in the state checker. That will lead to a bit of fun on bdw, so we need to special case the checker there so that it only checks that the clock matches either of the possible clocks. For this we need a bool downclock_available in struct intel_crtc_config and a 2nd set of m_n values, both set in the compute_config function for DP outputs. For consistency it'd be nice to at least move the downclock_available logic for lvds also over to that. But since we first need to clean up the dpll handling to make lvds downclocking sane that's imo not really a priority at all. The entire point behind all this pipe state tracking is to make sure we don't miss anything when fastbooting. - The connector attribute is imo the wrong interface - userspace already supplies a mode attribute with dotclock to SetCrtc. Imo we simply need to fix up our fixed_mode logic (preferrably as a neat helper in intel_panel.c) to select the right one of the availabel downclocks, even when upscaling. The downside for now is that this will result in flickering. But we need a real flicker-free fast-update-path in our modeset code anyway to make fastboot happen for real. And a few other cool things. I'll increase the pain a bit in the hope that this moves forward again, so no quick hack please (even if it's very simple to do in the case of drrs). - Finally a quick testcase which iterates through all the downclock modes in kms_flip - together with the cross-checking and all the vblank timing tests we already have that should give us solid test coverage. To keep test runtimes reasonable I think just a pageflipping test with time checking is more than enough. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7
On Tue, 26 Nov 2013 18:54:04 +0100 Daniel Vetter dan...@ffwll.ch wrote: On Tue, Nov 26, 2013 at 02:09:48PM +, Chris Wilson wrote: On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote: Retrieve current framebuffer config info from the regs and create an fb object for the buffer the BIOS or boot loader left us. This should allow for smooth transitions to userspace apps once we finish the initial configuration construction. v2: check for non-native modes and adjust (Jesse) fixup aperture and cmap frees (Imre) use unlocked unref if init_bios fails (Jesse) fix curly brace around DSPADDR check (Imre) comment failure path for pin_and_fence (Imre) v3: fixup fixup of aperture frees (Chris) v4: update to current bits (locking pin_and_fence hack) (Jesse) v5: move fb config fetch to display code (Jesse) re-order hw state readout on initial load to suit fb inherit (Jesse) re-add pin_and_fence in fbdev code to make sure we refcount properly (Je v6: rename to plane_config (Daniel) check for valid object when initializing BIOS fb (Jesse) split from plane_config readout and other display changes (Jesse) drop use_bios_fb option (Chris) update comments (Jesse) rework fbdev_init_bios for clarity (Jesse) drop fb obj ref under lock (Chris) v7: use fb object from plane_config instead (Ville) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding the kzalloc(intel_fbdev) and the clarity of intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb lifetime of plane_config-fb is extremely ugly though (the theft could be made a little more obvious for instance) and still leaked upon failure? I think the lifetime stuff for the stolen fb is actually ok. But there's other stuff that will probably gives us some good fireworks: - intel_crtc-plane_config seems to be left hanging in the air. Imo duplicating the crtc-fb pointer isnt' really all that good. I'd much prefer when we just shovel the invented fb into the crtc-fb pointer. Of course that requires us to properly adjust the refcount. plane_config is just part of intel_crtc. No hanging, the struct is just bigger. Saves me from dealing with alloc/free of it. - fb-obj has a very high chance to cause utter havoc on multi-pipe configs, since the bios loves to set up shared framebuffers. I guess this is untested? For now it's probably simplest to just not bother with the 2nd/3rd pipe. There should be no havoc. The first allocation will succeed, and following ones will fail since they overlap. The BIOS takeover code will then use the one that was allocated (or in the unlikely case of multiple fbs, pick the biggest one). Or did you see some other fail there? - We need to clean up the fbs we've created somehow - intel_framebuffer_init will at least register the framebuffer with the drm core. But since it's a driver-private fb and since we hold a reference already it'll never disappear I think. I don't think so... it should look just like a user allocated buffer from the drm core POV. But generally our fbdev allocations do tend to live forever, so in that sense you're right, but it's not different than what we have today. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 0/8] support for multiple power wells
2013/11/25 Imre Deak imre.d...@intel.com: v2: rebase on latest kernel + debug info on domain use counts v3: address review comments from Paulo All my comments got implemented or got some nice explanations, so, for the series: Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com Imre Deak (7): drm/i915: add audio power domain drm/i915: support for multiple power wells drm/i915: add always-on power wells instead of special casing them drm/i915: use IS_HASWELL/BROADWELL instead of HAS_POWER_WELL drm/i915: don't do BDW/HSW specific powerdomains init on other platforms drm/i915: add a default always-on power well drm/i915: add a debugfs entry for power domain info Jesse Barnes (1): drm/i915: protect HSW power well check with IS_HASWELL in redisable_vga drivers/gpu/drm/i915/i915_debugfs.c | 71 + drivers/gpu/drm/i915/i915_dma.c | 18 ++-- drivers/gpu/drm/i915/i915_drv.h | 18 +++- drivers/gpu/drm/i915/intel_display.c | 6 +- drivers/gpu/drm/i915/intel_pm.c | 198 +++ 5 files changed, 227 insertions(+), 84 deletions(-) -- 1.8.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing
On Tue, Nov 26, 2013 at 09:56:09AM -0800, Chris Wilson wrote: On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote: On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote: On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote: +static const struct drm_i915_cmd_descriptor* +find_cmd_in_table(const struct drm_i915_cmd_table *table, + u32 cmd_header) +{ + int i; + + for (i = 0; i table-count; i++) { + const struct drm_i915_cmd_descriptor *desc = table-table[i]; + u32 masked_cmd = desc-cmd.mask cmd_header; + u32 masked_value = desc-cmd.value desc-cmd.mask; + + if (masked_cmd == masked_value) + return desc; Maybe pre-sort the cmd table and use bsearch? And a runtime test on module load to check that the table is sorted correctly. So far it doesn't look like the search is a bottleneck, but I've tried to keep the tables sorted so that we could easily switch to bsearch if needed. Would you prefer to just use bsearch from the start? Yes. I think it will be easier (especially if the codes does the runtime check) to keep the table sorted as commands are incremently added in the future, rather than having to retrofit the bsearch when it becomes a significant problem. Ok, makes sense. I'll assume the same comment applies to the register whitelists and make similar changes there. Brad -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion
On Tue, Nov 26, 2013 at 10:08:48AM -0800, Chris Wilson wrote: On Tue, Nov 26, 2013 at 08:51:37AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com The length mask is different for each ring and the size can vary, so we should duplicate the definition with the correct encoding for each ring. Jumping in here since this highlights the most confusing aspect of this series - the meta patching. Please implement the command parsing infrastructure upfront and in a very small number of patches (most preferably one) that avoids having to add fixes late in the series. Sure. As this is my first contribution, I'm still working on how to best split up a series, so I'm happy to clean up that aspect. It sounds like the series should look more like: - All parser infrastructure and implementation (basically squash current 1-9, plus the bsearch changes, plus REJECT changes) - N patches to set commands for register whitelist and bitmask checking - Enable the parser Correct? I think using s/S/ALLOW/ s/R/REJECT/ s/B/BLACKLIST/ s/W/WHITELIST/ makes the action much more clear, and would rather that every unsafe command starts off as REJECT. (With the whitelist/blacklisting being added as separate patches with justification as they are now.) I had split out the REJECTs to make the justification easier to find in the commit message, but I can reject them from the start too. For 'B', the term 'blacklist' to me implies a set of things that are unconditionally not allowed, whereas the 'B' commands are conditionally allowed based on the bitmask checks. Are you just asking for a readability change in expanding the action names used in the table, or are you looking for any semantic changes to what the parser checks? Or am I overthinking this comment? Brad Since we do disable the security, I would also reject all unknown/unmatched commands and make ALLOW explicit. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote: This is one of the nice pieces that I've never ported from the old make based test runner. Note that we only use the result of the check when actually running the testcases so that enumerating tests still works as non-root on arbitrary machines. So does this bail on the first test, or does this run for all tests? BTW, Damien said he had also implemented this. Cc: Ben Widawsky b...@bwidawsk.net Requested-by: Ben Widawsky b...@bwidawsk.net Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- tests/igt.tests | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/tests/igt.tests b/tests/igt.tests index f3884925deaa..8d02c1a60255 100644 --- a/tests/igt.tests +++ b/tests/igt.tests @@ -28,7 +28,7 @@ import sys import subprocess from os import path -from framework.core import testBinDir, TestProfile +from framework.core import testBinDir, TestProfile, TestResult from framework.exectest import ExecTest # @@ -39,6 +39,24 @@ from framework.exectest import ExecTest # automatically add all tests into the 'igt' category. # +def checkEnvironment(): +debugfs_path = /sys/kernel/debug/dri +if os.getuid() != 0: +print Test Environment check: not root! +return False +if not os.path.isdir(debugfs_path): +print Test Environment check: debugfs not mounted properly! +return False +for subdir in os.listdir(debugfs_path): +clients = open(os.path.join(debugfs_path, subdir, clients), 'r') +lines = clients.readlines() +if len(lines) 2: +print Test Environment check: other drm clients running! +return False + +print Test Environment check: Succeeded. +return True + if not os.path.exists(os.path.join(testBinDir, 'igt')): print igt symlink not found! sys.exit(0) @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')): # Chase the piglit/bin/igt symlink to find where the tests really live. igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests') +igtEnvironmentOk = checkEnvironment() + profile = TestProfile() class IGTTest(ExecTest): @@ -54,6 +74,9 @@ class IGTTest(ExecTest): self.timeout = 60*20 # 20 minutes deadline by default def interpretResult(self, out, returncode, results, dmesg): +if not igtEnvironmentOk: +return out + if returncode == 0: results['result'] = 'dmesg-warn' if dmesg != '' else 'pass' elif returncode == 77: @@ -63,6 +86,13 @@ class IGTTest(ExecTest): return out def run(self, env): env.dmesg = True +if not igtEnvironmentOk: +results = TestResult() +results['result'] = 'skip' +results['info'] = unicode(Test Environment isn't OK) + +return results + return ExecTest.run(self, env) def listTests(listname): -- 1.8.1.4 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 0/8] support for multiple power wells
On Tue, Nov 26, 2013 at 04:46:54PM -0200, Paulo Zanoni wrote: 2013/11/25 Imre Deak imre.d...@intel.com: v2: rebase on latest kernel + debug info on domain use counts v3: address review comments from Paulo All my comments got implemented or got some nice explanations, so, for the series: Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com Slurped them all in, thanks for patchesreview. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: drop DRM_ERROR in intel_fbdev init
This should just be a debug. Add another debug msg to the inherit path while we're at it. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/intel_fbdev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c index fdb6dc9..284c3eb 100644 --- a/drivers/gpu/drm/i915/intel_fbdev.c +++ b/drivers/gpu/drm/i915/intel_fbdev.c @@ -127,11 +127,12 @@ static int intelfb_create(struct drm_fb_helper *helper, mutex_lock(dev-struct_mutex); if (!intel_fb-obj) { - DRM_ERROR(no BIOS fb, allocating a new one\n); + DRM_DEBUG_KMS(no BIOS fb, allocating a new one\n); ret = intelfb_alloc(helper, sizes); if (ret) goto out_unlock; } else { + DRM_DEBUG_KMS(re-using BIOS fb\n); sizes-fb_width = intel_fb-base.width; sizes-fb_height = intel_fb-base.height; } -- 1.8.4.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. But we'll certainly want to run relevant benchmarks beyond what I've done. Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes through the checker and bypassing it (with EXEC_SECURE) would be really good. Maybe even some variable-sized commands (all the state setup stuff should be useful for that) to keep things interesting. Some variation is also important to have some good cache thrasing going on (since your check tables are fairly large I think). At this point there are a couple of known issues and potential improvements. 1) VLV. The parser is currently disabled for VLV. One type of check performed by the parser is that commands which access memory do so via PPGTT. VLV does not have PPGTT enabled at this time. I chose to implement the PPGTT checks via generic bit checking infrastructure in the parser, so they are not easily disabled for VLV. For now, I'm disabling parsing altogether in the hope that PPGTT can be enabled for VLV in the near future. We need ppgtt for the parser anyway, since otherwise userspace can submit a self-modifying batch. Checking for that is impossible, so allowing sw checked batches without the ppgtt/ggtt split would be a decent security hole. 2) Coherency. I've found two types of coherency issues when reading the
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 11:05:57AM -0800, Ben Widawsky wrote: On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote: This is one of the nice pieces that I've never ported from the old make based test runner. Note that we only use the result of the check when actually running the testcases so that enumerating tests still works as non-root on arbitrary machines. So does this bail on the first test, or does this run for all tests? It bails each test individually. Since otherwise enumerating them wouldn't work any more. I've hoped the commit message was clear about it. BTW, Damien said he had also implemented this. Only in the make target he created, not in piglit itself. Imo we should have all the testrunner logic in one place, i.e. in the piglit sources. -Daniel Cc: Ben Widawsky b...@bwidawsk.net Requested-by: Ben Widawsky b...@bwidawsk.net Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- tests/igt.tests | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/tests/igt.tests b/tests/igt.tests index f3884925deaa..8d02c1a60255 100644 --- a/tests/igt.tests +++ b/tests/igt.tests @@ -28,7 +28,7 @@ import sys import subprocess from os import path -from framework.core import testBinDir, TestProfile +from framework.core import testBinDir, TestProfile, TestResult from framework.exectest import ExecTest # @@ -39,6 +39,24 @@ from framework.exectest import ExecTest # automatically add all tests into the 'igt' category. # +def checkEnvironment(): +debugfs_path = /sys/kernel/debug/dri +if os.getuid() != 0: +print Test Environment check: not root! +return False +if not os.path.isdir(debugfs_path): +print Test Environment check: debugfs not mounted properly! +return False +for subdir in os.listdir(debugfs_path): +clients = open(os.path.join(debugfs_path, subdir, clients), 'r') +lines = clients.readlines() +if len(lines) 2: +print Test Environment check: other drm clients running! +return False + +print Test Environment check: Succeeded. +return True + if not os.path.exists(os.path.join(testBinDir, 'igt')): print igt symlink not found! sys.exit(0) @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')): # Chase the piglit/bin/igt symlink to find where the tests really live. igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests') +igtEnvironmentOk = checkEnvironment() + profile = TestProfile() class IGTTest(ExecTest): @@ -54,6 +74,9 @@ class IGTTest(ExecTest): self.timeout = 60*20 # 20 minutes deadline by default def interpretResult(self, out, returncode, results, dmesg): +if not igtEnvironmentOk: +return out + if returncode == 0: results['result'] = 'dmesg-warn' if dmesg != '' else 'pass' elif returncode == 77: @@ -63,6 +86,13 @@ class IGTTest(ExecTest): return out def run(self, env): env.dmesg = True +if not igtEnvironmentOk: +results = TestResult() +results['result'] = 'skip' +results['info'] = unicode(Test Environment isn't OK) + +return results + return ExecTest.run(self, env) def listTests(listname): -- 1.8.1.4 -- Ben Widawsky, Intel Open Source Technology Center -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 08:46:19PM +0100, Daniel Vetter wrote: On Tue, Nov 26, 2013 at 11:05:57AM -0800, Ben Widawsky wrote: On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote: This is one of the nice pieces that I've never ported from the old make based test runner. Note that we only use the result of the check when actually running the testcases so that enumerating tests still works as non-root on arbitrary machines. So does this bail on the first test, or does this run for all tests? It bails each test individually. Since otherwise enumerating them wouldn't work any more. I've hoped the commit message was clear about it. It wasn't clear to me, though it seemed to be the case from the diff. I'd prefer an early bail, and just check once, and ideally, not generating a JSON file at all. BTW, Damien said he had also implemented this. Only in the make target he created, not in piglit itself. Imo we should have all the testrunner logic in one place, i.e. in the piglit sources. -Daniel Damien, can you comment? I could have sworn you said something different on IRC. It sounded like exactly what I wanted. Cc: Ben Widawsky b...@bwidawsk.net Requested-by: Ben Widawsky b...@bwidawsk.net Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- tests/igt.tests | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/tests/igt.tests b/tests/igt.tests index f3884925deaa..8d02c1a60255 100644 --- a/tests/igt.tests +++ b/tests/igt.tests @@ -28,7 +28,7 @@ import sys import subprocess from os import path -from framework.core import testBinDir, TestProfile +from framework.core import testBinDir, TestProfile, TestResult from framework.exectest import ExecTest # @@ -39,6 +39,24 @@ from framework.exectest import ExecTest # automatically add all tests into the 'igt' category. # +def checkEnvironment(): +debugfs_path = /sys/kernel/debug/dri +if os.getuid() != 0: +print Test Environment check: not root! +return False +if not os.path.isdir(debugfs_path): +print Test Environment check: debugfs not mounted properly! +return False +for subdir in os.listdir(debugfs_path): +clients = open(os.path.join(debugfs_path, subdir, clients), 'r') +lines = clients.readlines() +if len(lines) 2: +print Test Environment check: other drm clients running! +return False + +print Test Environment check: Succeeded. +return True + if not os.path.exists(os.path.join(testBinDir, 'igt')): print igt symlink not found! sys.exit(0) @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')): # Chase the piglit/bin/igt symlink to find where the tests really live. igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests') +igtEnvironmentOk = checkEnvironment() + profile = TestProfile() class IGTTest(ExecTest): @@ -54,6 +74,9 @@ class IGTTest(ExecTest): self.timeout = 60*20 # 20 minutes deadline by default def interpretResult(self, out, returncode, results, dmesg): +if not igtEnvironmentOk: +return out + if returncode == 0: results['result'] = 'dmesg-warn' if dmesg != '' else 'pass' elif returncode == 77: @@ -63,6 +86,13 @@ class IGTTest(ExecTest): return out def run(self, env): env.dmesg = True +if not igtEnvironmentOk: +results = TestResult() +results['result'] = 'skip' +results['info'] = unicode(Test Environment isn't OK) + +return results + return ExecTest.run(self, env) def listTests(listname): -- 1.8.1.4 -- Ben Widawsky, Intel Open Source Technology Center -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote: Only in the make target he created, not in piglit itself. Imo we should have all the testrunner logic in one place, i.e. in the piglit sources. -Daniel Damien, can you comment? I could have sworn you said something different on IRC. It sounded like exactly what I wanted. What Daniel says is correct, the check is part of the runner wrapper, not piglit itself. I'd rather have a environement check up-front and I don't mind where it lives (igt Vs piglit). -- Damien ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 8:53 PM, Damien Lespiau damien.lesp...@intel.com wrote: On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote: Only in the make target he created, not in piglit itself. Imo we should have all the testrunner logic in one place, i.e. in the piglit sources. -Daniel Damien, can you comment? I could have sworn you said something different on IRC. It sounded like exactly what I wanted. What Daniel says is correct, the check is part of the runner wrapper, not piglit itself. I'd rather have a environement check up-front and I don't mind where it lives (igt Vs piglit). The problem is that generating the testlist (or printing the commands) is a feature QA actually relies on. I also use it occasionally to quickly test igt library changes. So we can't bail that early. My patch bails fairly late, but I didn't see a better spot. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/4] xf86-video-intel DRI3 and Present patch series
Chris Wilson ch...@chris-wilson.co.uk writes: [PATCH 1/4] Support 64-bit MSC values. Handle kernel vageries about Some spurious assignments that appear to intentially drop the error code could be clarified, I can't find any dropped error codes in this patch to add comments to, please provide patch excerpts for this review. and intel_crtc_msc_to_sequence() is always called with a derived current_msc already to hand. The latter present path obfuscates its derived current_msc. Present always computes absolute MSC values and provides those to the driver, instead of expecting every driver to duplicate that logic. [PATCH 2/4] Restructure DRM event handling. This won't compile against older Xorg due to xorg_list in the common code. Can switch to intel_list, but that would need list_for_each_entry_safe added. How many versions back is this supposed to compile against? [PATCH 3/4] Add DRI3 and miSyncShm support O_CLOEXEC needs protecting, also would appear to be candidate for a render-node. Yes, obviously this wants to use render-node. I haven't had complaints about O_CLOEXEC from BSD or Solaris developers for libxshmfence; what systems do not have support for this? The imported and exported DRI3 pixmaps need to be pinned to prevent the driver using BO exchanges on that pixmap. I don't understand this comment. DRI3 doesn't respect the xorg.conf Option for disabling. Ok, it should check intel-directRenderingType == DRI_DISABLED. A fence is only tied to a screen and no XID or Client in particular? DRI3 fences are screen-specific (otherwise you'd have no way of hooking the fence to a specific driver). So it is a global operation akin to intel_flush_callback() which would be called before the Sync reply was sent. Yes, the hardware queue is to be flushed before the Sync event is sent (and before the xshmfence object is triggered, of course). Note that this is just the mi version of sync fences, which use libxshmfence; the driver is free to use different code there. If we find that the code for handling these xshmfence objects is common across drivers, we can move that into the X server to share. [PATCH 4/4] Add Present extension support Yikes. The patch is itself fairly innoculous, but only because the Present extension in the server appears to be repeating the worst of DRI2, including its original bugs. Please provide more specific comments here. The fallback/non-fullscreen case is not synchronised to screen refresh (if the Client so desired), and should be passed through to the driver. The fallback case is synchronized as the Present code triggers the CopyArea call from the vblank hook. In practice, this has proven sufficient to get images onto the screen without tearing and without requiring a huge amount of driver and kernel infrastructure. The whole driver interface seems to be too low a level, baking in many assumptions, rather the usual approach of providing a set of mi routines that the driver can plug into or not as the case may be. Patches to the X server to change the API for better hardware support are welcome, of course. That the WindowPixmap no longer points to the actual bo leads to a few problems, such as the CRTC misconfiguration and GetImage being broken after a PresentFlip. A patch for the X server to fix that has been posted. After a vblank_event, Present must check that the flip is still valid before execution. The flip proc may return FALSE to indicate failure of any kind. Present will then fall-back to a simple blt. In the backend it is not clear whether the RRCrtc should be the primary CRTC or the only CRTC to flip. There is only one screen pixmap, so of course every CRTC must flip together. The CRTC provided indicates which one the MSC is from. Damage is processed after the fallback but not the Flip path, the lack of Damage notification would upset Prime amongst others. Sounds easy to fix in the X server. Thanks for your review! -- keith.pack...@intel.com pgpVceKq_Q16e.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 08:55:40PM +0100, Daniel Vetter wrote: On Tue, Nov 26, 2013 at 8:53 PM, Damien Lespiau damien.lesp...@intel.com wrote: On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote: Only in the make target he created, not in piglit itself. Imo we should have all the testrunner logic in one place, i.e. in the piglit sources. -Daniel Damien, can you comment? I could have sworn you said something different on IRC. It sounded like exactly what I wanted. What Daniel says is correct, the check is part of the runner wrapper, not piglit itself. I'd rather have a environement check up-front and I don't mind where it lives (igt Vs piglit). The problem is that generating the testlist (or printing the commands) is a feature QA actually relies on. I also use it occasionally to quickly test igt library changes. So we can't bail that early. My patch bails fairly late, but I didn't see a better spot. -Daniel I'm confused then about how this really improves my current situation. -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote: Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. Agreed. I thought we could handle this as a follow-up task once the basic stuff is in place, particularly given that we'd want to modify at least some users to test. I also wasn't sure if we would want the check to be root master, as in the current secure flag, or just master. W.r.t. the tests, I suppose we can just turn checking on for secure batches and see what happens. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. Good suggestions. I'll look into these. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. But we'll certainly want to run relevant benchmarks beyond what I've done. Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes through the checker and bypassing it (with EXEC_SECURE) would be really good. Maybe even some variable-sized commands (all the state setup stuff should be useful for that) to keep things interesting. Some variation is also important to have some good cache thrasing going on (since your check tables are fairly large I think). Ok. I'd be interested in some comment from the mesa and libva guys here for real world workloads, but a microbenchmark would be a good start. Which state setup stuff are you referring to? Something specific in i-g-t or something more general? At this point
[Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
This is one of the nice pieces that I've never ported from the old make based test runner. Note that we only use the result of the check when actually running the testcases so that enumerating tests still works as non-root on arbitrary machines. v2: Fail the tests harder. Cc: Ben Widawsky b...@bwidawsk.net Requested-by: Ben Widawsky b...@bwidawsk.net Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- tests/igt.tests | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/tests/igt.tests b/tests/igt.tests index f3884925deaa..df747e3fac78 100644 --- a/tests/igt.tests +++ b/tests/igt.tests @@ -28,7 +28,7 @@ import sys import subprocess from os import path -from framework.core import testBinDir, TestProfile +from framework.core import testBinDir, TestProfile, TestResult from framework.exectest import ExecTest # @@ -39,6 +39,24 @@ from framework.exectest import ExecTest # automatically add all tests into the 'igt' category. # +def checkEnvironment(): +debugfs_path = /sys/kernel/debug/dri +if os.getuid() != 0: +print Test Environment check: not root! +return False +if not os.path.isdir(debugfs_path): +print Test Environment check: debugfs not mounted properly! +return False +for subdir in os.listdir(debugfs_path): +clients = open(os.path.join(debugfs_path, subdir, clients), 'r') +lines = clients.readlines() +if len(lines) 2: +print Test Environment check: other drm clients running! +return False + +print Test Environment check: Succeeded. +return True + if not os.path.exists(os.path.join(testBinDir, 'igt')): print igt symlink not found! sys.exit(0) @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')): # Chase the piglit/bin/igt symlink to find where the tests really live. igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests') +igtEnvironmentOk = checkEnvironment() + profile = TestProfile() class IGTTest(ExecTest): @@ -54,6 +74,9 @@ class IGTTest(ExecTest): self.timeout = 60*20 # 20 minutes deadline by default def interpretResult(self, out, returncode, results, dmesg): +if not igtEnvironmentOk: +return out + if returncode == 0: results['result'] = 'dmesg-warn' if dmesg != '' else 'pass' elif returncode == 77: @@ -63,6 +86,13 @@ class IGTTest(ExecTest): return out def run(self, env): env.dmesg = True +if not igtEnvironmentOk: +results = TestResult() +results['result'] = 'fail' +results['info'] = unicode(Test Environment isn't OK) + +return results + return ExecTest.run(self, env) def listTests(listname): -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks
On Tue, Nov 26, 2013 at 9:02 PM, Ben Widawsky b...@bwidawsk.net wrote: The problem is that generating the testlist (or printing the commands) is a feature QA actually relies on. I also use it occasionally to quickly test igt library changes. So we can't bail that early. My patch bails fairly late, but I didn't see a better spot. -Daniel I'm confused then about how this really improves my current situation. Quick recap of our irc discussion: I guess I'm trying to solve a slightly different problem, assuming that we always want some kind of test result. Hence everything fails (as of v2) and the result json contains the reason. I think that's the right thing to do for regression testing in a lab-like setting. For developers I guess a little script or so some option to yell louder and faster than this here could still be useful. Otoh there'll always be test-specific failure modes (like runtime pm not configured properly) and imo it doesn't make much sense to filter for all of them beforehand. And otherwise I kinda expect that people will cook their own scripts anyway to kill X, Wayland and a bunch of deamons that get in the way and then run piglit with a few default options. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric
On Tue, 26 Nov 2013 15:10:22 -0800 Jesse Barnes jbar...@virtuousgeek.org wrote: On Mon, 11 Nov 2013 10:23:24 +0100 Daniel Vetter dan...@ffwll.ch wrote: On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote: On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote: vlv_dpio_read/write should be describe more in PHY centric instead of display controller centric. Create a enum dpio_channel for channel index and enum dpio_phy for PHY index. This should better to gather for upcoming platform. v2: Rebase the code based on drm/i915/vlv: Fix typo in the DPIO register define. v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro DPIO_PHY, and remove unrelated change. (Ville) Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chon Ming Lee chon.ming@intel.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com Queued for -next, thanks for the patch. Looks like this one gives me bogus DPIO values at least some of the time. Reverting to using 0x12 as the port ID seems to get me valid values back... Ah looks like the init_dpio happens too late for the mode state readout. I'll post a patch to move it up. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric
On Mon, 11 Nov 2013 10:23:24 +0100 Daniel Vetter dan...@ffwll.ch wrote: On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote: On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote: vlv_dpio_read/write should be describe more in PHY centric instead of display controller centric. Create a enum dpio_channel for channel index and enum dpio_phy for PHY index. This should better to gather for upcoming platform. v2: Rebase the code based on drm/i915/vlv: Fix typo in the DPIO register define. v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro DPIO_PHY, and remove unrelated change. (Ville) Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chon Ming Lee chon.ming@intel.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com Queued for -next, thanks for the patch. Looks like this one gives me bogus DPIO values at least some of the time. Reverting to using 0x12 as the port ID seems to get me valid values back... -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fb2d548..a60894d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) BUG_ON(obj-base.read_domains I915_GEM_GPU_DOMAINS); BUG_ON(obj-base.write_domain I915_GEM_GPU_DOMAINS); - st = kmalloc(sizeof(*st), GFP_KERNEL); + st = kmalloc(sizeof(*st), GFP_NOWAIT); if (st == NULL) return -ENOMEM; page_count = obj-base.size / PAGE_SIZE; - if (sg_alloc_table(st, page_count, GFP_KERNEL)) { + if (sg_alloc_table(st, page_count, GFP_NOWAIT)) { kfree(st); return -ENOMEM; } -- 1.8.4.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote: If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) It appears that by itself, this patch is insufficient to prevent the failure when run in piglit. Before I go on a wild goose chase of finding all allocations, maybe I'll give people a chance to chime in. The symptom is the same always, OOM, procs can't die because struct_mutex is held. Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fb2d548..a60894d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) BUG_ON(obj-base.read_domains I915_GEM_GPU_DOMAINS); BUG_ON(obj-base.write_domain I915_GEM_GPU_DOMAINS); - st = kmalloc(sizeof(*st), GFP_KERNEL); + st = kmalloc(sizeof(*st), GFP_NOWAIT); if (st == NULL) return -ENOMEM; page_count = obj-base.size / PAGE_SIZE; - if (sg_alloc_table(st, page_count, GFP_KERNEL)) { + if (sg_alloc_table(st, page_count, GFP_NOWAIT)) { kfree(st); return -ENOMEM; } -- 1.8.4.2 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
On Tue, Nov 26, 2013 at 05:10:38PM -0800, Ben Widawsky wrote: On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote: If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) It appears that by itself, this patch is insufficient to prevent the failure when run in piglit. Before I go on a wild goose chase of finding all allocations, maybe I'll give people a chance to chime in. The symptom is the same always, OOM, procs can't die because struct_mutex is held. too late on the goose chase. vma allocation was the missing bit. Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fb2d548..a60894d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) BUG_ON(obj-base.read_domains I915_GEM_GPU_DOMAINS); BUG_ON(obj-base.write_domain I915_GEM_GPU_DOMAINS); - st = kmalloc(sizeof(*st), GFP_KERNEL); + st = kmalloc(sizeof(*st), GFP_NOWAIT); if (st == NULL) return -ENOMEM; page_count = obj-base.size / PAGE_SIZE; - if (sg_alloc_table(st, page_count, GFP_KERNEL)) { + if (sg_alloc_table(st, page_count, GFP_NOWAIT)) { kfree(st); return -ENOMEM; } -- 1.8.4.2 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Tue, 2013-11-26 at 20:35 +0100, Daniel Vetter wrote: Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). Besides the vcs ring, we also use batchbuffer chaining on the render ring for video post processing, video motion estimation and motion compensation(on ILK), A fixed length batch buffer isn't suitable for those operations as those operations are based on a macroblock instead of a frame. It would be better to make sure batchbuffer chaining works on the render ring too. - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. But we'll certainly want to run relevant benchmarks beyond what I've done. Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes through the checker and bypassing it (with EXEC_SECURE) would be really good. Maybe even some variable-sized commands (all the state setup stuff should be useful for that) to keep things interesting. Some variation is also important to have some good cache thrasing going on (since your check tables are fairly large I think). At this point there are a couple of known issues and potential improvements. 1) VLV. The parser is currently disabled for VLV. One type of check performed by the parser is that commands which access memory do so via PPGTT. VLV does not have PPGTT enabled at this time. I chose to implement the
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Tue, 2013-11-26 at 13:24 -0700, Volkin, Bradley D wrote: On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote: Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. Agreed. I thought we could handle this as a follow-up task once the basic stuff is in place, particularly given that we'd want to modify at least some users to test. I also wasn't sure if we would want the check to be root master, as in the current secure flag, or just master. W.r.t. the tests, I suppose we can just turn checking on for secure batches and see what happens. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. Good suggestions. I'll look into these. Hi, Brad More inputs from libva about the batchbuffer chaining. Now the batchbuffer chaining is widely used in libva driver. This is related with how the libva driver processes the image. For the encoding purpose, it needs to be handled based on macroblock(16x16).And every macroblock needs a group of GPU commands. So the GPU commands for all the macroblocks will be constructed in the second-level batchbuffer. The mode of batchbuffer chaining will bring the following benefits: a. The size of second-level batch buffer can be allocated based on the size of handled image. For example: 1080p/720p/480p can use the different size. b. The gpu commands in second-level batchbuffer can be constructed by using GPU instead of CPU, which is helpful to improve the performance. At the same time both VCS and Render Ring are used in libva driver. For example: The encoding will use VCS and RCS ring. Firstly the RCS ring is used to execute GPU command for the motion vector/mode prediction. And then the VCS Ring is used to execute the GPU command for generating the bit-stream. So not only VCS ring uses the mode of batchbuffer chaining, but also the Render
Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote: If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net I'd still like to know if I am crazy, but I'm now trying to defer the stuff we do on file close without using any allocs. Just an update... -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
On Wed, Nov 27, 2013 at 5:23 AM, Ben Widawsky b...@bwidawsk.net wrote: On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote: If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net I'd still like to know if I am crazy, but I'm now trying to defer the stuff we do on file close without using any allocs. Just an update... Sound's intrigueing, but tbh I don't really have clue about things. What about adding the relevant stuck task backtraces to the patch and submitting this to a wider audience (lkml, mm-devel) as an akpm-probe? The more botched the patch, the better the probe usually. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path
On Tue, Nov 26, 2013 at 08:23:46PM -0800, Ben Widawsky wrote: On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote: If we end up calling the shrinker, which in turn requires the OOM killer, we may end up infinitely waiting for a process to die if the OOM chooses. The case that this prevents occurs in execbuf. The forked variants of gem_evict_everything is a good way to hit it. This is exacerbated by Daniel's recent patch to give OOM precedence to the GEM tests. It's a twisted form of a deadlock. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Ben Widawsky b...@bwidawsk.net I'd still like to know if I am crazy, but I'm now trying to defer the stuff we do on file close without using any allocs. Just an update... workqueue still has similar problems. It could be because deferring the context cleanup means we don't actually free much space, and so the OOM isn't enough, or [more likely] something else is going on. Maybe it's my bug. I am really out of ideas at the moment. The system just becomes unresponsive after all threads end up blocked waiting for struct mutex. I know I'd seen such problems in the past with gem_evict_everything, but this time around I seem to be the sole cause (and not all my machines hit it). Sorry for the noise - just totally burning out on this one. -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric
On Wed, Nov 27, 2013 at 12:18 AM, Jesse Barnes jbar...@virtuousgeek.org wrote: On Tue, 26 Nov 2013 15:10:22 -0800 Jesse Barnes jbar...@virtuousgeek.org wrote: On Mon, 11 Nov 2013 10:23:24 +0100 Daniel Vetter dan...@ffwll.ch wrote: On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote: On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote: vlv_dpio_read/write should be describe more in PHY centric instead of display controller centric. Create a enum dpio_channel for channel index and enum dpio_phy for PHY index. This should better to gather for upcoming platform. v2: Rebase the code based on drm/i915/vlv: Fix typo in the DPIO register define. v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro DPIO_PHY, and remove unrelated change. (Ville) Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com Signed-off-by: Chon Ming Lee chon.ming@intel.com Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com Queued for -next, thanks for the patch. Looks like this one gives me bogus DPIO values at least some of the time. Reverting to using 0x12 as the port ID seems to get me valid values back... Ah looks like the init_dpio happens too late for the mode state readout. I'll post a patch to move it up. Isn't that just because your fb reconstruction patches moves it up by a lot? If so can you please extract just that from your patches? I was wondering whether we should do that due to the usual init ordering fun anyway. I'd prefer to let that one soak a few days in dinq or so before pulling in the hairy stuff. Aside: The s/intelfbdev.ifb/intelfbdev-fb/ conversion would also be neater split out. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Optionally defer context closing
This patch doesn't seem to be the panacea that I had hoped, although I've yet to thoroughly test if it actually improves some of the other tests which caused trouble. What occurs is the following (assume just 2 procs) 1. proc A gets to execbuf while out of memory, gets struct_mutex. 2. OOM killer comes in and chooses proc B 3. proc B closes it's fds, which requires struct mutex, blocks 4, OOM killer waits for B to die before killing another process (this part is speculative) In order to let the OOM complete, we defer processing the context destruction parts which are those that require struct_mutex. This patch has not really been cleaned up, I am only posting it for posterity. (Several of the hunks are only relevant to full PPGTT patches which are not merged). Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_debugfs.c | 6 ++- drivers/gpu/drm/i915/i915_drv.h | 9 +++-- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_gem_context.c | 68 - 4 files changed, 61 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 835a43e..e6b5f3e 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1717,11 +1717,13 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) struct drm_i915_file_private *file_priv = file-driver_priv; struct i915_hw_ppgtt *pvt_ppgtt; - pvt_ppgtt = ctx_to_ppgtt(file_priv-private_default_ctx); + pvt_ppgtt = + ctx_to_ppgtt(file_priv-ctx_info-private_default_ctx); seq_printf(m, proc: %s\n, get_pid_task(file-pid, PIDTYPE_PID)-comm); seq_puts(m, default context:\n); - idr_for_each(file_priv-context_idr, per_file_ctx, m); + idr_for_each(file_priv-ctx_info-context_idr, per_file_ctx, +m); } seq_printf(m, ECOCHK: 0x%08x\n, I915_READ(GAM_ECOCHK)); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6fdab40..931fc42 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1735,15 +1735,18 @@ struct drm_i915_gem_request { struct drm_i915_file_private { struct drm_i915_private *dev_priv; - struct { spinlock_t lock; struct list_head request_list; struct delayed_work idle_work; } mm; - struct idr context_idr; + struct i915_gem_ctx_info { + struct drm_i915_private *dev_priv; + struct idr context_idr; + struct work_struct close_work; + struct i915_hw_context *private_default_ctx; + } *ctx_info; - struct i915_hw_context *private_default_ctx; atomic_t rps_wait_boost; }; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 6312d61..a32f6df 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2326,7 +2326,7 @@ static void i915_set_reset_status(struct intel_ring_buffer *ring, if (request-ctx request-ctx-id != DEFAULT_CONTEXT_ID) hs = request-ctx-hang_stats; else if (request-file_priv) - hs = request-file_priv-private_default_ctx-hang_stats; + hs = request-file_priv-ctx_info-private_default_ctx-hang_stats; if (hs) { if (guilty) { diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index f55f0a9..770b394 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -209,8 +209,8 @@ __create_hw_context(struct drm_device *dev, if (file_priv == NULL) return ctx; - ret = idr_alloc(file_priv-context_idr, ctx, DEFAULT_CONTEXT_ID, 0, - GFP_KERNEL); + ret = idr_alloc(file_priv-ctx_info-context_idr, + ctx, DEFAULT_CONTEXT_ID, 0, GFP_KERNEL); if (ret 0) goto err_out; @@ -481,29 +481,54 @@ static int context_idr_cleanup(int id, void *p, void *data) return 0; } +static void i915_gem_context_close_work_handler(struct work_struct *work) +{ + struct i915_gem_ctx_info *ctx_info = container_of(work, + struct i915_gem_ctx_info, + close_work); + struct drm_i915_private *dev_priv = ctx_info-dev_priv; + + mutex_lock(dev_priv-dev-struct_mutex); + idr_for_each(ctx_info-context_idr, context_idr_cleanup, NULL); + i915_gem_context_unreference(ctx_info-private_default_ctx); + idr_destroy(ctx_info-context_idr); + mutex_unlock(dev_priv-dev-struct_mutex); + +
Re: [Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer
On Tue, Nov 26, 2013 at 11:23:15AM +, Chris Wilson wrote: As the execbuffer dispatch grows ever more complex and involves multiple stages of moving objects into the aperture, we need to take greater care that we do not evict our execbuffer objects prior to dispatch. This is relatively simple as we can just keep the objects pinned for not just the relocation but until we are finished. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Ben Widawsky benjamin.widaw...@intel.com Cc: Daniel Vetter dan...@ffwll.ch Cc: sta...@vger.kernel.org To be honest, I don't quite see the actual issue, but the problem description, and solution make sense to me. I've also been running with the patch quite a bit on HSW. Acked-by: Ben Widawsky b...@bwidawsk.net Tested-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 -- 1 file changed, 32 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 885d595e0e02..b7e787fb4649 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -33,6 +33,9 @@ #include intel_drv.h #include linux/dma_remapping.h +#define __EXEC_OBJECT_HAS_PIN (131) +#define __EXEC_OBJECT_HAS_FENCE (130) + struct eb_vmas { struct list_head vmas; int and; @@ -187,7 +190,28 @@ static struct i915_vma *eb_get_vma(struct eb_vmas *eb, unsigned long handle) } } -static void eb_destroy(struct eb_vmas *eb) { +static void +i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma) +{ + struct drm_i915_gem_exec_object2 *entry; + struct drm_i915_gem_object *obj = vma-obj; + + if (!drm_mm_node_allocated(vma-node)) + return; + + entry = vma-exec_entry; + + if (entry-flags __EXEC_OBJECT_HAS_FENCE) + i915_gem_object_unpin_fence(obj); + + if (entry-flags __EXEC_OBJECT_HAS_PIN) + i915_gem_object_unpin(obj); + + entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN); +} + +static void eb_destroy(struct eb_vmas *eb) +{ while (!list_empty(eb-vmas)) { struct i915_vma *vma; @@ -195,6 +219,7 @@ static void eb_destroy(struct eb_vmas *eb) { struct i915_vma, exec_list); list_del_init(vma-exec_list); + i915_gem_execbuffer_unreserve_vma(vma); drm_gem_object_unreference(vma-obj-base); } kfree(eb); @@ -478,9 +503,6 @@ i915_gem_execbuffer_relocate(struct eb_vmas *eb, return ret; } -#define __EXEC_OBJECT_HAS_PIN (131) -#define __EXEC_OBJECT_HAS_FENCE (130) - static int need_reloc_mappable(struct i915_vma *vma) { @@ -552,26 +574,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma, return 0; } -static void -i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma) -{ - struct drm_i915_gem_exec_object2 *entry; - struct drm_i915_gem_object *obj = vma-obj; - - if (!drm_mm_node_allocated(vma-node)) - return; - - entry = vma-exec_entry; - - if (entry-flags __EXEC_OBJECT_HAS_FENCE) - i915_gem_object_unpin_fence(obj); - - if (entry-flags __EXEC_OBJECT_HAS_PIN) - i915_gem_object_unpin(obj); - - entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN); -} - static int i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, struct list_head *vmas, @@ -670,13 +672,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring, goto err; } -err: /* Decrement pin count for bound objects */ - list_for_each_entry(vma, vmas, exec_list) - i915_gem_execbuffer_unreserve_vma(vma); - +err: if (ret != -ENOSPC || retry++) return ret; + /* Decrement pin count for bound objects */ + list_for_each_entry(vma, vmas, exec_list) + i915_gem_execbuffer_unreserve_vma(vma); + ret = i915_gem_evict_vm(vm, true); if (ret) return ret; @@ -708,6 +711,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, while (!list_empty(eb-vmas)) { vma = list_first_entry(eb-vmas, struct i915_vma, exec_list); list_del_init(vma-exec_list); + i915_gem_execbuffer_unreserve_vma(vma); drm_gem_object_unreference(vma-obj-base); } -- 1.8.4.4 -- Ben Widawsky, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx