date:20131126

Re: [Intel-gfx] [PATCH 04/12] drm/i915: Don't unconditionally try to deref aliasing ppgtt

2013-11-26 Thread Daniel Vetter

On Mon, Nov 25, 2013 at 09:08:50PM +0200, Ville Syrjälä wrote:
 On Mon, Nov 25, 2013 at 10:10:28AM -0800, Ben Widawsky wrote:
  On Mon, Nov 25, 2013 at 06:06:18PM +, Chris Wilson wrote:
   On Mon, Nov 25, 2013 at 09:54:35AM -0800, Ben Widawsky wrote:
Since the beginning, the functions which try to properly reference the
aliasing PPGTT have deferences a potentially null aliasing_ppgtt member.
Since the accessors are meant to be global, this will not do.

Introduced originally in:
commit a70a3148b0c61cb7c588ea650db785b261b378a3
Author: Ben Widawsky b...@bwidawsk.net
Date:   Wed Jul 31 16:59:56 2013 -0700

drm/i915: Make proper functions for VMs

Signed-off-by: Ben Widawsky b...@bwidawsk.net
---
 drivers/gpu/drm/i915/i915_gem.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c 
b/drivers/gpu/drm/i915/i915_gem.c
index 40d9dcf..bc5c865 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4971,7 +4971,8 @@ unsigned long i915_gem_obj_offset(struct 
drm_i915_gem_object *o,
struct drm_i915_private *dev_priv = o-base.dev-dev_private;
struct i915_vma *vma;
 
-   if (vm == dev_priv-mm.aliasing_ppgtt-base)
+   if (!dev_priv-mm.aliasing_ppgtt ||
+   vm == dev_priv-mm.aliasing_ppgtt-base)
   
   Where's the dereference? gcc is smarter than your average bear.
   -Chris
  
  I had assumed: dev_priv-mm.aliasing_ppgtt-base in cases when
  aliasing_ppgtt was NULL. Given that I never actually hit this, I agree
  GCC must be doing something.
 
 Sounds like another discussion on the implementation of offsetof().
 
  
  Is such behavior documented somewhere? (forgive the lazy)
 
 IIRC the C standard does say that for *foo it's as if both  and *
 weren't there (and IIRC the same for foo[x]), but doesn't really
 say that kind of thing for foo-bar. I guess it's a gray area,
 and just happens work that way on certain compilers.
 
 In this particular case I think there's one slight issue. If
 aliasing_pggtt == NULL, and someone passes in vm == NULL by
 accident, then it'll take the branch and use ggtt because
 aliasing_ppgtt-base will be NULL too (due to base being at
 offset 0 in the struct). Now, I don't know if a NULL
 vm could survive this far into the code, but it it can, then it
 might make debugging a bit more fun.

If you pass in a NULL vm then you get to keep both pieces. And wrt
portability atm you can pick any compiler for the linux kernel as long as
it's gcc compatible. So imo nothing to really worry about.

Aside: We really should start to ditch all these (vm, obj) pair lookups
and move to using vmas everywhere ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 18/19] drm/i915: save some time when waiting the eDP timings

2013-11-26 Thread Chris Wilson

On Mon, Nov 25, 2013 at 06:38:53PM -0800, Ben Widawsky wrote:
 On Mon, Nov 25, 2013 at 11:25:10PM +, Chris Wilson wrote:
  On Mon, Nov 25, 2013 at 02:17:55PM -0800, Ben Widawsky wrote:
   On Thu, Nov 21, 2013 at 04:00:17PM +, Chris Wilson wrote:
On Thu, Nov 21, 2013 at 01:47:32PM -0200, Paulo Zanoni wrote:
 From: Paulo Zanoni paulo.r.zan...@intel.com
 
 The eDP spec defines some points where after you do action A, you have
 to wait some time before action B. The thing is that in our driver
 action B does not happen exactly after action A, but we still use
 msleep() calls directly. What this patch happens is that we record the
 timestamp of when action A happened, then, just before action B, we
 look at how much time has passed and only sleep the remaining amount
 needed.
 
 With this change, I am able to save about 5-20ms (out of the total
 200ms) of the backlight_off delay and completely skip the 1ms
 backlight_on delay. The 600ms vdd_off delay doesn't happen during
 normal usage anymore due to a previous patch.
 
 Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com
 ---
  drivers/gpu/drm/i915/intel_dp.c  | 38 
 +++---
  drivers/gpu/drm/i915/intel_drv.h |  3 +++
  2 files changed, 38 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_dp.c 
 b/drivers/gpu/drm/i915/intel_dp.c
 index b438e76..3a1ca80 100644
 --- a/drivers/gpu/drm/i915/intel_dp.c
 +++ b/drivers/gpu/drm/i915/intel_dp.c
 @@ -1051,12 +1051,41 @@ static void ironlake_wait_panel_off(struct 
 intel_dp *intel_dp)
   ironlake_wait_panel_status(intel_dp, IDLE_OFF_MASK, 
 IDLE_OFF_VALUE);
  }
  
 +static void ironlake_wait_jiffies_delay(unsigned long timestamp,
 + int to_wait_ms)
This is not hw specific, so just
intel_wait_until_after(timestamp_jiffies, to_wait_ms)
   
   Can't we do this with our existing wait_for, and get all the other junk
   we've crammed in there?
   wait_for(false, timestamp + to_wait_ms)
   
   Or do I have this all wrong?
  
  It would be
  wait_for(false, jiffies_to_ms(max(ms_to_jiffies(timestamp+to_wait_ms) - 
  jiffies, 0))
  or something pretty similar. Definitely macro abuse as you would be
  hoping that the compiler turned
while (!time_after(jiffies, timeout)) msleep(1);
  into
msleep(to_ms(timeout-jiffies));
  
  So perhaps clearer would be:
  intel_wait_until_after(unsigned long timestamp_jiffies,
 int to_wait_ms)
   {
 timestamp_jiffies += msec_to_jiffes(to_wait_ms) + 1;
 if (time_after(timestamp_jiffies, jiffies) {
timestamp_jiffies -= jiffies;
while (timestamp_jiffies)
timestamp_jiffies = schedule_timeout(timestamp_jiffies);
 }
   }
 
 Oops, I meant  timestamp + to_wait_ms - jiffies_to_msecs(jiffies)
 I agree it does get a bit messy, but I think there is likely a cleaner
 way that what you propose if we store things as jiffies.
 
 I hate dealing with jiffies, and I feel like even your function has a
 race with jiffies though.

Sure, I was contemplating using an extra variable to only access jiffies
once, but I was lazy. Not mixing units here would make it less likely to
misuse. However, I think the point stands that we can write a simple
function that is clearer than abusing wait_for(). We will wait and see
what Paulo creates. :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 10/12] drm/i915: Allow ggtt lookups to not WARN

2013-11-26 Thread Daniel Vetter

On Mon, Nov 25, 2013 at 09:54:41AM -0800, Ben Widawsky wrote:
 To be able to effectively use the GGTT object lookup function, we don't
 want to warn when there is no GGTT mapping. Let the caller deal with it
 instead.
 
 Originally, I had intended to have this behavior, and has not
 introduced the WARN. It was introduced during review with the addition
 of the follow commit
 
 commit 5c2abbeab798154166d42fce4f71790caa6dd9bc
 Author: Ben Widawsky benjamin.widaw...@intel.com
 Date:   Tue Sep 24 09:57:57 2013 -0700
 
 drm/i915: Provide a cheap ggtt vma lookup
 
 Signed-off-by: Ben Widawsky b...@bwidawsk.net
 ---
  drivers/gpu/drm/i915/i915_gem.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
 index bc5c865..6388706 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -5069,7 +5069,7 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct 
 drm_i915_gem_object *obj)
   return NULL;
  
   vma = list_first_entry(obj-vma_list, typeof(*vma), vma_link);
 - if (WARN_ON(vma-vm != obj_to_ggtt(obj)))
 + if (vma-vm != obj_to_ggtt(obj))

Makes sense, but then please add a must_check annotation to this function.
That will point you at the user in the context code which imo deserves a
BUG_ON or similar. I guess we should also convert over the little ggtt
helpers to use this function.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 11/12] drm/i915: Takedown drm_mm on failed gtt setup

2013-11-26 Thread Daniel Vetter

On Mon, Nov 25, 2013 at 09:54:42AM -0800, Ben Widawsky wrote:
 This was found by code inspection. If the GTT setup fails then we are
 left without properly tearing down the drm_mm.
 
 Hopefully this never happens.
 
 Signed-off-by: Ben Widawsky b...@bwidawsk.net
 ---
  drivers/gpu/drm/i915/i915_gem.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
 index 6388706..d85bc3e 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -4504,6 +4504,7 @@ int i915_gem_init(struct drm_device *dev)
   mutex_unlock(dev-struct_mutex);
   if (ret) {
   i915_gem_cleanup_aliasing_ppgtt(dev);
 + drm_mm_takedown(dev_priv-gtt.base.mm);

Sprinkling stuff all over the place feels bad, I guess our gem init code
is ripe for an overhaul. I've pulled this in anyway.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] VPG [BDW] drm/i915/bdw: DP aux channel control timeout change

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 01:41:57PM +0530, kirankumar wrote:
 changing DP AUX control timeout to 600 instead of 400 for BDW.

DP spec says 400 us, so a notch more context in the commit message would
be appreciated. E.g. if this is used to work around dp aux failures, then
the commit message _must_ mention this. I need this to asses in which
branch to put the patch.

Also we still have occasional dp aux woes on older platforms. Do we need a
similar timeout extension for those, too?

Remember people: The commit message must explain _why_ something needs to
change. If you feel like you need to explain the what of your change, your
code probably needs to be rewritten to be esier to understand.

Cheers, Daniel

 
 Change-Id: Id507ef35490e12bbd896492c9da24d0db367bf1a
 Signed-off-by: kirankumar kiran.s.ku...@intel.com
 ---
  drivers/gpu/drm/i915/intel_dp.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
 index ae2a057..953a3f6 100644
 --- a/drivers/gpu/drm/i915/intel_dp.c
 +++ b/drivers/gpu/drm/i915/intel_dp.c
 @@ -351,7 +351,7 @@ intel_dp_aux_ch(struct intel_dp *intel_dp,
   I915_WRITE(ch_ctl,
  DP_AUX_CH_CTL_SEND_BUSY |
  (has_aux_irq ? DP_AUX_CH_CTL_INTERRUPT : 0) |
 -DP_AUX_CH_CTL_TIME_OUT_400us |
 +DP_AUX_CH_CTL_TIME_OUT_600us |
  (send_bytes  DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT) |
  (precharge  DP_AUX_CH_CTL_PRECHARGE_2US_SHIFT) |
  (aux_clock_divider  
 DP_AUX_CH_CTL_BIT_CLOCK_2X_SHIFT) |
 -- 
 1.7.9.5
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Drop forcewake w/a for missed interrupts/seqno on Sandybridge

2013-11-26 Thread Daniel Vetter

On Fri, Nov 22, 2013 at 12:46:40PM -0800, Jesse Barnes wrote:
 On Fri, 22 Nov 2013 20:35:24 +
 Chris Wilson ch...@chris-wilson.co.uk wrote:
 
  I believe, and an evening of i-g-t, that our original workaround for the
  missed interrupts on Sandybridge, that of holding forcewake whilst we
  wait for an interrupts, is no longer required. This leaves us dependent
  on the second workaround of forcing an UC read of the ACTHD before
  reading back the seqno from the snooped HWS. Dropping the forcewake
  should allow us to conserve a little power, not much as the GPU is meant
  to be busy whilst we wait for it!
  
  Cc: Jesse Barnes jbar...@virtuousgeek.org
  Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
  ---
   drivers/gpu/drm/i915/intel_ringbuffer.c | 7 ---
   1 file changed, 7 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
  b/drivers/gpu/drm/i915/intel_ringbuffer.c
  index 80893af4062b..6b121ba9d3f7 100644
  --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
  +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
  @@ -1030,11 +1030,6 @@ gen6_ring_get_irq(struct intel_ring_buffer *ring)
  if (!dev-irq_enabled)
 return false;
   
  -   /* It looks like we need to prevent the gt from suspending while waiting
  -* for an notifiy irq, otherwise irqs seem to get lost on at least the
  -* blt/bsd rings on ivb. */
  -   gen6_gt_force_wake_get(dev_priv);
  -
  spin_lock_irqsave(dev_priv-irq_lock, flags);
  if (ring-irq_refcount++ == 0) {
  if (HAS_L3_DPF(dev)  ring-id == RCS)
  @@ -1066,8 +1061,6 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
  ilk_disable_gt_irq(dev_priv, ring-irq_enable_mask);
  }
  spin_unlock_irqrestore(dev_priv-irq_lock, flags);
  -
  -   gen6_gt_force_wake_put(dev_priv);
   }
   
   static bool
 
 Yay!  This is just what I was testing on BYT today, and it works there
 too (and gives me tons more RC6 residency).

Risky imo, since I see missed interrupts sporadically on all my gen6+
machines, and we don't have any solid testcase in igt that readily hits
them. Expect a swift strike with the revert hammer if this causes trouble
;-)

 Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/3] drm/i915/vlv: Valleyview support for forcewake Individual power wells.

2013-11-26 Thread S, Deepak

Hi Chris,

We are using single write fifo for the multiple power wells. Since  Valleyview 
as only supports only one write fifo. 

Thanks
Deepak

-Original Message-
From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] 
Sent: Monday, November 25, 2013 8:18 PM
To: S, Deepak
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915/vlv: Valleyview support for 
forcewake Individual power wells.

On Sat, Nov 23, 2013 at 02:55:43PM +0530, deepa...@intel.com wrote:
 From: Deepak S deepa...@intel.com
 
 Split vlv force wake routines to help individually control 
 Media/Render well based on the register access.

Do you mind clarifying if there if a single write fifo for the multiple power 
wells? Just something that worried me reading through the code.
Otherwise, lgtm.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Enabling DebugFS for valleyview forcewake counts

2013-11-26 Thread S, Deepak

Yes Jesse, this is a warning fix. 

-Original Message-
From: Jesse Barnes [mailto:jbar...@virtuousgeek.org] 
Sent: Monday, November 25, 2013 10:09 PM
To: S, Deepak
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 3/3] drm/i915: Enabling DebugFS for valleyview 
forcewake counts

On Sat, 23 Nov 2013 14:55:44 +0530
deepa...@intel.com wrote:
 @@ -2349,7 +2357,7 @@ static int pipe_crc_set_source(struct drm_device 
 *dev, enum pipe pipe,  {
   struct drm_i915_private *dev_priv = dev-dev_private;
   struct intel_pipe_crc *pipe_crc = dev_priv-pipe_crc[pipe];
 - u32 val;
 + u32 val = 0;
   int ret;
  
   if (pipe_crc-source == source)

Spurious warning fix?  Otherwise looks fine.

Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

--
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer

2013-11-26 Thread Chris Wilson

As the execbuffer dispatch grows ever more complex and involves multiple
stages of moving objects into the aperture, we need to take greater care
that we do not evict our execbuffer objects prior to dispatch. This is
relatively simple as we can just keep the objects pinned for not just
the relocation but until we are finished.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Ben Widawsky benjamin.widaw...@intel.com
Cc: Daniel Vetter dan...@ffwll.ch
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 --
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 885d595e0e02..b7e787fb4649 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,6 +33,9 @@
 #include intel_drv.h
 #include linux/dma_remapping.h
 
+#define  __EXEC_OBJECT_HAS_PIN (131)
+#define  __EXEC_OBJECT_HAS_FENCE (130)
+
 struct eb_vmas {
struct list_head vmas;
int and;
@@ -187,7 +190,28 @@ static struct i915_vma *eb_get_vma(struct eb_vmas *eb, 
unsigned long handle)
}
 }
 
-static void eb_destroy(struct eb_vmas *eb) {
+static void
+i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
+{
+   struct drm_i915_gem_exec_object2 *entry;
+   struct drm_i915_gem_object *obj = vma-obj;
+
+   if (!drm_mm_node_allocated(vma-node))
+   return;
+
+   entry = vma-exec_entry;
+
+   if (entry-flags  __EXEC_OBJECT_HAS_FENCE)
+   i915_gem_object_unpin_fence(obj);
+
+   if (entry-flags  __EXEC_OBJECT_HAS_PIN)
+   i915_gem_object_unpin(obj);
+
+   entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
+}
+
+static void eb_destroy(struct eb_vmas *eb)
+{
while (!list_empty(eb-vmas)) {
struct i915_vma *vma;
 
@@ -195,6 +219,7 @@ static void eb_destroy(struct eb_vmas *eb) {
   struct i915_vma,
   exec_list);
list_del_init(vma-exec_list);
+   i915_gem_execbuffer_unreserve_vma(vma);
drm_gem_object_unreference(vma-obj-base);
}
kfree(eb);
@@ -478,9 +503,6 @@ i915_gem_execbuffer_relocate(struct eb_vmas *eb,
return ret;
 }
 
-#define  __EXEC_OBJECT_HAS_PIN (131)
-#define  __EXEC_OBJECT_HAS_FENCE (130)
-
 static int
 need_reloc_mappable(struct i915_vma *vma)
 {
@@ -552,26 +574,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
return 0;
 }
 
-static void
-i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
-{
-   struct drm_i915_gem_exec_object2 *entry;
-   struct drm_i915_gem_object *obj = vma-obj;
-
-   if (!drm_mm_node_allocated(vma-node))
-   return;
-
-   entry = vma-exec_entry;
-
-   if (entry-flags  __EXEC_OBJECT_HAS_FENCE)
-   i915_gem_object_unpin_fence(obj);
-
-   if (entry-flags  __EXEC_OBJECT_HAS_PIN)
-   i915_gem_object_unpin(obj);
-
-   entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
-}
-
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
struct list_head *vmas,
@@ -670,13 +672,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer 
*ring,
goto err;
}
 
-err:   /* Decrement pin count for bound objects */
-   list_for_each_entry(vma, vmas, exec_list)
-   i915_gem_execbuffer_unreserve_vma(vma);
-
+err:
if (ret != -ENOSPC || retry++)
return ret;
 
+   /* Decrement pin count for bound objects */
+   list_for_each_entry(vma, vmas, exec_list)
+   i915_gem_execbuffer_unreserve_vma(vma);
+
ret = i915_gem_evict_vm(vm, true);
if (ret)
return ret;
@@ -708,6 +711,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
while (!list_empty(eb-vmas)) {
vma = list_first_entry(eb-vmas, struct i915_vma, exec_list);
list_del_init(vma-exec_list);
+   i915_gem_execbuffer_unreserve_vma(vma);
drm_gem_object_unreference(vma-obj-base);
}
 
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer

2013-11-26 Thread Chris Wilson

On Tue, Nov 26, 2013 at 11:23:15AM +, Chris Wilson wrote:
 As the execbuffer dispatch grows ever more complex and involves multiple
 stages of moving objects into the aperture, we need to take greater care
 that we do not evict our execbuffer objects prior to dispatch. This is
 relatively simple as we can just keep the objects pinned for not just
 the relocation but until we are finished.

One such example is the possibility of the context switch causing an
eviction or hitting the shrinker in order to fit its object into the
aperture.

Link: http://lists.freedesktop.org/archives/intel-gfx/2013-November/036166.html
Reported-by: Siluvery, Arun arun.siluv...@intel.com

 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Ben Widawsky benjamin.widaw...@intel.com
 Cc: Daniel Vetter dan...@ffwll.ch
 Cc: sta...@vger.kernel.org

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2

2013-11-26 Thread Chris Wilson

On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote:
 And move it up in the file for earlier usage.
 
 v2: add pre-gen4 support as well (Chris)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4

2013-11-26 Thread Chris Wilson

On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote:
 Read out the current plane configuration at init time into a new
 plane_config structure.  This allows us to track any existing
 framebuffers attached to the plane and potentially re-use them in our
 fbdev code for a smooth handoff.
 
 v2: update for new pitch_for_width function (Jesse)
 comment how get_plane_config works with shared fbs (Jesse)
 v3: s/ARGB/XRGB (Ville)
 use pipesrc width/height (Ville)
 fix fourcc comment (Bob)
 use drm_format_plane_cpp (Ville)
 v4: use fb for tracking fb data object (Ville)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 ---

 @@ -10879,6 +10992,18 @@ void intel_modeset_init(struct drm_device *dev)
  
   /* Just in case the BIOS is doing something questionable. */
   intel_disable_fbc(dev);
 +
 + intel_modeset_setup_hw_state(dev, false);
 +
 + list_for_each_entry(crtc, dev-mode_config.crtc_list,
 + base.head) {
 + if (!crtc-active)
 + continue;
 +
 + if (dev_priv-display.get_plane_config)
 + dev_priv-display.get_plane_config(crtc,
 +crtc-plane_config);

The trick is that here if we do not retreive the current config,
including the preallocated fb, we *must* disable the output. In this
case, it would be a step in intel_sanitize_crtc() to disable the CRTC if
it is enabled but we have no preserved fb.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7

2013-11-26 Thread Chris Wilson

On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote:
 Retrieve current framebuffer config info from the regs and create an fb
 object for the buffer the BIOS or boot loader left us.  This should
 allow for smooth transitions to userspace apps once we finish the
 initial configuration construction.
 
 v2: check for non-native modes and adjust (Jesse)
 fixup aperture and cmap frees (Imre)
 use unlocked unref if init_bios fails (Jesse)
 fix curly brace around DSPADDR check (Imre)
 comment failure path for pin_and_fence (Imre)
 v3: fixup fixup of aperture frees (Chris)
 v4: update to current bits (locking  pin_and_fence hack) (Jesse)
 v5: move fb config fetch to display code (Jesse)
 re-order hw state readout on initial load to suit fb inherit (Jesse)
 re-add pin_and_fence in fbdev code to make sure we refcount properly (Je
 v6: rename to plane_config (Daniel)
 check for valid object when initializing BIOS fb (Jesse)
 split from plane_config readout and other display changes (Jesse)
 drop use_bios_fb option (Chris)
 update comments (Jesse)
 rework fbdev_init_bios for clarity (Jesse)
 drop fb obj ref under lock (Chris)
 v7: use fb object from plane_config instead (Ville)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org

Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding
the kzalloc(intel_fbdev) and the clarity of
intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb
lifetime of plane_config-fb is extremely ugly though (the theft could
be made a little more obvious for instance) and still leaked upon failure?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 5/5] drm/i915: don't memset the fb buffer if preallocated

2013-11-26 Thread Chris Wilson

On Mon, Nov 25, 2013 at 03:51:19PM -0800, Jesse Barnes wrote:
 We want to preserve the BIOS/bootloader contents for later.
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org

Decided this was better than my bikeshed,
Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: check context reset stats before relocations

2013-11-26 Thread Mika Kuoppala

Doing it early prevents moving and relocating objects in vain
for contexts that won't get any GPU time.

Reported-by: Chris Wilson ch...@chris-wilson.co.uk
Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   38 ++--
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 885d595..61c716d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -885,6 +885,24 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
return 0;
 }
 
+static int
+i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
+ const u32 ctx_id)
+{
+   struct i915_ctx_hang_stats *hs;
+
+   hs = i915_gem_context_get_hang_stats(dev, file, ctx_id);
+   if (IS_ERR(hs))
+   return PTR_ERR(hs);
+
+   if (hs-banned) {
+   DRM_DEBUG(Context %u tried to submit while banned\n, ctx_id);
+   return -EIO;
+   }
+
+   return 0;
+}
+
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *vmas,
   struct intel_ring_buffer *ring)
@@ -964,8 +982,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
struct drm_i915_gem_object *batch_obj;
struct drm_clip_rect *cliprects = NULL;
struct intel_ring_buffer *ring;
-   struct i915_ctx_hang_stats *hs;
-   u32 ctx_id = i915_execbuffer2_get_context_id(*args);
+   const u32 ctx_id = i915_execbuffer2_get_context_id(*args);
u32 exec_start, exec_len;
u32 mask, flags;
int ret, mode, i;
@@ -1102,6 +1119,12 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
*data,
goto pre_mutex_err;
}
 
+   ret = i915_gem_validate_context(dev, file, ctx_id);
+   if (ret) {
+   mutex_unlock(dev-struct_mutex);
+   goto pre_mutex_err;
+   }
+
eb = eb_create(args, vm);
if (eb == NULL) {
mutex_unlock(dev-struct_mutex);
@@ -1154,17 +1177,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void 
*data,
if (ret)
goto err;
 
-   hs = i915_gem_context_get_hang_stats(dev, file, ctx_id);
-   if (IS_ERR(hs)) {
-   ret = PTR_ERR(hs);
-   goto err;
-   }
-
-   if (hs-banned) {
-   ret = -EIO;
-   goto err;
-   }
-
ret = i915_switch_context(ring, file, ctx_id);
if (ret)
goto err;
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: check context reset stats before relocations

2013-11-26 Thread Chris Wilson

On Tue, Nov 26, 2013 at 04:14:33PM +0200, Mika Kuoppala wrote:
 Doing it early prevents moving and relocating objects in vain
 for contexts that won't get any GPU time.
 
 Reported-by: Chris Wilson ch...@chris-wilson.co.uk
 Signed-off-by: Mika Kuoppala mika.kuopp...@intel.com
Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 18/19] drm/i915: save some time when waiting the eDP timings

2013-11-26 Thread Paulo Zanoni

2013/11/26 Chris Wilson ch...@chris-wilson.co.uk:
 On Mon, Nov 25, 2013 at 06:38:53PM -0800, Ben Widawsky wrote:
 On Mon, Nov 25, 2013 at 11:25:10PM +, Chris Wilson wrote:
  On Mon, Nov 25, 2013 at 02:17:55PM -0800, Ben Widawsky wrote:
   On Thu, Nov 21, 2013 at 04:00:17PM +, Chris Wilson wrote:
On Thu, Nov 21, 2013 at 01:47:32PM -0200, Paulo Zanoni wrote:
 From: Paulo Zanoni paulo.r.zan...@intel.com

 The eDP spec defines some points where after you do action A, you 
 have
 to wait some time before action B. The thing is that in our driver
 action B does not happen exactly after action A, but we still use
 msleep() calls directly. What this patch happens is that we record 
 the
 timestamp of when action A happened, then, just before action B, we
 look at how much time has passed and only sleep the remaining amount
 needed.

 With this change, I am able to save about 5-20ms (out of the total
 200ms) of the backlight_off delay and completely skip the 1ms
 backlight_on delay. The 600ms vdd_off delay doesn't happen during
 normal usage anymore due to a previous patch.

 Signed-off-by: Paulo Zanoni paulo.r.zan...@intel.com
 ---
  drivers/gpu/drm/i915/intel_dp.c  | 38 
 +++---
  drivers/gpu/drm/i915/intel_drv.h |  3 +++
  2 files changed, 38 insertions(+), 3 deletions(-)

 diff --git a/drivers/gpu/drm/i915/intel_dp.c 
 b/drivers/gpu/drm/i915/intel_dp.c
 index b438e76..3a1ca80 100644
 --- a/drivers/gpu/drm/i915/intel_dp.c
 +++ b/drivers/gpu/drm/i915/intel_dp.c
 @@ -1051,12 +1051,41 @@ static void ironlake_wait_panel_off(struct 
 intel_dp *intel_dp)
   ironlake_wait_panel_status(intel_dp, IDLE_OFF_MASK, 
 IDLE_OFF_VALUE);
  }

 +static void ironlake_wait_jiffies_delay(unsigned long timestamp,
 + int to_wait_ms)
This is not hw specific, so just
intel_wait_until_after(timestamp_jiffies, to_wait_ms)
  
   Can't we do this with our existing wait_for, and get all the other junk
   we've crammed in there?
   wait_for(false, timestamp + to_wait_ms)
  
   Or do I have this all wrong?
 
  It would be
  wait_for(false, jiffies_to_ms(max(ms_to_jiffies(timestamp+to_wait_ms) - 
  jiffies, 0))
  or something pretty similar. Definitely macro abuse as you would be
  hoping that the compiler turned
while (!time_after(jiffies, timeout)) msleep(1);
  into
msleep(to_ms(timeout-jiffies));
 
  So perhaps clearer would be:
  intel_wait_until_after(unsigned long timestamp_jiffies,
 int to_wait_ms)
   {
 timestamp_jiffies += msec_to_jiffes(to_wait_ms) + 1;
 if (time_after(timestamp_jiffies, jiffies) {
timestamp_jiffies -= jiffies;
while (timestamp_jiffies)
timestamp_jiffies = schedule_timeout(timestamp_jiffies);
 }
   }

 Oops, I meant  timestamp + to_wait_ms - jiffies_to_msecs(jiffies)
 I agree it does get a bit messy, but I think there is likely a cleaner
 way that what you propose if we store things as jiffies.

 I hate dealing with jiffies, and I feel like even your function has a
 race with jiffies though.

 Sure, I was contemplating using an extra variable to only access jiffies
 once, but I was lazy. Not mixing units here would make it less likely to
 misuse. However, I think the point stands that we can write a simple
 function that is clearer than abusing wait_for(). We will wait and see
 what Paulo creates. :)

I really don't think abusing wait_for is a good idea. I think wait_for
has a different purpose, and trying to invent clever ways to use it on
a non-standard use case will just make the code hard to read, without
any benefits. And with the risk that future changes to wait_for will
break our code.

Also, I think Chris' solution is very similar to my solution, but with
an open-coded msleep() instead of just the simple msleep() call.

So I'd really vote to apply the KISS design pattern and stick to the
original implementation + Chris' first suggestions (rename the
function, fix the msleep call), unless we can find another bug on it.
But since applying whatever bikesheds we get is always the fastest
(only?) way to gets patches merged, if you insist I will just do
whatever is requested.

Thanks,
Paulo

 -Chris

 --
 Chris Wilson, Intel Open Source Technology Centre



-- 
Paulo Zanoni
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 13:57:10 +
Chris Wilson ch...@chris-wilson.co.uk wrote:

 On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote:
  Read out the current plane configuration at init time into a new
  plane_config structure.  This allows us to track any existing
  framebuffers attached to the plane and potentially re-use them in our
  fbdev code for a smooth handoff.
  
  v2: update for new pitch_for_width function (Jesse)
  comment how get_plane_config works with shared fbs (Jesse)
  v3: s/ARGB/XRGB (Ville)
  use pipesrc width/height (Ville)
  fix fourcc comment (Bob)
  use drm_format_plane_cpp (Ville)
  v4: use fb for tracking fb data object (Ville)
  
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
  ---
 
  @@ -10879,6 +10992,18 @@ void intel_modeset_init(struct drm_device *dev)
   
  /* Just in case the BIOS is doing something questionable. */
  intel_disable_fbc(dev);
  +
  +   intel_modeset_setup_hw_state(dev, false);
  +
  +   list_for_each_entry(crtc, dev-mode_config.crtc_list,
  +   base.head) {
  +   if (!crtc-active)
  +   continue;
  +
  +   if (dev_priv-display.get_plane_config)
  +   dev_priv-display.get_plane_config(crtc,
  +  crtc-plane_config);
 
 The trick is that here if we do not retreive the current config,
 including the preallocated fb, we *must* disable the output. In this
 case, it would be a step in intel_sanitize_crtc() to disable the CRTC if
 it is enabled but we have no preserved fb.

Yeah, but I thought since that's been broken for awhile, it should come
in as a separate bug fix.  I can do a patch on top of this if the rest
looks ok.

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 14:09:48 +
Chris Wilson ch...@chris-wilson.co.uk wrote:

 On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote:
  Retrieve current framebuffer config info from the regs and create an fb
  object for the buffer the BIOS or boot loader left us.  This should
  allow for smooth transitions to userspace apps once we finish the
  initial configuration construction.
  
  v2: check for non-native modes and adjust (Jesse)
  fixup aperture and cmap frees (Imre)
  use unlocked unref if init_bios fails (Jesse)
  fix curly brace around DSPADDR check (Imre)
  comment failure path for pin_and_fence (Imre)
  v3: fixup fixup of aperture frees (Chris)
  v4: update to current bits (locking  pin_and_fence hack) (Jesse)
  v5: move fb config fetch to display code (Jesse)
  re-order hw state readout on initial load to suit fb inherit (Jesse)
  re-add pin_and_fence in fbdev code to make sure we refcount properly (Je
  v6: rename to plane_config (Daniel)
  check for valid object when initializing BIOS fb (Jesse)
  split from plane_config readout and other display changes (Jesse)
  drop use_bios_fb option (Chris)
  update comments (Jesse)
  rework fbdev_init_bios for clarity (Jesse)
  drop fb obj ref under lock (Chris)
  v7: use fb object from plane_config instead (Ville)
  
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 
 Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding
 the kzalloc(intel_fbdev) and the clarity of
 intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb
 lifetime of plane_config-fb is extremely ugly though (the theft could
 be made a little more obvious for instance) and still leaked upon failure?

I free it on failure in fb_init_bios, but maybe I missed an error path?

I agree on the lifetime handling, but it was either this or an ugly
copy of the fb object in the bios function.  And even then we'd have to
find a place to free it that made sense (there is no such place today).

I did comment the field in the struct at least...

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 01/22] drm/i915: Add data structures for command parser

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

The command parser needs to know a few things about certain commands
in order to process them correctly. Add structures for storing that
information.

OTC-Tracker: AXIA-4631
Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h | 51 +
 1 file changed, 51 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 14f250a..ff1e201 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1731,6 +1731,57 @@ struct drm_i915_file_private {
atomic_t rps_wait_boost;
 };
 
+/**
+ * A command that requires special handling by the command parser.
+ */
+struct drm_i915_cmd_descriptor {
+   /**
+* Flags describing how the command parser processes the command.
+*
+* CMD_DESC_FIXED: The command has a fixed length if this is set,
+* a length mask if not set
+* CMD_DESC_SKIP: The command is allowed but does not follow the
+*standard length encoding for the opcode range in
+*which it falls
+*/
+   u32 flags;
+#define CMD_DESC_FIXED (10)
+#define CMD_DESC_SKIP  (11)
+
+   /**
+* The command's unique identification bits and the bitmask to get them.
+* This isn't strictly the opcode field as defined in the spec and may
+* also include type, subtype, and/or subop fields.
+*/
+   struct {
+   u32 value;
+   u32 mask;
+   } cmd;
+
+   /**
+* The command's length. The command is either fixed length (i.e. does
+* not include a length field) or has a length field mask. The flag
+* CMD_DESC_FIXED indicates a fixed length. Otherwise, the command has
+* a length mask. All command entries in a command table must include
+* length information.
+*/
+   union {
+   u32 fixed;
+   u32 mask;
+   } length;
+};
+
+/**
+ * A table of commands requiring special handling by the command parser.
+ *
+ * Each ring has an array of tables. Each table consists of an array of command
+ * descriptors, which must be sorted with command opcodes in ascending order.
+ */
+struct drm_i915_cmd_table {
+   const struct drm_i915_cmd_descriptor *table;
+   int count;
+};
+
 #define INTEL_INFO(dev)(to_i915(dev)-info)
 
 #define IS_I830(dev)   ((dev)-pdev-device == 0x3577)
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 04/22] drm/i915: Add per-ring command length decode functions

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

For commands that aren't in the parser's tables, we get the length
based on standard per-ring command encodings for specific opcode ranges.

These functions just return the bitmask and the parser will extract the
actual length value.

OTC-Tracker: AXIA-4631
Change-Id: I2729d4483931cb4aea9403fd43710c4d4e8e5e89
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 62 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 12 +++
 2 files changed, 74 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 014e661..247d530 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -137,6 +137,62 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
{ blt_cmds, ARRAY_SIZE(blt_cmds) },
 };
 
+#define CLIENT_MASK  0xE000
+#define SUBCLIENT_MASK   0x1800
+#define MI_CLIENT0x
+#define RC_CLIENT0x6000
+#define BC_CLIENT0x4000
+#define MEDIA_SUBCLIENT  0x1000
+
+static u32 gen7_render_get_cmd_length_mask(u32 cmd_header)
+{
+   u32 client = cmd_header  CLIENT_MASK;
+   u32 subclient = cmd_header  SUBCLIENT_MASK;
+
+   if (client == MI_CLIENT)
+   return 0x3F;
+   else if (client == RC_CLIENT) {
+   if (subclient == MEDIA_SUBCLIENT)
+   return 0x;
+   else
+   return 0xFF;
+   }
+
+   DRM_DEBUG_DRIVER(CMD: Abnormal rcs cmd length! 0x%08X\n, cmd_header);
+   return 0;
+}
+
+static u32 gen7_bsd_get_cmd_length_mask(u32 cmd_header)
+{
+   u32 client = cmd_header  CLIENT_MASK;
+   u32 subclient = cmd_header  SUBCLIENT_MASK;
+
+   if (client == MI_CLIENT)
+   return 0x3F;
+   else if (client == RC_CLIENT) {
+   if (subclient == MEDIA_SUBCLIENT)
+   return 0xFFF;
+   else
+   return 0xFF;
+   }
+
+   DRM_DEBUG_DRIVER(CMD: Abnormal bsd cmd length! 0x%08X\n, cmd_header);
+   return 0;
+}
+
+static u32 gen7_blt_get_cmd_length_mask(u32 cmd_header)
+{
+   u32 client = cmd_header  CLIENT_MASK;
+
+   if (client == MI_CLIENT)
+   return 0x3F;
+   else if (client == BC_CLIENT)
+   return 0xFF;
+
+   DRM_DEBUG_DRIVER(CMD: Abnormal blt cmd length! 0x%08X\n, cmd_header);
+   return 0;
+}
+
 void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
 {
if (!IS_GEN7(ring-dev))
@@ -152,18 +208,24 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
ring-cmd_tables = gen7_render_cmds;
ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
}
+
+   ring-get_cmd_length_mask = gen7_render_get_cmd_length_mask;
break;
case VCS:
ring-cmd_tables = gen7_video_cmds;
ring-cmd_table_count = ARRAY_SIZE(gen7_video_cmds);
+   ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
case BCS:
ring-cmd_tables = gen7_blt_cmds;
ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+   ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
break;
case VECS:
ring-cmd_tables = hsw_vebox_cmds;
ring-cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds);
+   /* VECS can use the same length_mask function as VCS */
+   ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
default:
DRM_DEBUG(CMD: cmd_parser_init with unknown ring: %d\n,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 67305d3..8e71b59 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -169,6 +169,18 @@ struct  intel_ring_buffer {
 */
const struct drm_i915_cmd_table *cmd_tables;
int cmd_table_count;
+
+   /**
+* Returns the bitmask for the length field of the specified command.
+* Return 0 for an unrecognized/invalid command.
+*
+* If the command parser finds an entry for a command in the ring's
+* cmd_tables, it gets the command's length based on the table entry.
+* If not, it calls this function to determine the per-ring length field
+* encoding for the command (i.e. certain opcode ranges use certain bits
+* to encode the command length in the header).
+*/
+   u32 (*get_cmd_length_mask)(u32 cmd_header);
 };
 
 static inline bool
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 15/22] drm/i915: Reject commands that would store to global HWS page

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

PIPE_CONTROL and MI_FLUSH_DW have bits that would write to the
hardware status page. There are no users of this today and it
seems unsafe.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++
 drivers/gpu/drm/i915/i915_reg.h|  1 +
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 7b30a03..f32dc69 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -131,7 +131,8 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = 
{
  },
  {
.offset = 1,
-   .mask = PIPE_CONTROL_GLOBAL_GTT_IVB,
+   .mask = (PIPE_CONTROL_GLOBAL_GTT_IVB |
+PIPE_CONTROL_STORE_DATA_INDEX),
.expected = 0,
.condition_offset = 1,
.condition_mask = PIPE_CONTROL_POST_SYNC_OP_MASK
@@ -167,8 +168,15 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = 
{
.expected = 0,
.condition_offset = 0,
.condition_mask = MI_FLUSH_DW_OP_MASK
+ },
+ {
+   .offset = 0,
+   .mask = MI_FLUSH_DW_STORE_INDEX,
+   .expected = 0,
+   .condition_offset = 0,
+   .condition_mask = MI_FLUSH_DW_OP_MASK
  }},
- .bits_count = 2  ),
+ .bits_count = 3  ),
CMD(  MI_CONDITIONAL_BATCH_BUFFER_END,  SMI,   !F,  0xFF,   B,
  .bits = {{
.offset = 0,
@@ -192,8 +200,15 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
.expected = 0,
.condition_offset = 0,
.condition_mask = MI_FLUSH_DW_OP_MASK
+ },
+ {
+   .offset = 0,
+   .mask = MI_FLUSH_DW_STORE_INDEX,
+   .expected = 0,
+   .condition_offset = 0,
+   .condition_mask = MI_FLUSH_DW_OP_MASK
  }},
- .bits_count = 2  ),
+ .bits_count = 3  ),
CMD(  MI_CONDITIONAL_BATCH_BUFFER_END,  SMI,   !F,  0xFF,   B,
  .bits = {{
.offset = 0,
@@ -217,8 +232,15 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
.expected = 0,
.condition_offset = 0,
.condition_mask = MI_FLUSH_DW_OP_MASK
+ },
+ {
+   .offset = 0,
+   .mask = MI_FLUSH_DW_STORE_INDEX,
+   .expected = 0,
+   .condition_offset = 0,
+   .condition_mask = MI_FLUSH_DW_OP_MASK
  }},
- .bits_count = 2  ),
+ .bits_count = 3  ),
CMD(  COLOR_BLT,S2D,   !F,  0x3F,   S  ),
CMD(  SRC_COPY_BLT, S2D,   !F,  0x3F,   S  ),
 };
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 3f64d41..919d1a6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -323,6 +323,7 @@
 #define GFX_OP_PIPE_CONTROL(len)   ((0x329)|(0x327)|(0x224)|(len-2))
 #define   PIPE_CONTROL_GLOBAL_GTT_IVB  (124) /* gen7+ */
 #define   PIPE_CONTROL_MMIO_WRITE  (123)
+#define   PIPE_CONTROL_STORE_DATA_INDEX(121)
 #define   PIPE_CONTROL_CS_STALL(120)
 #define   PIPE_CONTROL_TLB_INVALIDATE  (118)
 #define   PIPE_CONTROL_QW_WRITE(114)
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 18/22] drm/i915: Reject MI_ARB_ON_OFF on VECS

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index c8426af..5593740 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -197,6 +197,7 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = {
 };
 
 static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
+   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
  .bits = {{
.offset = 0,
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 06/22] drm/i915: Add a HAS_CMD_PARSER getparam

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

So userspace can query the kernel for command parser support.

OTC-Tracker: AXIA-4631
Change-Id: I58af650db9f6753c2dcac9c54ab432fd31db302f
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_dma.c | 3 +++
 include/uapi/drm/i915_drm.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 5aeb103..f0a4638 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1003,6 +1003,9 @@ static int i915_getparam(struct drm_device *dev, void 
*data,
case I915_PARAM_HAS_EXEC_HANDLE_LUT:
value = 1;
break;
+   case I915_PARAM_HAS_CMD_PARSER:
+   value = 1;
+   break;
default:
DRM_DEBUG(Unknown parameter %d\n, param-param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 52aed89..48cc277 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -337,6 +337,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_EXEC_NO_RELOC25
 #define I915_PARAM_HAS_EXEC_HANDLE_LUT   26
 #define I915_PARAM_HAS_WT   27
+#define I915_PARAM_HAS_CMD_PARSER   28
 
 typedef struct drm_i915_getparam {
int param;
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 09/22] drm/i915: Add support for rejecting commands via bitmasks

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

A variety of checks we want to do amount to verifying that a given
bit or bits are set/clear in a given dword of a command. For now,
allow a small but arbitrary number of bitmasks for each command.

OTC-Tracker: AXIA-4631
Change-Id: Icc77316c243b6e218774c15e2c090cc470d59317
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 22 ++
 drivers/gpu/drm/i915/i915_drv.h| 16 
 2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 2dbca01..99d15f3 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -400,6 +400,28 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
}
}
 
+   if (desc-flags  CMD_DESC_BITMASK) {
+   int i;
+
+   for (i = 0; i  desc-bits_count; i++) {
+   u32 dword = cmd[desc-bits[i].offset] 
+   desc-bits[i].mask;
+
+   if (dword != desc-bits[i].expected) {
+   DRM_DEBUG_DRIVER(CMD: Rejected command 
0x%08X for bitmask 0x%08X (exp=0x%08X act=0x%08X) (ring=%d)\n,
+*cmd,
+desc-bits[i].mask,
+desc-bits[i].expected,
+dword, ring-id);
+   ret = -EINVAL;
+   break;
+   }
+   }
+
+   if (ret)
+   break;
+   }
+
cmd += length;
}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 83b6031..f31fc68 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1752,6 +1752,7 @@ struct drm_i915_cmd_descriptor {
 #define CMD_DESC_SKIP (11)
 #define CMD_DESC_REJECT   (12)
 #define CMD_DESC_REGISTER (13)
+#define CMD_DESC_BITMASK  (14)
 
/**
 * The command's unique identification bits and the bitmask to get them.
@@ -1784,6 +1785,21 @@ struct drm_i915_cmd_descriptor {
u32 offset;
u32 mask;
} reg;
+
+#define MAX_CMD_DESC_BITMASKS 3
+   /**
+* Describes command checks where a particular dword is masked and
+* compared against an expected value. If the command does not match
+* the expected value, the parser rejects it. Only valid if flags has
+* the CMD_DESC_BITMASK bit set.
+*/
+   struct {
+   u32 offset;
+   u32 mask;
+   u32 expected;
+   } bits[MAX_CMD_DESC_BITMASKS];
+   /** Number of valid entries in the bits array */
+   int bits_count;
 };
 
 /**
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 14/22] drm/i915: Enable PPGTT command parser checks

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Various commands that access memory have a bit to determine whether
the graphics address specified in the command should use the GGTT or
PPGTT for translation. These checks ensure that the bit indicates
PPGTT translation.

Most of these checks use the existing bit-checking infrastructure.
The PIPE_CONTROL and MI_FLUSH_DW commands, however, are multi-function
commands. The GGTT/PPGTT bit is only relevant for certain uses of the
command. As such, this change also extends the bit-checking code to
include a condition mask and offset. If the condition mask is non-zero
then the parser only performs the bit check when the bits specified by
the condition mask/offset are also non-zero.

NOTE: At this point in the series PPGTT must be enabled for the parser
to work correctly. If it's not enabled, userspace will not be setting
the PPGTT bits the way the parser requires. There's a WARN_ON to detect
this case.

OTC-Tracker: AXIA-4631
Change-Id: I3f4c76b6734f1956ec47e698230f97d0998ff92b
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 110 ++---
 drivers/gpu/drm/i915/i915_drv.h|   6 ++
 drivers/gpu/drm/i915/i915_reg.h|   5 ++
 3 files changed, 111 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index b881d39..7b30a03 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -61,15 +61,33 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = 
{
CMD(  MI_REPORT_HEAD,   SMI,F,  1,  S  ),
CMD(  MI_SUSPEND_FLUSH, SMI,F,  1,  S  ),
CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   R  ),
-   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  S  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   R  ),
CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   W,
  .reg = { .offset = 1, .mask = 0x007C }   ),
CMD(  MI_UPDATE_GTT,SMI,   !F,  0xFF,   R  ),
-   CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   W,
- .reg = { .offset = 1, .mask = 0x007C }   ),
-   CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   W,
- .reg = { .offset = 1, .mask = 0x007C }   ),
+   CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   W | B,
+ .reg = { .offset = 1, .mask = 0x007C },
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
+   CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   W | B,
+ .reg = { .offset = 1, .mask = 0x007C },
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_BATCH_BUFFER_START,SMI,   !F,  0xFF,   S  ),
 };
 
@@ -79,7 +97,20 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = {
CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
CMD(  MI_PREDICATE, SMI,F,  1,  S  ),
CMD(  MI_TOPOLOGY_FILTER,   SMI,F,  1,  S  ),
-   CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  S  ),
+   CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
+   CMD(  MI_CONDITIONAL_BATCH_BUFFER_END,  SMI,   !F,  0xFF,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  GFX_OP_3DSTATE_VF_STATISTICS, S3D,F,  1,  S  ),
CMD(  PIPELINE_SELECT,  S3D,F,  1,  S  ),
CMD(  MEDIA_VFE_STATE,  S3D,   !F,  0x, B,
@@ -97,8 +128,15 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = 
{
.offset = 1,

[Intel-gfx] [RFC 07/22] drm/i915: Add support for rejecting commands during parsing

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Certain commands are always disallowed from userspace. This adds
the ability for the command parser to detect such commands and
reject batch buffers containing them.

OTC-Tracker: AXIA-4631
Change-Id: I000b0df4d441ec80b607a50d35e83418cdfd38b3
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 6 ++
 drivers/gpu/drm/i915/i915_drv.h| 6 --
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index b01628e..c64f640 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -368,6 +368,12 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
break;
}
 
+   if (desc-flags  CMD_DESC_REJECT) {
+   DRM_DEBUG_DRIVER(CMD: Rejected command: 0x%08X\n, 
*cmd);
+   ret = -EINVAL;
+   break;
+   }
+
cmd += length;
}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 81ef047..6ace856 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1743,10 +1743,12 @@ struct drm_i915_cmd_descriptor {
 * CMD_DESC_SKIP: The command is allowed but does not follow the
 *standard length encoding for the opcode range in
 *which it falls
+* CMD_DESC_REJECT: The command is never allowed
 */
u32 flags;
-#define CMD_DESC_FIXED (10)
-#define CMD_DESC_SKIP  (11)
+#define CMD_DESC_FIXED  (10)
+#define CMD_DESC_SKIP   (11)
+#define CMD_DESC_REJECT (12)
 
/**
 * The command's unique identification bits and the bitmask to get them.
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 10/22] drm/i915: Reject unsafe commands

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

These commands allow userspace to affect global state.

OTC-Tracker: AXIA-4631
Change-Id: I80a22c9cd83181790d2a9064e70ea09326691b66
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 99d15f3..8ee4cda 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -47,6 +47,7 @@
 #define SMFX STD_MFX_OPCODE_MASK
 #define F CMD_DESC_FIXED
 #define S CMD_DESC_SKIP
+#define R CMD_DESC_REJECT
 
 /*Command  Mask   Fixed Len   Action
  -- */
@@ -57,10 +58,11 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = 
{
CMD(  MI_ARB_CHECK, SMI,F,  1,  S  ),
CMD(  MI_REPORT_HEAD,   SMI,F,  1,  S  ),
CMD(  MI_SUSPEND_FLUSH, SMI,F,  1,  S  ),
-   CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   R  ),
CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  S  ),
-   CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   R  ),
CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_UPDATE_GTT,SMI,   !F,  0xFF,   R  ),
CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   S  ),
CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   S  ),
CMD(  MI_BATCH_BUFFER_START,SMI,   !F,  0xFF,   S  ),
@@ -68,8 +70,8 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = {
 
 static const struct drm_i915_cmd_descriptor render_cmds[] = {
CMD(  MI_FLUSH, SMI,F,  1,  S  ),
-   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  S  ),
-   CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
+   CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
CMD(  MI_PREDICATE, SMI,F,  1,  S  ),
CMD(  MI_TOPOLOGY_FILTER,   SMI,F,  1,  S  ),
CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  S  ),
@@ -94,12 +96,12 @@ static const struct drm_i915_cmd_descriptor 
hsw_render_cmds[] = {
 };
 
 static const struct drm_i915_cmd_descriptor video_cmds[] = {
-   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  S  ),
+   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
CMD(  MFX_WAIT, SMFX,  !F,  0x3F,   S  ),
 };
 
 static const struct drm_i915_cmd_descriptor blt_cmds[] = {
-   CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
CMD(  COLOR_BLT,S2D,   !F,  0x3F,   S  ),
CMD(  SRC_COPY_BLT, S2D,   !F,  0x3F,   S  ),
 };
@@ -111,6 +113,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
 #undef SMFX
 #undef F
 #undef S
+#undef R
 
 static const struct drm_i915_cmd_table gen7_render_cmds[] = {
{ common_cmds, ARRAY_SIZE(common_cmds) },
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 22/22] drm/i915: Enable command parsing by default

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

OTC-Tracker: AXIA-4631
Change-Id: I6747457e1fe7494bd42787af51198fcba398ad78
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 90d7db0..8c0d91b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -154,10 +154,10 @@ module_param_named(prefault_disable, 
i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
Disable page prefaulting for pread/pwrite/reloc 
(default:false). For developers only.);
 
-int i915_enable_cmd_parser __read_mostly = 0;
+int i915_enable_cmd_parser __read_mostly = 1;
 module_param_named(enable_cmd_parser, i915_enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
-   Enable command parsing (default: false));
+   Enable command parsing (default: true));
 
 static struct drm_driver driver;
 
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 21/22] drm/i915: Clean up command parser enable decision

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 30 +++---
 drivers/gpu/drm/i915/i915_drv.h|  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 8481ef0..b3525ce 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -581,6 +581,25 @@ finish:
return (u32*)addr;
 }
 
+int i915_needs_cmd_parser(struct intel_ring_buffer *ring)
+{
+   drm_i915_private_t *dev_priv =
+   (drm_i915_private_t *)ring-dev-dev_private;
+
+   /* No command tables indicates a platform without parsing */
+   if (!ring-cmd_tables)
+   return 0;
+
+   /* XXX: VLV is Gen7 and therefore has cmd_tables, but has PPGTT
+* disabled. That will cause all of the parser's PPGTT checks to
+* fail. For now, disable parsing when PPGTT is off.
+*/
+   if(!dev_priv-mm.aliasing_ppgtt)
+   return 0;
+
+   return i915_enable_cmd_parser;
+}
+
 #define LENGTH_BIAS 2
 
 int i915_parse_cmds(struct intel_ring_buffer *ring,
@@ -590,17 +609,6 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
int ret = 0;
u32 *cmd, *batch_base, *batch_end;
struct drm_i915_cmd_descriptor default_desc = { 0 };
-   drm_i915_private_t *dev_priv =
-   (drm_i915_private_t *)ring-dev-dev_private;
-
-   /* XXX: this breaks VLV, which is Gen7, but no PPGTT
-* Replace with better checks for when to call i915_parse_cmds?
-*/
-   WARN_ON(!dev_priv-mm.aliasing_ppgtt);
-
-   /* No command tables currently indicates a platform without parsing */
-   if (!ring-cmd_tables)
-   return 0;
 
batch_base = vmap_batch(batch_obj);
if (!batch_base) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 161d9cd..e7fc31c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2412,6 +2412,7 @@ const char *i915_cache_level_str(int type);
 
 /* i915_cmd_parser.c */
 void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring);
+int i915_needs_cmd_parser(struct intel_ring_buffer *ring);
 int i915_parse_cmds(struct intel_ring_buffer *ring,
struct drm_i915_gem_object *batch_obj,
u32 batch_start_offset);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 06975c7..7b1453e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1143,7 +1143,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
}
batch_obj-base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND;
 
-   if (i915_enable_cmd_parser  !(flags  I915_DISPATCH_SECURE)) {
+   if (i915_needs_cmd_parser(ring)  !(flags  I915_DISPATCH_SECURE)) {
ret = i915_parse_cmds(ring,
  batch_obj,
  args-batch_start_offset);
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 11/22] drm/i915: Add register whitelists for mesa

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

These registers are currently used by mesa for blitting and
for transform feedback extensions.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 33 +
 drivers/gpu/drm/i915/i915_reg.h|  9 +
 2 files changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 8ee4cda..1decff9 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -140,6 +140,34 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
{ blt_cmds, ARRAY_SIZE(blt_cmds) },
 };
 
+/* Register whitelists, sorted by increasing register offset.
+ *
+ * Some registers that userspace accesses are 64 bits. The register
+ * access commands only allow 32-bit accesses. Hence, we have to include
+ * entries for both halves of the 64-bit registers.
+ */
+
+static const u32 gen7_render_regs[] = {
+   CL_INVOCATION_COUNT,
+   CL_INVOCATION_COUNT + sizeof(u32),
+   GEN7_SO_NUM_PRIMS_WRITTEN(0),
+   GEN7_SO_NUM_PRIMS_WRITTEN(0) + sizeof(u32),
+   GEN7_SO_NUM_PRIMS_WRITTEN(1),
+   GEN7_SO_NUM_PRIMS_WRITTEN(1) + sizeof(u32),
+   GEN7_SO_NUM_PRIMS_WRITTEN(2),
+   GEN7_SO_NUM_PRIMS_WRITTEN(2) + sizeof(u32),
+   GEN7_SO_NUM_PRIMS_WRITTEN(3),
+   GEN7_SO_NUM_PRIMS_WRITTEN(3) + sizeof(u32),
+   GEN7_SO_WRITE_OFFSET(0),
+   GEN7_SO_WRITE_OFFSET(1),
+   GEN7_SO_WRITE_OFFSET(2),
+   GEN7_SO_WRITE_OFFSET(3),
+};
+
+static const u32 gen7_blt_regs[] = {
+   BCS_SWCTRL,
+};
+
 #define CLIENT_MASK  0xE000
 #define SUBCLIENT_MASK   0x1800
 #define MI_CLIENT0x
@@ -212,6 +240,9 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
}
 
+   ring-reg_table = gen7_render_regs;
+   ring-reg_count = ARRAY_SIZE(gen7_render_regs);
+
ring-get_cmd_length_mask = gen7_render_get_cmd_length_mask;
break;
case VCS:
@@ -222,6 +253,8 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
case BCS:
ring-cmd_tables = gen7_blt_cmds;
ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+   ring-reg_table = gen7_blt_regs;
+   ring-reg_count = ARRAY_SIZE(gen7_blt_regs);
ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
break;
case VECS:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5d52569..aa43624 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -371,6 +371,15 @@
 #define SRC_COPY_BLT  ((0x229)|(0x4322))
 
 /*
+ * Registers used only by the command parser
+ */
+#define BCS_SWCTRL 0x22200
+
+#define CL_INVOCATION_COUNT 0x2338
+/* There are the 4 64-bit counter registers, one for each stream output */
+#define GEN7_SO_NUM_PRIMS_WRITTEN(n) (0x5200 + (n) * 8)
+
+/*
  * Reset registers
  */
 #define DEBUG_RESET_I830   0x6070
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 16/22] drm/i915: Reject additional commands

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

MI_USER_INTERRUPT: We're rejecting other interrupt-generating mechanisms

MI_LOAD_SCAN_LINES_*: The DDX is the only user of these and protects
them behind the I915_EXEC_SECURE flag, so we probably shouldn't let others
use these.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 26 +++---
 drivers/gpu/drm/i915/i915_reg.h| 29 +++--
 2 files changed, 38 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index f32dc69..3ad2a1e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -55,7 +55,7 @@
  -- */
 static const struct drm_i915_cmd_descriptor common_cmds[] = {
CMD(  MI_NOOP,  SMI,F,  1,  S  ),
-   CMD(  MI_USER_INTERRUPT,SMI,F,  1,  S  ),
+   CMD(  MI_USER_INTERRUPT,SMI,F,  1,  R  ),
CMD(  MI_WAIT_FOR_EVENT,SMI,F,  1,  S  ),
CMD(  MI_ARB_CHECK, SMI,F,  1,  S  ),
CMD(  MI_REPORT_HEAD,   SMI,F,  1,  S  ),
@@ -145,6 +145,8 @@ static const struct drm_i915_cmd_descriptor 
hsw_render_cmds[] = {
CMD(  MI_RS_CONTROL,SMI,F,  1,  S  ),
CMD(  MI_URB_ATOMIC_ALLOC,  SMI,F,  1,  S  ),
CMD(  MI_RS_CONTEXT,SMI,F,  1,  S  ),
+   CMD(  MI_LOAD_SCAN_LINES_INCL,  SMI,   !F,  0x3F,   R  ),
+   CMD(  MI_LOAD_SCAN_LINES_EXCL,  SMI,   !F,  0x3F,   R  ),
CMD(  MI_MATH,  SMI,   !F,  0x3F,   S  ),
CMD(  MI_LOAD_REGISTER_REG, SMI,   !F,  0xFF,   W,
  .reg = { .offset = 1, .mask = 0x007C }   ),
@@ -245,6 +247,11 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
CMD(  SRC_COPY_BLT, S2D,   !F,  0x3F,   S  ),
 };
 
+static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = {
+   CMD(  MI_LOAD_SCAN_LINES_INCL,  SMI,   !F,  0x3F,   R  ),
+   CMD(  MI_LOAD_SCAN_LINES_EXCL,  SMI,   !F,  0x3F,   R  ),
+};
+
 #undef CMD
 #undef SMI
 #undef S3D
@@ -282,6 +289,12 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
{ blt_cmds, ARRAY_SIZE(blt_cmds) },
 };
 
+static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = {
+   { common_cmds, ARRAY_SIZE(common_cmds) },
+   { blt_cmds, ARRAY_SIZE(blt_cmds) },
+   { hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) },
+};
+
 /* Register whitelists, sorted by increasing register offset.
  *
  * Some registers that userspace accesses are 64 bits. The register
@@ -393,10 +406,17 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
ring-get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
case BCS:
-   ring-cmd_tables = gen7_blt_cmds;
-   ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+   if (IS_HASWELL(ring-dev)) {
+   ring-cmd_tables = hsw_blt_ring_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds);
+   } else {
+   ring-cmd_tables = gen7_blt_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+   }
+
ring-reg_table = gen7_blt_regs;
ring-reg_count = ARRAY_SIZE(gen7_blt_regs);
+
ring-get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
break;
case VECS:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 919d1a6..232ad0c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -345,20 +345,21 @@
 /*
  * Commands used only by the command parser
  */
-#define MI_SET_PREDICATE   MI_INSTR(0x01, 0)
-#define MI_ARB_CHECK   MI_INSTR(0x05, 0)
-#define MI_RS_CONTROL  MI_INSTR(0x06, 0)
-#define MI_URB_ATOMIC_ALLOCMI_INSTR(0x09, 0)
-#define MI_PREDICATE   MI_INSTR(0x0C, 0)
-#define MI_RS_CONTEXT  MI_INSTR(0x0F, 0)
-#define MI_TOPOLOGY_FILTER MI_INSTR(0x0D, 0)
-#define MI_MATHMI_INSTR(0x1A, 0)
-#define MI_UPDATE_GTT  MI_INSTR(0x23, 0)
-#define MI_CLFLUSH MI_INSTR(0x27, 0)
-#define MI_LOAD_REGISTER_MEM   MI_INSTR(0x29, 0)
-#define MI_LOAD_REGISTER_REG   MI_INSTR(0x2A, 0)
-#define MI_LOAD_URB_MEMMI_INSTR(0x2C, 0)
-#define MI_STORE_URB_MEM   MI_INSTR(0x2D, 0)
+#define MI_SET_PREDICATEMI_INSTR(0x01, 0)
+#define MI_ARB_CHECKMI_INSTR(0x05, 0)
+#define MI_RS_CONTROL   MI_INSTR(0x06, 0)
+#define MI_URB_ATOMIC_ALLOC MI_INSTR(0x09, 0)
+#define

[Intel-gfx] [RFC 12/22] drm/i915: Enable register whitelist checks

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

MI_STORE_REGISTER_MEM and MI_LOAD_REGISTER_* commands allow userspace
access to registers. Only certain registers should be allowed for such
access, so enable checking for those commands.

OTC-Tracker: AXIA-4631
Change-Id: Ie614a2f0eb2e5917de809e5a17957175d24cc44f
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 1decff9..df5424b 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -48,6 +48,7 @@
 #define F CMD_DESC_FIXED
 #define S CMD_DESC_SKIP
 #define R CMD_DESC_REJECT
+#define W CMD_DESC_REGISTER
 
 /*Command  Mask   Fixed Len   Action
  -- */
@@ -61,10 +62,13 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = 
{
CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   R  ),
CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  S  ),
CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   R  ),
-   CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   W,
+ .reg = { .offset = 1, .mask = 0x007C }   ),
CMD(  MI_UPDATE_GTT,SMI,   !F,  0xFF,   R  ),
-   CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   S  ),
-   CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   W,
+ .reg = { .offset = 1, .mask = 0x007C }   ),
+   CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   W,
+ .reg = { .offset = 1, .mask = 0x007C }   ),
CMD(  MI_BATCH_BUFFER_START,SMI,   !F,  0xFF,   S  ),
 };
 
@@ -88,7 +92,8 @@ static const struct drm_i915_cmd_descriptor hsw_render_cmds[] 
= {
CMD(  MI_URB_ATOMIC_ALLOC,  SMI,F,  1,  S  ),
CMD(  MI_RS_CONTEXT,SMI,F,  1,  S  ),
CMD(  MI_MATH,  SMI,   !F,  0x3F,   S  ),
-   CMD(  MI_LOAD_REGISTER_REG, SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_LOAD_REGISTER_REG, SMI,   !F,  0xFF,   W,
+ .reg = { .offset = 1, .mask = 0x007C }   ),
CMD(  MI_LOAD_URB_MEM,  SMI,   !F,  0xFF,   S  ),
CMD(  MI_STORE_URB_MEM, SMI,   !F,  0xFF,   S  ),
CMD(  GFX_OP_3DSTATE_DX9_CONSTANTF_VS,  S3D,   !F,  0x7FF,  S  ),
@@ -114,6 +119,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
 #undef F
 #undef S
 #undef R
+#undef W
 
 static const struct drm_i915_cmd_table gen7_render_cmds[] = {
{ common_cmds, ARRAY_SIZE(common_cmds) },
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Certain OpenGL features (e.g. transform feedback, performance monitoring)
require userspace code to submit batches containing commands such as
MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
generations of the hardware will noop these commands in unsecure batches
(which includes all userspace batches submitted via i915) even though the
commands may be safe and represent the intended programming model of the device.

This series introduces a software command parser similar in operation to the
command parsing done in hardware for unsecure batches. However, the software
parser allows some operations that would be noop'd by hardware, if the parser
determines the operation is safe, and submits the batch as secure to prevent
hardware parsing. Currently the series implements this on IVB and HSW.

The series is divided into several phases:

patches 01-09: These implement infrastructure and the command parsing algorithm,
   all behind a module parameter. I expect some discussion and
   rework, but hopefully there's nothing too controversial.
patches 10-17: These define the checks performed by the parser.
   I expect much discussion :)
patches 18-20: In a final pass over the command checks, I found some issues with
   the definitions. They looked painful to rebase in, so I've added
   them here.
patches 21-22: These enable the parser by default. It runs on all batches except
   those that set the I915_EXEC_SECURE flag in the execbuffer2 call.

There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very
basic and do not test all of the commands used by the parser on the assumption
that I'm likely to make the same mistakes in both the parser and the test.

I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few
days ago), and generally used an Ubuntu 13.10 IVB system with the parser
running. Aside from a failure described below, I don't think there are any
regressions. That is, piglit claims some regressions, but from manually running
the tests I think these are false positives. However, I could use help in
getting broader testing, particularly around performance. In general, I see less
than 3% performance impact on HSW, with more like 10% impact for pathological
batch sizes. But we'll certainly want to run relevant benchmarks beyond what
I've done.

At this point there are a couple of known issues and potential improvements.

1) VLV. The parser is currently disabled for VLV. One type of check performed by
   the parser is that commands which access memory do so via PPGTT. VLV does not
   have PPGTT enabled at this time. I chose to implement the PPGTT checks via
   generic bit checking infrastructure in the parser, so they are not easily
   disabled for VLV. For now, I'm disabling parsing altogether in the hope that
   PPGTT can be enabled for VLV in the near future.
2) Coherency. I've found two types of coherency issues when reading the batch
   buffer from the CPU during execbuffer2. Looking for help with both issues.
i. First, the i-g-t test gem_cpu_reloc blits to a batch buffer and the
   parser isn't properly waiting for the operation to complete before
   parsing. I tried adding i915_gem_object_sync(batch_obj, [ring|NULL])
   but that actually caused more failures.
   ii. Second, on VLV, I've seen cache coherency issues when userspace writes
   the batch via pwrite fast path before calling execbuffer2. The parser
   reads stale data. This works fine on IVB and HSW, so I believe it's an
   LLC vs. non-LLC issue. I'm just unclear on what the correct flushing or
   synchronization is for this scenario.
3) 2nd-level batches. The parser currently allows MI_BATCH_BUFFER_START commands
   in userspace batches without parsing them. The media driver uses 2nd-level
   batches, so a solution is required. I have some ideas but don't want to delay
   the review process for what I have so far. It may be that the 2nd-level
   parsing is only needed for VCS and the current code (plus rejecting BBS)
   would be sufficient for RCS.
4) Command buffer copy. To avoid CPU modifications to buffers after parsing, and
   to avoid GPU modifications to buffers via EUs or commands in the batch, we
   should copy the userspace batch buffer to memory that userspace does not
   have access to, map it into GGTT, and execute that batch buffer. I have a
   sense of how to do this for 1st-level batches, but it would need changes to
   tie in with the 2nd-level batch parsing I think, so I've again held off.

Brad Volkin (22):
  drm/i915: Add data structures for command parser
  drm/i915: Initial command parser table definitions
  drm/i915: Hook command parser tables up to rings
  drm/i915: Add per-ring command length decode functions
  drm/i915: Implement command parsing
  drm/i915: Add a HAS_CMD_PARSER getparam
  drm/i915: Add support for

[Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

At this point the parser just implements the mechanics of finding
commands in the batch buffer and looking them up in the parser tables.

It rejects poorly formed batches (e.g. no MI_BATCH_BUFFER_END command,
containing unrecognized commands, etc) but does no other checks.

Optionally enabled by a module parameter, currently defaulting to off.
We'll enable by default at the end of the series.

The code to look up commands in the per-ring tables implements a linear
search. The tables are small so in practice this does not appear to cause
a performance issue. However, the tables are sorted by command opcode such
that we can move to something like binary search if this code becomes a
bottleneck in the future.

OTC-Tracker: AXIA-4631
Change-Id: If71f50d28578d27325c3438abf4b229c7cf695cd
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 148 +
 drivers/gpu/drm/i915/i915_drv.c|   5 +
 drivers/gpu/drm/i915/i915_drv.h|   4 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  15 +++
 4 files changed, 172 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 247d530..b01628e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -232,3 +232,151 @@ void i915_cmd_parser_init_ring(struct intel_ring_buffer 
*ring)
  ring-id);
}
 }
+
+static const struct drm_i915_cmd_descriptor*
+find_cmd_in_table(const struct drm_i915_cmd_table *table,
+ u32 cmd_header)
+{
+   int i;
+
+   for (i = 0; i  table-count; i++) {
+   const struct drm_i915_cmd_descriptor *desc = table-table[i];
+   u32 masked_cmd = desc-cmd.mask  cmd_header;
+   u32 masked_value = desc-cmd.value  desc-cmd.mask;
+
+   if (masked_cmd == masked_value)
+   return desc;
+   }
+
+   return NULL;
+}
+
+/* Returns a pointer to a descriptor for the command specified by cmd_header.
+ *
+ * The caller must supply space for a default descriptor via the default_desc
+ * parameter. If no descriptor for the specified command exists in the ring's
+ * command parser tables, this function fills in default_desc based on the
+ * ring's default length encoding and returns default_desc.
+ */
+static const struct drm_i915_cmd_descriptor*
+find_cmd(struct intel_ring_buffer *ring,
+u32 cmd_header,
+struct drm_i915_cmd_descriptor *default_desc)
+{
+   u32 mask;
+   int i;
+
+   for (i = 0; i  ring-cmd_table_count; i++) {
+   const struct drm_i915_cmd_descriptor *desc;
+
+   desc = find_cmd_in_table(ring-cmd_tables[i], cmd_header);
+   if (desc)
+   return desc;
+   }
+
+   mask = ring-get_cmd_length_mask(cmd_header);
+   if (!mask)
+   return NULL;
+
+   BUG_ON(!default_desc);
+   default_desc-flags = CMD_DESC_SKIP;
+   default_desc-length.mask = mask;
+
+   return default_desc;
+}
+
+static u32 *vmap_batch(struct drm_i915_gem_object *obj)
+{
+   int i;
+   void *addr = NULL;
+   struct sg_page_iter sg_iter;
+   struct page **pages;
+
+   pages = drm_malloc_ab(obj-base.size  PAGE_SHIFT, sizeof(*pages));
+   if (pages == NULL) {
+   DRM_DEBUG(Failed to get space for pages\n);
+   goto finish;
+   }
+
+   i = 0;
+   for_each_sg_page(obj-pages-sgl, sg_iter, obj-pages-nents, 0) {
+   pages[i] = sg_page_iter_page(sg_iter);
+   i++;
+   }
+
+   addr = vmap(pages, i, 0, PAGE_KERNEL);
+   if (addr == NULL) {
+   DRM_DEBUG(Failed to vmap pages\n);
+   goto finish;
+   }
+
+finish:
+   if (pages)
+   drm_free_large(pages);
+   return (u32*)addr;
+}
+
+#define LENGTH_BIAS 2
+
+int i915_parse_cmds(struct intel_ring_buffer *ring,
+   struct drm_i915_gem_object *batch_obj,
+   u32 batch_start_offset)
+{
+   int ret = 0;
+   u32 *cmd, *batch_base, *batch_end;
+   struct drm_i915_cmd_descriptor default_desc = { 0 };
+
+   /* No command tables currently indicates a platform without parsing */
+   if (!ring-cmd_tables)
+   return 0;
+
+   batch_base = vmap_batch(batch_obj);
+   if (!batch_base) {
+   DRM_DEBUG_DRIVER(CMD: Failed to vmap batch\n);
+   return -ENOMEM;
+   }
+
+   cmd = batch_base + (batch_start_offset / sizeof(*cmd));
+   batch_end = cmd + (batch_obj-base.size / sizeof(*batch_end));
+
+   while (cmd  batch_end) {
+   const struct drm_i915_cmd_descriptor *desc;
+   u32 length;
+
+   if (*cmd == MI_BATCH_BUFFER_END)
+   break;
+
+   desc = find_cmd(ring, *cmd,

[Intel-gfx] [RFC 02/22] drm/i915: Initial command parser table definitions

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Add command tables defining irregular length commands for each ring.
This requires a few new command opcode definitions.

OTC-Tracker: AXIA-4631
Change-Id: I064bceb457e15f46928058352afe76d918c58ef5
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/Makefile  |   3 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c | 138 +
 drivers/gpu/drm/i915/i915_reg.h|  34 
 3 files changed, 174 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_cmd_parser.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 41838ea..7b7450b 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -47,7 +47,8 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o \
  dvo_tfp410.o \
  dvo_sil164.o \
  dvo_ns2501.o \
- i915_gem_dmabuf.o
+ i915_gem_dmabuf.o \
+ i915_cmd_parser.o
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
new file mode 100644
index 000..deb77c8c
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Brad Volkin bradley.d.vol...@intel.com
+ *
+ */
+
+#include i915_drv.h
+
+#define STD_MI_OPCODE_MASK  0xFF80
+#define STD_3D_OPCODE_MASK  0x
+#define STD_2D_OPCODE_MASK  0xFFC0
+#define STD_MFX_OPCODE_MASK 0x
+
+#define CMD(op, opm, f, lm, fl, ...)   \
+   {   \
+   .flags = (fl) | (f),\
+   .cmd = { (op), (opm) }, \
+   .length = { (lm) }, \
+   __VA_ARGS__ \
+   }
+
+/* Convenience macros to compress the tables */
+#define SMI STD_MI_OPCODE_MASK
+#define S3D STD_3D_OPCODE_MASK
+#define S2D STD_2D_OPCODE_MASK
+#define SMFX STD_MFX_OPCODE_MASK
+#define F CMD_DESC_FIXED
+#define S CMD_DESC_SKIP
+
+/*Command  Mask   Fixed Len   Action
+ -- */
+static const struct drm_i915_cmd_descriptor common_cmds[] = {
+   CMD(  MI_NOOP,  SMI,F,  1,  S  ),
+   CMD(  MI_USER_INTERRUPT,SMI,F,  1,  S  ),
+   CMD(  MI_WAIT_FOR_EVENT,SMI,F,  1,  S  ),
+   CMD(  MI_ARB_CHECK, SMI,F,  1,  S  ),
+   CMD(  MI_REPORT_HEAD,   SMI,F,  1,  S  ),
+   CMD(  MI_SUSPEND_FLUSH, SMI,F,  1,  S  ),
+   CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  S  ),
+   CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_STORE_REGISTER_MEM(1), SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_LOAD_REGISTER_MEM, SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_BATCH_BUFFER_START,SMI,   !F,  0xFF,   S  ),
+};
+
+static const struct drm_i915_cmd_descriptor render_cmds[] = {
+   CMD(  MI_FLUSH, SMI,F,  1,  S  ),
+   CMD(  MI_ARB_ON_OFF,SMI,F,  1,  S  ),
+   CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   S  ),
+   CMD(  MI_PREDICATE, SMI,F,  1,  S  ),
+   CMD(  MI_TOPOLOGY_FILTER,   SMI,F,  1,  S  ),
+   CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  S  ),
+   CMD(  GFX_OP_3DSTATE_VF_STATISTICS, S3D,F,  1,  S  ),
+

[Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

The length mask is different for each ring and the size can vary,
so we should duplicate the definition with the correct encoding
for each ring.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 35 +++---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index adc7d94..8481ef0 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -61,13 +61,6 @@ static const struct drm_i915_cmd_descriptor common_cmds[] = {
CMD(  MI_REPORT_HEAD,   SMI,F,  1,  S  ),
CMD(  MI_SUSPEND_FLUSH, SMI,F,  1,  S  ),
CMD(  MI_SEMAPHORE_MBOX,SMI,   !F,  0xFF,   R  ),
-   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3FF,  B,
- .bits = {{
-   .offset = 0,
-   .mask = MI_GLOBAL_GTT,
-   .expected = 0
- }},
- .bits_count = 1  ),
CMD(  MI_STORE_DWORD_INDEX, SMI,   !F,  0xFF,   R  ),
CMD(  MI_LOAD_REGISTER_IMM(1),  SMI,   !F,  0xFF,   W,
  .reg = { .offset = 1, .mask = 0x007C }   ),
@@ -97,6 +90,13 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = {
CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
CMD(  MI_PREDICATE, SMI,F,  1,  S  ),
CMD(  MI_TOPOLOGY_FILTER,   SMI,F,  1,  S  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x3F,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  B,
  .bits = {{
.offset = 0,
@@ -165,6 +165,13 @@ static const struct drm_i915_cmd_descriptor 
hsw_render_cmds[] = {
 
 static const struct drm_i915_cmd_descriptor video_cmds[] = {
CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0xFF,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
  .bits = {{
.offset = 0,
@@ -202,6 +209,13 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = 
{
 
 static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0xFF,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
  .bits = {{
.offset = 0,
@@ -234,6 +248,13 @@ static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
 
 static const struct drm_i915_cmd_descriptor blt_cmds[] = {
CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
+   CMD(  MI_STORE_DWORD_IMM,   SMI,   !F,  0x1FF,  B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_GLOBAL_GTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
  .bits = {{
.offset = 0,
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 17/22] drm/i915: Add parser data for perf monitoring GL extensions

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

These registers and commands will be used by mesa for the
GL_AMD_performance_monitor extension.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 27 +++
 drivers/gpu/drm/i915/i915_reg.h| 13 +
 2 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 3ad2a1e..c8426af 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -104,6 +104,13 @@ static const struct drm_i915_cmd_descriptor render_cmds[] 
= {
.expected = 0
  }},
  .bits_count = 1  ),
+   CMD(  MI_REPORT_PERF_COUNT, SMI,   !F,  0x3F,   B,
+ .bits = {{
+   .offset = 1,
+   .mask = MI_REPORT_PERF_COUNT_GGTT,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MI_CONDITIONAL_BATCH_BUFFER_END,  SMI,   !F,  0xFF,   B,
  .bits = {{
.offset = 0,
@@ -303,8 +310,28 @@ static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] 
= {
  */
 
 static const u32 gen7_render_regs[] = {
+   HS_INVOCATION_COUNT,
+   HS_INVOCATION_COUNT + sizeof(u32),
+   DS_INVOCATION_COUNT,
+   DS_INVOCATION_COUNT + sizeof(u32),
+   IA_VERTICES_COUNT,
+   IA_VERTICES_COUNT + sizeof(u32),
+   IA_PRIMITIVES_COUNT,
+   IA_PRIMITIVES_COUNT + sizeof(u32),
+   VS_INVOCATION_COUNT,
+   VS_INVOCATION_COUNT + sizeof(u32),
+   GS_INVOCATION_COUNT,
+   GS_INVOCATION_COUNT + sizeof(u32),
+   GS_PRIMITIVES_COUNT,
+   GS_PRIMITIVES_COUNT + sizeof(u32),
CL_INVOCATION_COUNT,
CL_INVOCATION_COUNT + sizeof(u32),
+   CL_PRIMITIVES_COUNT,
+   CL_PRIMITIVES_COUNT + sizeof(u32),
+   PS_INVOCATION_COUNT,
+   PS_INVOCATION_COUNT + sizeof(u32),
+   PS_DEPTH_COUNT,
+   PS_DEPTH_COUNT + sizeof(u32),
GEN7_SO_NUM_PRIMS_WRITTEN(0),
GEN7_SO_NUM_PRIMS_WRITTEN(0) + sizeof(u32),
GEN7_SO_NUM_PRIMS_WRITTEN(1),
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 232ad0c..4dd5541 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -356,6 +356,8 @@
 #define MI_MATH MI_INSTR(0x1A, 0)
 #define MI_UPDATE_GTT   MI_INSTR(0x23, 0)
 #define MI_CLFLUSH  MI_INSTR(0x27, 0)
+#define MI_REPORT_PERF_COUNTMI_INSTR(0x28, 0)
+#define   MI_REPORT_PERF_COUNT_GGTT (10)
 #define MI_LOAD_REGISTER_MEMMI_INSTR(0x29, 0)
 #define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 0)
 #define MI_LOAD_URB_MEM MI_INSTR(0x2C, 0)
@@ -385,7 +387,18 @@
  */
 #define BCS_SWCTRL 0x22200
 
+#define HS_INVOCATION_COUNT 0x2300
+#define DS_INVOCATION_COUNT 0x2308
+#define IA_VERTICES_COUNT   0x2310
+#define IA_PRIMITIVES_COUNT 0x2318
+#define VS_INVOCATION_COUNT 0x2320
+#define GS_INVOCATION_COUNT 0x2328
+#define GS_PRIMITIVES_COUNT 0x2330
 #define CL_INVOCATION_COUNT 0x2338
+#define CL_PRIMITIVES_COUNT 0x2340
+#define PS_INVOCATION_COUNT 0x2348
+#define PS_DEPTH_COUNT  0x2350
+
 /* There are the 4 64-bit counter registers, one for each stream output */
 #define GEN7_SO_NUM_PRIMS_WRITTEN(n) (0x5200 + (n) * 8)
 
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 08/22] drm/i915: Add support for checking register accesses

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Some OpenGL/media features require userspace to perform register
accesses from batch buffers on Gen hardware.

To enable this, each ring gets a whitelist of registers that userspace
may access from a batch buffer. With this patch, no whitelists are defined,
so no access is allowed.

OTC-Tracker: AXIA-4631
Change-Id: Ibf607ec3b1df0076f6acbd9aee34a2ee48cfac6b
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 26 ++
 drivers/gpu/drm/i915/i915_drv.h | 19 ---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  6 ++
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index c64f640..2dbca01 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -285,6 +285,20 @@ find_cmd(struct intel_ring_buffer *ring,
return default_desc;
 }
 
+static int valid_reg(const u32 *table, int count, u32 addr)
+{
+   if (table  count != 0) {
+   int i;
+
+   for (i = 0; i  count; i++) {
+   if (table[i] == addr)
+   return 1;
+   }
+   }
+
+   return 0;
+}
+
 static u32 *vmap_batch(struct drm_i915_gem_object *obj)
 {
int i;
@@ -374,6 +388,18 @@ int i915_parse_cmds(struct intel_ring_buffer *ring,
break;
}
 
+   if (desc-flags  CMD_DESC_REGISTER) {
+   u32 reg_addr = cmd[desc-reg.offset]  desc-reg.mask;
+
+   if (!valid_reg(ring-reg_table,
+  ring-reg_count, reg_addr)) {
+   DRM_DEBUG_DRIVER(CMD: Rejected register 0x%08X 
in command: 0x%08X (ring=%d)\n,
+reg_addr, *cmd, ring-id);
+   ret = -EINVAL;
+   break;
+   }
+   }
+
cmd += length;
}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6ace856..83b6031 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1744,11 +1744,14 @@ struct drm_i915_cmd_descriptor {
 *standard length encoding for the opcode range in
 *which it falls
 * CMD_DESC_REJECT: The command is never allowed
+* CMD_DESC_REGISTER: The command should be checked against the
+*register whitelist for the appropriate ring
 */
u32 flags;
-#define CMD_DESC_FIXED  (10)
-#define CMD_DESC_SKIP   (11)
-#define CMD_DESC_REJECT (12)
+#define CMD_DESC_FIXED(10)
+#define CMD_DESC_SKIP (11)
+#define CMD_DESC_REJECT   (12)
+#define CMD_DESC_REGISTER (13)
 
/**
 * The command's unique identification bits and the bitmask to get them.
@@ -1771,6 +1774,16 @@ struct drm_i915_cmd_descriptor {
u32 fixed;
u32 mask;
} length;
+
+   /**
+* Describes where to find a register address in the command to check
+* against the ring's register whitelist. Only valid if flags has the
+* CMD_DESC_REGISTER bit set.
+*/
+   struct {
+   u32 offset;
+   u32 mask;
+   } reg;
 };
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8e71b59..b898105a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -171,6 +171,12 @@ struct  intel_ring_buffer {
int cmd_table_count;
 
/**
+* Table of registers allowed in commands that read/write registers.
+*/
+   const u32 *reg_table;
+   int reg_count;
+
+   /**
 * Returns the bitmask for the length field of the specified command.
 * Return 0 for an unrecognized/invalid command.
 *
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 13/22] drm/i915: Enable bit checking for some commands

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

These checks prevent userspace from using certain commands to
access registers or generate interrupts.

OTC-Tracker: AXIA-4631
Change-Id: Ic6367ae98272495ba874c22abd4824fbced0abca
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 41 ++
 drivers/gpu/drm/i915/i915_reg.h|  3 +++
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index df5424b..b881d39 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -49,6 +49,7 @@
 #define S CMD_DESC_SKIP
 #define R CMD_DESC_REJECT
 #define W CMD_DESC_REGISTER
+#define B CMD_DESC_BITMASK
 
 /*Command  Mask   Fixed Len   Action
  -- */
@@ -81,9 +82,23 @@ static const struct drm_i915_cmd_descriptor render_cmds[] = {
CMD(  MI_CLFLUSH,   SMI,   !F,  0x3FF,  S  ),
CMD(  GFX_OP_3DSTATE_VF_STATISTICS, S3D,F,  1,  S  ),
CMD(  PIPELINE_SELECT,  S3D,F,  1,  S  ),
+   CMD(  MEDIA_VFE_STATE,  S3D,   !F,  0x, B,
+ .bits = {{
+   .offset = 2,
+   .mask = MEDIA_VFE_STATE_MMIO_ACCESS_MASK,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  GPGPU_OBJECT, S3D,   !F,  0xFF,   S  ),
CMD(  GPGPU_WALKER, S3D,   !F,  0xFF,   S  ),
CMD(  GFX_OP_3DSTATE_SO_DECL_LIST,  S3D,   !F,  0x1FF,  S  ),
+   CMD(  GFX_OP_PIPE_CONTROL(5),   S3D,   !F,  0xFF,   B,
+ .bits = {{
+   .offset = 1,
+   .mask = (PIPE_CONTROL_MMIO_WRITE | PIPE_CONTROL_NOTIFY),
+   .expected = 0
+ }},
+ .bits_count = 1  ),
 };
 
 static const struct drm_i915_cmd_descriptor hsw_render_cmds[] = {
@@ -102,11 +117,35 @@ static const struct drm_i915_cmd_descriptor 
hsw_render_cmds[] = {
 
 static const struct drm_i915_cmd_descriptor video_cmds[] = {
CMD(  MI_ARB_ON_OFF,SMI,F,  1,  R  ),
+   CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_FLUSH_DW_NOTIFY,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  MFX_WAIT, SMFX,  !F,  0x3F,   S  ),
 };
 
+static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
+   CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_FLUSH_DW_NOTIFY,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
+};
+
 static const struct drm_i915_cmd_descriptor blt_cmds[] = {
CMD(  MI_DISPLAY_FLIP,  SMI,   !F,  0xFF,   R  ),
+   CMD(  MI_FLUSH_DW,  SMI,   !F,  0x3F,   B,
+ .bits = {{
+   .offset = 0,
+   .mask = MI_FLUSH_DW_NOTIFY,
+   .expected = 0
+ }},
+ .bits_count = 1  ),
CMD(  COLOR_BLT,S2D,   !F,  0x3F,   S  ),
CMD(  SRC_COPY_BLT, S2D,   !F,  0x3F,   S  ),
 };
@@ -120,6 +159,7 @@ static const struct drm_i915_cmd_descriptor blt_cmds[] = {
 #undef S
 #undef R
 #undef W
+#undef B
 
 static const struct drm_i915_cmd_table gen7_render_cmds[] = {
{ common_cmds, ARRAY_SIZE(common_cmds) },
@@ -139,6 +179,7 @@ static const struct drm_i915_cmd_table gen7_video_cmds[] = {
 
 static const struct drm_i915_cmd_table hsw_vebox_cmds[] = {
{ common_cmds, ARRAY_SIZE(common_cmds) },
+   { vecs_cmds, ARRAY_SIZE(vecs_cmds) },
 };
 
 static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index aa43624..0e504b9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -240,6 +240,7 @@
 #define   MI_FLUSH_DW_STORE_INDEX  (121)
 #define   MI_INVALIDATE_TLB(118)
 #define   MI_FLUSH_DW_OP_STOREDW   (114)
+#define   MI_FLUSH_DW_NOTIFY   (18)
 #define   MI_INVALIDATE_BSD(17)
 #define   MI_FLUSH_DW_USE_GTT  (12)
 #define   MI_FLUSH_DW_USE_PPGTT(02)
@@ -318,6 +319,7 @@
 #define   DISPLAY_PLANE_B   (120)
 #define GFX_OP_PIPE_CONTROL(len)

[Intel-gfx] [RFC 03/22] drm/i915: Hook command parser tables up to rings

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

OTC-Tracker: AXIA-4631
Change-Id: Id178f67338d00c23ca4e8fc9499313dba93d2c5c
Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c  | 34 +
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  7 +++
 4 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index deb77c8c..014e661 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -136,3 +136,37 @@ static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
{ common_cmds, ARRAY_SIZE(common_cmds) },
{ blt_cmds, ARRAY_SIZE(blt_cmds) },
 };
+
+void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring)
+{
+   if (!IS_GEN7(ring-dev))
+   return;
+
+   switch (ring-id) {
+   case RCS:
+   if (IS_HASWELL(ring-dev)) {
+   ring-cmd_tables = hsw_render_ring_cmds;
+   ring-cmd_table_count =
+   ARRAY_SIZE(hsw_render_ring_cmds);
+   } else {
+   ring-cmd_tables = gen7_render_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
+   }
+   break;
+   case VCS:
+   ring-cmd_tables = gen7_video_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(gen7_video_cmds);
+   break;
+   case BCS:
+   ring-cmd_tables = gen7_blt_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+   break;
+   case VECS:
+   ring-cmd_tables = hsw_vebox_cmds;
+   ring-cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds);
+   break;
+   default:
+   DRM_DEBUG(CMD: cmd_parser_init with unknown ring: %d\n,
+ ring-id);
+   }
+}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ff1e201..7de4e59 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2372,6 +2372,9 @@ void i915_destroy_error_state(struct drm_device *dev);
 void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone);
 const char *i915_cache_level_str(int type);
 
+/* i915_cmd_parser.c */
+void i915_cmd_parser_init_ring(struct intel_ring_buffer *ring);
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_device *dev);
 extern int i915_restore_state(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 69589e4..553b889 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1382,6 +1382,8 @@ static int intel_init_ring_buffer(struct drm_device *dev,
if (IS_I830(ring-dev) || IS_845G(ring-dev))
ring-effective_size -= 128;
 
+   i915_cmd_parser_init_ring(ring);
+
return 0;
 
 err_unmap:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 71a73f4..67305d3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -162,6 +162,13 @@ struct  intel_ring_buffer {
u32 gtt_offset;
volatile u32 *cpu_page;
} scratch;
+
+   /**
+* Tables of commands the command parser needs to know about
+* for this ring.
+*/
+   const struct drm_i915_cmd_table *cmd_tables;
+   int cmd_table_count;
 };
 
 static inline bool
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] intel: Add HAS_CMD_PARSER parameter definition

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 include/drm/i915_drm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index c1914d6..525dc27 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -306,6 +306,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_ALIASING_PPGTT   18
 #define I915_PARAM_HAS_WAIT_TIMEOUT 19
 #define I915_PARAM_HAS_VEBOX22
+#define I915_PARAM_HAS_CMD_PARSER   28
 
 typedef struct drm_i915_getparam {
int param;
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 1/4] tests: Add a test for the command parser

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Start with a simple testcase that should pass.

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 tests/.gitignore   |   1 +
 tests/Makefile.sources |   1 +
 tests/gem_exec_parse.c | 140 +
 3 files changed, 142 insertions(+)
 create mode 100644 tests/gem_exec_parse.c

diff --git a/tests/.gitignore b/tests/.gitignore
index e835a5a..ec7f526 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -35,6 +35,7 @@ gem_exec_blt
 gem_exec_faulting_reloc
 gem_exec_lut_handle
 gem_exec_nop
+gem_exec_parse
 gem_fenced_exec_thrash
 gem_fence_thrash
 gem_flink
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index a02b93d..c90e5aa 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -25,6 +25,7 @@ TESTS_progs_M = \
gem_evict_everything \
gem_exec_bad_domains \
gem_exec_nop \
+   gem_exec_parse \
gem_fenced_exec_thrash \
gem_fence_thrash \
gem_flink \
diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
new file mode 100644
index 000..a17929f
--- /dev/null
+++ b/tests/gem_exec_parse.c
@@ -0,0 +1,140 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include stdlib.h
+#include stdint.h
+#include stdio.h
+#include drm.h
+#include i915_drm.h
+#include drmtest.h
+
+#ifndef I915_PARAM_HAS_CMD_PARSER
+#define I915_PARAM_HAS_CMD_PARSER   28
+#endif
+
+static int exec_batch_patched(int fd, uint32_t cmd_bo, uint32_t *cmds,
+ int size, int patch_offset, uint64_t 
expected_value)
+{
+   struct drm_i915_gem_execbuffer2 execbuf;
+   struct drm_i915_gem_exec_object2 objs[2];
+   struct drm_i915_gem_relocation_entry reloc[1];
+
+   uint32_t target_bo = gem_create(fd, 4096);
+   uint64_t actual_value = 0;
+
+   gem_write(fd, cmd_bo, 0, cmds, size);
+
+   reloc[0].offset = patch_offset;
+   reloc[0].delta = 0;
+   reloc[0].target_handle = target_bo;
+   reloc[0].read_domains = I915_GEM_DOMAIN_RENDER;
+   reloc[0].write_domain = I915_GEM_DOMAIN_RENDER;
+   reloc[0].presumed_offset = 0;
+
+   objs[0].handle = target_bo;
+   objs[0].relocation_count = 0;
+   objs[0].relocs_ptr = 0;
+   objs[0].alignment = 0;
+   objs[0].offset = 0;
+   objs[0].flags = 0;
+   objs[0].rsvd1 = 0;
+   objs[0].rsvd2 = 0;
+
+   objs[1].handle = cmd_bo;
+   objs[1].relocation_count = 1;
+   objs[1].relocs_ptr = (uintptr_t)reloc;
+   objs[1].alignment = 0;
+   objs[1].offset = 0;
+   objs[1].flags = 0;
+   objs[1].rsvd1 = 0;
+   objs[1].rsvd2 = 0;
+
+   execbuf.buffers_ptr = (uintptr_t)objs;
+   execbuf.buffer_count = 2;
+   execbuf.batch_start_offset = 0;
+   execbuf.batch_len = size;
+   execbuf.cliprects_ptr = 0;
+   execbuf.num_cliprects = 0;
+   execbuf.DR1 = 0;
+   execbuf.DR4 = 0;
+   execbuf.flags = I915_EXEC_RENDER;
+   i915_execbuffer2_set_context_id(execbuf, 0);
+   execbuf.rsvd2 = 0;
+
+   gem_execbuf(fd, execbuf);
+   gem_sync(fd, cmd_bo);
+
+   gem_read(fd,target_bo, 0, actual_value, sizeof(actual_value));
+   igt_assert(expected_value == actual_value);
+
+   gem_close(fd, target_bo);
+
+   return 1;
+}
+
+uint32_t handle;
+int fd;
+
+#define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2)
+#define   PIPE_CONTROL_QW_WRITE(114)
+
+igt_main
+{
+   igt_fixture {
+   int has_secparser = 0;
+drm_i915_getparam_t gp;
+   int rc;
+
+   fd = drm_open_any();
+
+   gp.param = I915_PARAM_HAS_CMD_PARSER;
+   gp.value = has_secparser;
+   rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, gp);
+   igt_require(!rc  has_secparser);
+
+

[Intel-gfx] [PATCH 2/4] tests/gem_exec_parse: Add tests for rejected commands

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 tests/gem_exec_parse.c | 81 ++
 1 file changed, 81 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index a17929f..b34fe1b 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -93,9 +93,55 @@ static int exec_batch_patched(int fd, uint32_t cmd_bo, 
uint32_t *cmds,
return 1;
 }
 
+static int exec_batch(int fd, uint32_t cmd_bo, uint32_t *cmds,
+ int size, int ring, int expected_ret)
+{
+   struct drm_i915_gem_execbuffer2 execbuf;
+   struct drm_i915_gem_exec_object2 objs[1];
+   int ret;
+
+   gem_write(fd, cmd_bo, 0, cmds, size);
+
+   objs[0].handle = cmd_bo;
+   objs[0].relocation_count = 0;
+   objs[0].relocs_ptr = 0;
+   objs[0].alignment = 0;
+   objs[0].offset = 0;
+   objs[0].flags = 0;
+   objs[0].rsvd1 = 0;
+   objs[0].rsvd2 = 0;
+
+   execbuf.buffers_ptr = (uintptr_t)objs;
+   execbuf.buffer_count = 1;
+   execbuf.batch_start_offset = 0;
+   execbuf.batch_len = size;
+   execbuf.cliprects_ptr = 0;
+   execbuf.num_cliprects = 0;
+   execbuf.DR1 = 0;
+   execbuf.DR4 = 0;
+   execbuf.flags = ring;
+   i915_execbuffer2_set_context_id(execbuf, 0);
+   execbuf.rsvd2 = 0;
+
+   ret = drmIoctl(fd,
+  DRM_IOCTL_I915_GEM_EXECBUFFER2,
+  execbuf);
+   if (ret == 0)
+   igt_assert(expected_ret == 0);
+   else
+   igt_assert(-errno == expected_ret);
+
+   gem_sync(fd, cmd_bo);
+
+   return 1;
+}
+
 uint32_t handle;
 int fd;
 
+#define MI_ARB_ON_OFF (0x8  23)
+#define MI_DISPLAY_FLIP ((0x14  23) | 1)
+
 #define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2)
 #define   PIPE_CONTROL_QW_WRITE(114)
 
@@ -132,6 +178,41 @@ igt_main
   0x1200));
}
 
+   igt_subtest(basic-rejected) {
+   uint32_t arb_on_off[] = {
+   MI_ARB_ON_OFF,
+   MI_BATCH_BUFFER_END,
+   };
+   uint32_t display_flip[] = {
+   MI_DISPLAY_FLIP,
+   0, 0, 0,
+   MI_BATCH_BUFFER_END,
+   0
+   };
+   igt_assert(
+  exec_batch(fd, handle,
+ arb_on_off, sizeof(arb_on_off),
+ I915_EXEC_RENDER,
+ -EINVAL));
+   igt_assert(
+  exec_batch(fd, handle,
+ arb_on_off, sizeof(arb_on_off),
+ I915_EXEC_BSD,
+ -EINVAL));
+   if (gem_has_vebox(fd)) {
+   igt_assert(
+  exec_batch(fd, handle,
+ arb_on_off, sizeof(arb_on_off),
+ I915_EXEC_VEBOX,
+ -EINVAL));
+   }
+   igt_assert(
+  exec_batch(fd, handle,
+ display_flip, sizeof(display_flip),
+ I915_EXEC_BLT,
+ -EINVAL));
+   }
+
igt_fixture {
gem_close(fd, handle);
 
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 4/4] tests/gem_exec_parse: Add tests for bitmask checks

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 tests/gem_exec_parse.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index d4136db..0eb328c 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -145,6 +145,7 @@ int fd;
 
 #define GFX_OP_PIPE_CONTROL((0x329)|(0x327)|(0x224)|2)
 #define   PIPE_CONTROL_QW_WRITE(114)
+#define   PIPE_CONTROL_LRI_POST_OP (123)
 
 igt_main
 {
@@ -239,6 +240,23 @@ igt_main
  0));
}
 
+   igt_subtest(bitmasks) {
+   uint32_t pc[] = {
+   GFX_OP_PIPE_CONTROL,
+   (PIPE_CONTROL_QW_WRITE |
+PIPE_CONTROL_LRI_POST_OP),
+   0, // To be patched
+   0x1200,
+   0,
+   MI_BATCH_BUFFER_END,
+   };
+   igt_assert(
+  exec_batch(fd, handle,
+ pc, sizeof(pc),
+ I915_EXEC_RENDER,
+ -EINVAL));
+   }
+
igt_fixture {
gem_close(fd, handle);
 
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 19/22] drm/i915: Fix length handling for MFX_WAIT

2013-11-26 Thread bradley . d . volkin

From: Brad Volkin bradley.d.vol...@intel.com

Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5593740..adc7d94 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -193,7 +193,11 @@ static const struct drm_i915_cmd_descriptor video_cmds[] = 
{
.expected = 0
  }},
  .bits_count = 1  ),
-   CMD(  MFX_WAIT, SMFX,  !F,  0x3F,   S  ),
+   /* MFX_WAIT doesn't fit the way we handle length for most commands.
+* It has a length field but it uses a non-standard length bias.
+* It is always 1 dword though, so just treat it as fixed length.
+*/
+   CMD(  MFX_WAIT, SMFX,   F,  1,  S  ),
 };
 
 static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
-- 
1.8.4.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: take mode config lock around crtc disable at suspend

2013-11-26 Thread Jesse Barnes

This is just a theoretical issue, but we need to do this to prevent the
WARN in pipe_from_connector at suspend time.

https://bugs.freedesktop.org/show_bug.cgi?id=71978
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/gpu/drm/i915/i915_drv.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 13076db..0ec0fb3 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -535,8 +535,10 @@ static int i915_drm_freeze(struct drm_device *dev)
 * Disable CRTCs directly since we want to preserve sw state
 * for _thaw.
 */
+   mutex_lock(dev-mode_config.mutex);
list_for_each_entry(crtc, dev-mode_config.crtc_list, head)
dev_priv-display.crtc_disable(crtc);
+   mutex_unlock(dev-mode_config.mutex);
 
intel_modeset_suspend_hw(dev);
}
-- 
1.7.9.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2

2013-11-26 Thread Ville Syrjälä

On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote:
 And move it up in the file for earlier usage.
 
 v2: add pre-gen4 support as well (Chris)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/intel_display.c | 53 
 ++--
  1 file changed, 38 insertions(+), 15 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index e85d838..321d751 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc,
   pipe_config-port_clock = clock.dot / 5;
  }
  
 +static u32
 +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int 
 width,
 +   int bpp, bool tiled)
 +{
 + u32 pitch = DIV_ROUND_UP(width * bpp, 8);
 + int align;
 +
 + if (tiled) {
 + if (INTEL_INFO(dev_priv-dev)-gen  4) {
 + /* Pre-965 needs power of two tile width */
 + for (align = 512; align  pitch; align = 1)
 + ;
 + } else {
 + align = 512;
 + }

Gen2 tiles are 128 bytes wide, not 512 bytes.

So maybe something like this:
 if (IS_GEN2()) {
   return roundup_power_of_two(max(pitch, 128));
 else if (IS_GEN3())
   return roundup_power_of_two(max(pitch, 512));
 else
   return ALIGN(pitch, 512);

 + } else {
 + align = 64;

Also I just noticed in the docs that gen2/3 only seem to require 32byte
alignment for linear buffers. But relaxing that would require a change
to intel_framebuffer_init() as well, so it looks like material for
another patch. Also would need to be tested on real hardware.

 + }
 +
 + return ALIGN(pitch, align);
 +}
 +
  static bool i9xx_get_pipe_config(struct intel_crtc *crtc,
struct intel_crtc_config *pipe_config)
  {
 @@ -7652,16 +7674,12 @@ err:
  }
  
  static u32
 -intel_framebuffer_pitch_for_width(int width, int bpp)
 -{
 - u32 pitch = DIV_ROUND_UP(width * bpp, 8);
 - return ALIGN(pitch, 64);
 -}
 -
 -static u32
 -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp)
 +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv,
 + struct drm_display_mode *mode, int bpp)
  {
 - u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp);
 + u32 pitch = intel_framebuffer_pitch_for_width(dev_priv,
 +   mode-hdisplay, bpp,
 +   false);
   return ALIGN(pitch * mode-vdisplay, PAGE_SIZE);

This should also align the fb height to tile height, otherwise
intel_framebuffer_init() might reject it.

  }
  
 @@ -7670,18 +7688,21 @@ intel_framebuffer_create_for_mode(struct drm_device 
 *dev,
 struct drm_display_mode *mode,
 int depth, int bpp)
  {
 + struct drm_i915_private *dev_priv = dev-dev_private;
   struct drm_i915_gem_object *obj;
   struct drm_mode_fb_cmd2 mode_cmd = { 0 };
 + int size;
  
 - obj = i915_gem_alloc_object(dev,
 - intel_framebuffer_size_for_mode(mode, bpp));
 + size =  intel_framebuffer_size_for_mode(dev_priv, mode, bpp);

 + obj = i915_gem_alloc_object(dev, size);
   if (obj == NULL)
   return ERR_PTR(-ENOMEM);
  
   mode_cmd.width = mode-hdisplay;
   mode_cmd.height = mode-vdisplay;
 - mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(mode_cmd.width,
 - bpp);
 + mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(dev_priv,
 + mode_cmd.width,
 + bpp, false);
   mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth);
  
   return intel_framebuffer_create(dev, mode_cmd, obj);
 @@ -7704,8 +7725,10 @@ mode_fits_in_fbdev(struct drm_device *dev,
   return NULL;
  
   fb = dev_priv-fbdev-ifb.base;
 - if (fb-pitches[0]  intel_framebuffer_pitch_for_width(mode-hdisplay,
 -
 fb-bits_per_pixel))
 + if (fb-pitches[0]  intel_framebuffer_pitch_for_width(dev_priv,
 +mode-hdisplay,
 +
 fb-bits_per_pixel,
 +false))
   return NULL;
  
   if (obj-base.size  mode-vdisplay * fb-pitches[0])
 -- 
 1.8.4.2
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--

Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Chris Wilson

On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
 +static const struct drm_i915_cmd_descriptor*
 +find_cmd_in_table(const struct drm_i915_cmd_table *table,
 +   u32 cmd_header)
 +{
 + int i;
 +
 + for (i = 0; i  table-count; i++) {
 + const struct drm_i915_cmd_descriptor *desc = table-table[i];
 + u32 masked_cmd = desc-cmd.mask  cmd_header;
 + u32 masked_value = desc-cmd.value  desc-cmd.mask;
 +
 + if (masked_cmd == masked_value)
 + return desc;

Maybe pre-sort the cmd table and use bsearch? And a runtime test on
module load to check that the table is sorted correctly.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Volkin, Bradley D

On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
  +static const struct drm_i915_cmd_descriptor*
  +find_cmd_in_table(const struct drm_i915_cmd_table *table,
  + u32 cmd_header)
  +{
  +   int i;
  +
  +   for (i = 0; i  table-count; i++) {
  +   const struct drm_i915_cmd_descriptor *desc = table-table[i];
  +   u32 masked_cmd = desc-cmd.mask  cmd_header;
  +   u32 masked_value = desc-cmd.value  desc-cmd.mask;
  +
  +   if (masked_cmd == masked_value)
  +   return desc;
 
 Maybe pre-sort the cmd table and use bsearch? And a runtime test on
 module load to check that the table is sorted correctly.

So far it doesn't look like the search is a bottleneck, but I've tried to keep
the tables sorted so that we could easily switch to bsearch if needed. Would
you prefer to just use bsearch from the start?

I agree that the module load check makes sense if we do switch.

 -Chris
 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 19:28:27 +0200
Ville Syrjälä ville.syrj...@linux.intel.com wrote:

 On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote:
  And move it up in the file for earlier usage.
  
  v2: add pre-gen4 support as well (Chris)
  
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
  ---
   drivers/gpu/drm/i915/intel_display.c | 53 
  ++--
   1 file changed, 38 insertions(+), 15 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/intel_display.c 
  b/drivers/gpu/drm/i915/intel_display.c
  index e85d838..321d751 100644
  --- a/drivers/gpu/drm/i915/intel_display.c
  +++ b/drivers/gpu/drm/i915/intel_display.c
  @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc 
  *crtc,
  pipe_config-port_clock = clock.dot / 5;
   }
   
  +static u32
  +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int 
  width,
  + int bpp, bool tiled)
  +{
  +   u32 pitch = DIV_ROUND_UP(width * bpp, 8);
  +   int align;
  +
  +   if (tiled) {
  +   if (INTEL_INFO(dev_priv-dev)-gen  4) {
  +   /* Pre-965 needs power of two tile width */
  +   for (align = 512; align  pitch; align = 1)
  +   ;
  +   } else {
  +   align = 512;
  +   }
 
 Gen2 tiles are 128 bytes wide, not 512 bytes.
 
 So maybe something like this:
  if (IS_GEN2()) {
return roundup_power_of_two(max(pitch, 128));
  else if (IS_GEN3())
return roundup_power_of_two(max(pitch, 512));
  else
return ALIGN(pitch, 512);
 
  +   } else {
  +   align = 64;

Ah good old gen2.  Pretty sure we've just treated it as 512 wide in the
past?  But I'm not going to go digging around the DDX to see... :)

I was looking for that POT helper but didn't see it, that makes things
a little nicer.

 Also I just noticed in the docs that gen2/3 only seem to require 32byte
 alignment for linear buffers. But relaxing that would require a change
 to intel_framebuffer_init() as well, so it looks like material for
 another patch. Also would need to be tested on real hardware.

Yeah separate patch.  It'll save us gobs of memory. :)

  -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp)
  +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv,
  +   struct drm_display_mode *mode, int bpp)
   {
  -   u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp);
  +   u32 pitch = intel_framebuffer_pitch_for_width(dev_priv,
  + mode-hdisplay, bpp,
  + false);
  return ALIGN(pitch * mode-vdisplay, PAGE_SIZE);
 
 This should also align the fb height to tile height, otherwise
 intel_framebuffer_init() might reject it.

Hm another existing bug.  Separate patch.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4

2013-11-26 Thread Daniel Vetter

On Mon, Nov 25, 2013 at 03:51:17PM -0800, Jesse Barnes wrote:
 Read out the current plane configuration at init time into a new
 plane_config structure.  This allows us to track any existing
 framebuffers attached to the plane and potentially re-use them in our
 fbdev code for a smooth handoff.
 
 v2: update for new pitch_for_width function (Jesse)
 comment how get_plane_config works with shared fbs (Jesse)
 v3: s/ARGB/XRGB (Ville)
 use pipesrc width/height (Ville)
 fix fourcc comment (Bob)
 use drm_format_plane_cpp (Ville)
 v4: use fb for tracking fb data object (Ville)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/i915_drv.h  |   3 +
  drivers/gpu/drm/i915/intel_display.c | 127 
 ++-
  drivers/gpu/drm/i915/intel_drv.h |   8 +++
  3 files changed, 136 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index 14f250a..6569e93 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -365,6 +365,7 @@ struct drm_i915_error_state {
  
  struct intel_connector;
  struct intel_crtc_config;
 +struct intel_plane_config;
  struct intel_crtc;
  struct intel_limit;
  struct dpll;
 @@ -403,6 +404,8 @@ struct drm_i915_display_funcs {
* fills out the pipe-config with the hw state. */
   bool (*get_pipe_config)(struct intel_crtc *,
   struct intel_crtc_config *);
 + void (*get_plane_config)(struct intel_crtc *,
 +  struct intel_plane_config *);
   int (*crtc_mode_set)(struct drm_crtc *crtc,
int x, int y,
struct drm_framebuffer *old_fb);
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index 321d751..089128b 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -2002,6 +2002,27 @@ unsigned long intel_gen4_compute_page_offset(int *x, 
 int *y,
   }
  }
  
 +int intel_format_to_fourcc(int format)
 +{
 + switch (format) {
 + case DISPPLANE_8BPP:
 + return DRM_FORMAT_C8;
 + case DISPPLANE_BGRX555:
 + return DRM_FORMAT_XRGB1555;
 + case DISPPLANE_BGRX565:
 + return DRM_FORMAT_RGB565;
 + default:
 + case DISPPLANE_BGRX888:
 + return DRM_FORMAT_XRGB;
 + case DISPPLANE_RGBX888:
 + return DRM_FORMAT_XBGR;
 + case DISPPLANE_BGRX101010:
 + return DRM_FORMAT_XRGB2101010;
 + case DISPPLANE_RGBX101010:
 + return DRM_FORMAT_XBGR2101010;
 + }
 +}
 +
  static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer 
 *fb,
int x, int y)
  {
 @@ -5474,6 +5495,95 @@ intel_framebuffer_pitch_for_width(struct 
 drm_i915_private *dev_priv, int width,
   return ALIGN(pitch, align);
  }
  
 +static void i9xx_get_plane_config(struct intel_crtc *crtc,
 +   struct intel_plane_config *plane_config)
 +{
 + struct drm_device *dev = crtc-base.dev;
 + struct drm_i915_private *dev_priv = dev-dev_private;
 + struct drm_i915_gem_object *obj = NULL;
 + struct drm_mode_fb_cmd2 mode_cmd = { 0 };
 + u32 val, base, offset;
 + int pipe = crtc-pipe, plane = crtc-plane;
 + int fourcc, pixel_format;
 +
 + plane_config-fb = kzalloc(sizeof(*plane_config-fb), GFP_KERNEL);
 + if (!plane_config-fb) {
 + DRM_DEBUG_KMS(failed to alloc fb\n);
 + return;
 + }
 +
 + val = I915_READ(DSPCNTR(plane));
 +
 + if (INTEL_INFO(dev)-gen = 4)
 + if (val  DISPPLANE_TILED)
 + plane_config-tiled = true;
 +
 + pixel_format = val  DISPPLANE_PIXFORMAT_MASK;
 + fourcc = intel_format_to_fourcc(pixel_format);
 + plane_config-fb-base.pixel_format = fourcc;
 + plane_config-fb-base.bits_per_pixel =
 + drm_format_plane_cpp(fourcc, 0) * 8;
 +
 + if (INTEL_INFO(dev)-gen = 4) {
 + if (plane_config-tiled)
 + offset = I915_READ(DSPTILEOFF(plane));
 + else
 + offset = I915_READ(DSPLINOFF(plane));
 + base = I915_READ(DSPSURF(plane))  0xf000;
 + } else {
 + base = I915_READ(DSPADDR(plane));
 + }
 +
 + val = I915_READ(PIPESRC(pipe));
 + plane_config-fb-base.width = ((val  16)  0xfff) + 1;
 + plane_config-fb-base.height = ((val  0)  0xfff) + 1;
 +
 + plane_config-fb-base.pitches[0] =
 + intel_framebuffer_pitch_for_width(dev_priv,
 +   plane_config-fb-base.width,
 +   
 plane_config-fb-base.bits_per_pixel,
 +   plane_config-tiled);

Shouldn't we read out the stride from the respective hw register, too?

 +
 +

Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 02:09:48PM +, Chris Wilson wrote:
 On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote:
  Retrieve current framebuffer config info from the regs and create an fb
  object for the buffer the BIOS or boot loader left us.  This should
  allow for smooth transitions to userspace apps once we finish the
  initial configuration construction.
  
  v2: check for non-native modes and adjust (Jesse)
  fixup aperture and cmap frees (Imre)
  use unlocked unref if init_bios fails (Jesse)
  fix curly brace around DSPADDR check (Imre)
  comment failure path for pin_and_fence (Imre)
  v3: fixup fixup of aperture frees (Chris)
  v4: update to current bits (locking  pin_and_fence hack) (Jesse)
  v5: move fb config fetch to display code (Jesse)
  re-order hw state readout on initial load to suit fb inherit (Jesse)
  re-add pin_and_fence in fbdev code to make sure we refcount properly (Je
  v6: rename to plane_config (Daniel)
  check for valid object when initializing BIOS fb (Jesse)
  split from plane_config readout and other display changes (Jesse)
  drop use_bios_fb option (Chris)
  update comments (Jesse)
  rework fbdev_init_bios for clarity (Jesse)
  drop fb obj ref under lock (Chris)
  v7: use fb object from plane_config instead (Ville)
  
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 
 Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding
 the kzalloc(intel_fbdev) and the clarity of
 intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb
 lifetime of plane_config-fb is extremely ugly though (the theft could
 be made a little more obvious for instance) and still leaked upon failure?

I think the lifetime stuff for the stolen fb is actually ok. But there's
other stuff that will probably gives us some good fireworks:
- intel_crtc-plane_config seems to be left hanging in the air. Imo
  duplicating the crtc-fb pointer isnt' really all that good. I'd much
  prefer when we just shovel the invented fb into the crtc-fb pointer. Of
  course that requires us to properly adjust the refcount.
- fb-obj has a very high chance to cause utter havoc on multi-pipe
  configs, since the bios loves to set up shared framebuffers. I guess
  this is untested? For now it's probably simplest to just not bother with
  the 2nd/3rd pipe.
- We need to clean up the fbs we've created somehow - intel_framebuffer_init
  will at least register the framebuffer with the drm core. But since it's a
  driver-private fb and since we hold a reference already it'll never disappear
  I think.
- Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Chris Wilson

On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote:
 On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
  On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com wrote:
   +static const struct drm_i915_cmd_descriptor*
   +find_cmd_in_table(const struct drm_i915_cmd_table *table,
   +   u32 cmd_header)
   +{
   + int i;
   +
   + for (i = 0; i  table-count; i++) {
   + const struct drm_i915_cmd_descriptor *desc = table-table[i];
   + u32 masked_cmd = desc-cmd.mask  cmd_header;
   + u32 masked_value = desc-cmd.value  desc-cmd.mask;
   +
   + if (masked_cmd == masked_value)
   + return desc;
  
  Maybe pre-sort the cmd table and use bsearch? And a runtime test on
  module load to check that the table is sorted correctly.
 
 So far it doesn't look like the search is a bottleneck, but I've tried to keep
 the tables sorted so that we could easily switch to bsearch if needed. Would
 you prefer to just use bsearch from the start?

Yes. I think it will be easier (especially if the codes does the runtime
check) to keep the table sorted as commands are incremently added in the
future, rather than having to retrofit the bsearch when it becomes a
significant problem.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/5] drm/i915: make pitch_for_width take a tiled arg v2

2013-11-26 Thread Daniel Vetter

On Mon, Nov 25, 2013 at 03:51:15PM -0800, Jesse Barnes wrote:
 And move it up in the file for earlier usage.
 
 v2: add pre-gen4 support as well (Chris)
 
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/intel_display.c | 53 
 ++--
  1 file changed, 38 insertions(+), 15 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_display.c 
 b/drivers/gpu/drm/i915/intel_display.c
 index e85d838..321d751 100644
 --- a/drivers/gpu/drm/i915/intel_display.c
 +++ b/drivers/gpu/drm/i915/intel_display.c
 @@ -5452,6 +5452,28 @@ static void vlv_crtc_clock_get(struct intel_crtc *crtc,
   pipe_config-port_clock = clock.dot / 5;
  }
  
 +static u32
 +intel_framebuffer_pitch_for_width(struct drm_i915_private *dev_priv, int 
 width,
 +   int bpp, bool tiled)
 +{
 + u32 pitch = DIV_ROUND_UP(width * bpp, 8);
 + int align;
 +
 + if (tiled) {
 + if (INTEL_INFO(dev_priv-dev)-gen  4) {
 + /* Pre-965 needs power of two tile width */
 + for (align = 512; align  pitch; align = 1)
 + ;
 + } else {
 + align = 512;
 + }

Imo we should just read this out from the hardware registers, that avoids
us to have another copypasted version of some tile size limit code hanging
around.
-Daniel


 + } else {
 + align = 64;
 + }
 +
 + return ALIGN(pitch, align);
 +}
 +
  static bool i9xx_get_pipe_config(struct intel_crtc *crtc,
struct intel_crtc_config *pipe_config)
  {
 @@ -7652,16 +7674,12 @@ err:
  }
  
  static u32
 -intel_framebuffer_pitch_for_width(int width, int bpp)
 -{
 - u32 pitch = DIV_ROUND_UP(width * bpp, 8);
 - return ALIGN(pitch, 64);
 -}
 -
 -static u32
 -intel_framebuffer_size_for_mode(struct drm_display_mode *mode, int bpp)
 +intel_framebuffer_size_for_mode(struct drm_i915_private *dev_priv,
 + struct drm_display_mode *mode, int bpp)
  {
 - u32 pitch = intel_framebuffer_pitch_for_width(mode-hdisplay, bpp);
 + u32 pitch = intel_framebuffer_pitch_for_width(dev_priv,
 +   mode-hdisplay, bpp,
 +   false);
   return ALIGN(pitch * mode-vdisplay, PAGE_SIZE);
  }
  
 @@ -7670,18 +7688,21 @@ intel_framebuffer_create_for_mode(struct drm_device 
 *dev,
 struct drm_display_mode *mode,
 int depth, int bpp)
  {
 + struct drm_i915_private *dev_priv = dev-dev_private;
   struct drm_i915_gem_object *obj;
   struct drm_mode_fb_cmd2 mode_cmd = { 0 };
 + int size;
  
 - obj = i915_gem_alloc_object(dev,
 - intel_framebuffer_size_for_mode(mode, bpp));
 + size =  intel_framebuffer_size_for_mode(dev_priv, mode, bpp);
 + obj = i915_gem_alloc_object(dev, size);
   if (obj == NULL)
   return ERR_PTR(-ENOMEM);
  
   mode_cmd.width = mode-hdisplay;
   mode_cmd.height = mode-vdisplay;
 - mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(mode_cmd.width,
 - bpp);
 + mode_cmd.pitches[0] = intel_framebuffer_pitch_for_width(dev_priv,
 + mode_cmd.width,
 + bpp, false);
   mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth);
  
   return intel_framebuffer_create(dev, mode_cmd, obj);
 @@ -7704,8 +7725,10 @@ mode_fits_in_fbdev(struct drm_device *dev,
   return NULL;
  
   fb = dev_priv-fbdev-ifb.base;
 - if (fb-pitches[0]  intel_framebuffer_pitch_for_width(mode-hdisplay,
 -
 fb-bits_per_pixel))
 + if (fb-pitches[0]  intel_framebuffer_pitch_for_width(dev_priv,
 +mode-hdisplay,
 +
 fb-bits_per_pixel,
 +false))
   return NULL;
  
   if (obj-base.size  mode-vdisplay * fb-pitches[0])
 -- 
 1.8.4.2
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: take mode config lock around crtc disable at suspend

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 09:13:41AM -0800, Jesse Barnes wrote:
 This is just a theoretical issue, but we need to do this to prevent the
 WARN in pipe_from_connector at suspend time.
 
 https://bugs.freedesktop.org/show_bug.cgi?id=71978
 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org

Picked up for -fixes, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion

2013-11-26 Thread Chris Wilson

On Tue, Nov 26, 2013 at 08:51:37AM -0800, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com
 
 The length mask is different for each ring and the size can vary,
 so we should duplicate the definition with the correct encoding
 for each ring.

Jumping in here since this highlights the most confusing aspect of this
series - the meta patching. Please implement the command parsing
infrastructure upfront and in a very small number of patches (most
preferably one) that avoids having to add fixes late in the series.

I think using
s/S/ALLOW/
s/R/REJECT/
s/B/BLACKLIST/
s/W/WHITELIST/
makes the action much more clear, and would rather that every unsafe
command starts off as REJECT. (With the whitelist/blacklisting being
added as separate patches with justification as they are now.)

Since we do disable the security, I would also reject all
unknown/unmatched commands and make ALLOW explicit.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/5] drm/i915: retrieve current fb config into new plane_config structure at init v4

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 18:43:53 +0100
Daniel Vetter dan...@ffwll.ch wrote:
  +   plane_config-fb-base.pitches[0] =
  +   intel_framebuffer_pitch_for_width(dev_priv,
  + plane_config-fb-base.width,
  + 
  plane_config-fb-base.bits_per_pixel,
  + plane_config-tiled);
 
 Shouldn't we read out the stride from the respective hw register, too?

Hm that's quite an idea.

 
  +
  +   plane_config-size = ALIGN(plane_config-fb-base.pitches[0] *
  +  plane_config-fb-base.height, PAGE_SIZE);
 
 If you bother with tiling I think you also need to bother with correct
 size alignement.

Yeah just fixed up the framebuffer size function per Ville's comments.
I should be able to just use that.

  @@ -10562,6 +10672,7 @@ static void intel_init_display(struct drm_device 
  *dev)
  dev_priv-display.update_plane = ironlake_update_plane;
 
 Sad Haswell is sad it seems.

Yeah for two reasons: I'm not testing HSW at all, and I'm still fairly
ignorant of HSW display.  Well that and I don't want to pile on more
work for this patchset which has already seen a bunch of churn...

  +   if (dev_priv-display.get_plane_config)
  +   dev_priv-display.get_plane_config(crtc,
  +  crtc-plane_config);
  +   }
 
 If we go with Chris' suggestion to disable the crtc if we can't get at a
 sane framebuffer for it then this would make much more sense in the
 hardware state readout code. Especially since always having framebuffers
 around would allow us to ditch a bit of special-casing code all over the
 place.

I don't think so; reading out and allocating an fb on every plane
config readout would be overkill.  We'd need to poke around in the
struct for cross checking, then free it somewhere.  Plus it won't
always be preallocated, and creating a duplicate fb when the object
still exists would fail.

This is actually another argument for simply duplicating a few fields
from the fb struct into the plane config struct.  That makes cross
checking and readout simple, and allows us to allocate if possible in
the BIOS functions.

But damnit, then I'd have to drop back to an earlier version of the
patch and lose some changes... ugg

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 0/6] Enabling DRRS support in the kernel

2013-11-26 Thread Daniel Vetter

On Tue, Nov 19, 2013 at 11:36:58AM +0530, Vandana Kannan wrote:
 Dynamic Refresh Rate Switching (DRRS) is a power conservation feature which   
   
 enables switching between low and high refresh rates based on the usage   
   
 scenario. This feature is applicable for internal eDP panel. Indication that  
   
 the panel can support DRRS is given by the panel EDID, which would list   
   
 multiple refresh rates for one resolution.
   
 DRRS is of 2 types -  
   
 Static DRRS involves execution of the entire mode set sequence to switch  
   
 between refresh rate. 
   
 seamless DRRS involves refresh rate switching during runtime without any  
   
 blanking effect/mode set. 
   
 The vendor fills in a VBT field indicating static/seamless DRRS based on the  
   
 panel spec. This information is needed to enable seamless DRRS in kernel. 
   
 The patch series supports idleness detection in display i915 driver and 
 switch  
 to low refresh rate. It also provides set_property API for user space to  

 request for different refresh rates for active use cases like video playback  
   
 at 48/50 Hz.   
 
 
 Pradeep Bhat (5):
   drm/i915: Adding VBT fields to support eDP DRRS feature
   drm/i915: Parse EDID probed modes for DRRS support
   drm/i915: Add support for DRRS set property to switch RR
   drm/i915: Support to read DMRRS field from VBT structure
   drm/i915: Adding support for DMRRS for media playback
 
 Vandana Kannan (1):
   drm/i915: Idleness detection for DRRS

My apologies for delaying my high-level maintainer review for so long -
there's been a bit a firedrill around here so it took a while to write it
all down.

Overall a nice patch series, but I think we need to shuffle a few things
around to better align with some of the longer-term driver architecture
reworks and goals:

- You add another copypaste of the code to deduce the reduced clock from
  the edid to intel_dp.c. Imo it's better to add that to intel_panel.c to
  struct intel_panel - we already track the panel fixed mode in there, so
  this would naturally fit.

- Imo we should track the reduced clock in the pipe config and also
  cross-check it in the state checker. That will lead to a bit of fun on
  bdw, so we need to special case the checker there so that it only checks
  that the clock matches either of the possible clocks.

  For this we need a bool downclock_available in struct intel_crtc_config
  and a 2nd set of m_n values, both set in the compute_config function for
  DP outputs.

  For consistency it'd be nice to at least move the downclock_available
  logic for lvds also over to that. But since we first need to clean up
  the dpll handling to make lvds downclocking sane that's imo not really a
  priority at all.

  The entire point behind all this pipe state tracking is to make sure we
  don't miss anything when fastbooting.

- The connector attribute is imo the wrong interface - userspace already
  supplies a mode attribute with dotclock to SetCrtc. Imo we simply need
  to fix up our fixed_mode logic (preferrably as a neat helper in
  intel_panel.c) to select the right one of the availabel downclocks, even
  when upscaling.

  The downside for now is that this will result in flickering. But we need
  a real flicker-free fast-update-path in our modeset code anyway to make
  fastboot happen for real. And a few other cool things. I'll increase
  the pain a bit in the hope that this moves forward again, so no quick
  hack please (even if it's very simple to do in the case of drrs).

- Finally a quick testcase which iterates through all the downclock modes
  in kms_flip - together with the cross-checking and all the vblank
  timing tests we already have that should give us solid test coverage. To
  keep test runtimes reasonable I think just a pageflipping test with time
  checking is more than enough.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 4/5] drm/i915: Wrap the preallocated BIOS framebuffer and preserve for KMS fbcon v7

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 18:54:04 +0100
Daniel Vetter dan...@ffwll.ch wrote:

 On Tue, Nov 26, 2013 at 02:09:48PM +, Chris Wilson wrote:
  On Mon, Nov 25, 2013 at 03:51:18PM -0800, Jesse Barnes wrote:
   Retrieve current framebuffer config info from the regs and create an fb
   object for the buffer the BIOS or boot loader left us.  This should
   allow for smooth transitions to userspace apps once we finish the
   initial configuration construction.
   
   v2: check for non-native modes and adjust (Jesse)
   fixup aperture and cmap frees (Imre)
   use unlocked unref if init_bios fails (Jesse)
   fix curly brace around DSPADDR check (Imre)
   comment failure path for pin_and_fence (Imre)
   v3: fixup fixup of aperture frees (Chris)
   v4: update to current bits (locking  pin_and_fence hack) (Jesse)
   v5: move fb config fetch to display code (Jesse)
   re-order hw state readout on initial load to suit fb inherit (Jesse)
   re-add pin_and_fence in fbdev code to make sure we refcount properly 
   (Je
   v6: rename to plane_config (Daniel)
   check for valid object when initializing BIOS fb (Jesse)
   split from plane_config readout and other display changes (Jesse)
   drop use_bios_fb option (Chris)
   update comments (Jesse)
   rework fbdev_init_bios for clarity (Jesse)
   drop fb obj ref under lock (Chris)
   v7: use fb object from plane_config instead (Ville)
   
   Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
  
  Hmm, quietly steals plane_config-fb you mean. Other than bikeshedding
  the kzalloc(intel_fbdev) and the clarity of
  intel_fb_init/intel_fb_init_bios, I don't see anything else. The fb
  lifetime of plane_config-fb is extremely ugly though (the theft could
  be made a little more obvious for instance) and still leaked upon failure?
 
 I think the lifetime stuff for the stolen fb is actually ok. But there's
 other stuff that will probably gives us some good fireworks:
 - intel_crtc-plane_config seems to be left hanging in the air. Imo
   duplicating the crtc-fb pointer isnt' really all that good. I'd much
   prefer when we just shovel the invented fb into the crtc-fb pointer. Of
   course that requires us to properly adjust the refcount.

plane_config is just part of intel_crtc.  No hanging, the struct is
just bigger.  Saves me from dealing with alloc/free of it.

 - fb-obj has a very high chance to cause utter havoc on multi-pipe
   configs, since the bios loves to set up shared framebuffers. I guess
   this is untested? For now it's probably simplest to just not bother with
   the 2nd/3rd pipe.

There should be no havoc.  The first allocation will succeed, and
following ones will fail since they overlap.  The BIOS takeover code
will then use the one that was allocated (or in the unlikely case of
multiple fbs, pick the biggest one).

Or did you see some other fail there?

 - We need to clean up the fbs we've created somehow - intel_framebuffer_init
   will at least register the framebuffer with the drm core. But since it's a
   driver-private fb and since we hold a reference already it'll never 
 disappear
   I think.

I don't think so... it should look just like a user allocated buffer
from the drm core POV.  But generally our fbdev allocations do tend to
live forever, so in that sense you're right, but it's not different
than what we have today.

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 0/8] support for multiple power wells

2013-11-26 Thread Paulo Zanoni

2013/11/25 Imre Deak imre.d...@intel.com:
 v2: rebase on latest kernel + debug info on domain use counts
 v3: address review comments from Paulo

All my comments got implemented or got some nice explanations, so, for
the series:

Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com


 Imre Deak (7):
   drm/i915: add audio power domain
   drm/i915: support for multiple power wells
   drm/i915: add always-on power wells instead of special casing them
   drm/i915: use IS_HASWELL/BROADWELL instead of HAS_POWER_WELL
   drm/i915: don't do BDW/HSW specific powerdomains init on other
 platforms
   drm/i915: add a default always-on power well
   drm/i915: add a debugfs entry for power domain info

 Jesse Barnes (1):
   drm/i915: protect HSW power well check with IS_HASWELL in
 redisable_vga

  drivers/gpu/drm/i915/i915_debugfs.c  |  71 +
  drivers/gpu/drm/i915/i915_dma.c  |  18 ++--
  drivers/gpu/drm/i915/i915_drv.h  |  18 +++-
  drivers/gpu/drm/i915/intel_display.c |   6 +-
  drivers/gpu/drm/i915/intel_pm.c  | 198 
 +++
  5 files changed, 227 insertions(+), 84 deletions(-)

 --
 1.8.4

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Paulo Zanoni
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 05/22] drm/i915: Implement command parsing

2013-11-26 Thread Volkin, Bradley D

On Tue, Nov 26, 2013 at 09:56:09AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 09:38:55AM -0800, Volkin, Bradley D wrote:
  On Tue, Nov 26, 2013 at 09:29:32AM -0800, Chris Wilson wrote:
   On Tue, Nov 26, 2013 at 08:51:22AM -0800, bradley.d.vol...@intel.com 
   wrote:
+static const struct drm_i915_cmd_descriptor*
+find_cmd_in_table(const struct drm_i915_cmd_table *table,
+ u32 cmd_header)
+{
+   int i;
+
+   for (i = 0; i  table-count; i++) {
+   const struct drm_i915_cmd_descriptor *desc = 
table-table[i];
+   u32 masked_cmd = desc-cmd.mask  cmd_header;
+   u32 masked_value = desc-cmd.value  desc-cmd.mask;
+
+   if (masked_cmd == masked_value)
+   return desc;
   
   Maybe pre-sort the cmd table and use bsearch? And a runtime test on
   module load to check that the table is sorted correctly.
  
  So far it doesn't look like the search is a bottleneck, but I've tried to 
  keep
  the tables sorted so that we could easily switch to bsearch if needed. Would
  you prefer to just use bsearch from the start?
 
 Yes. I think it will be easier (especially if the codes does the runtime
 check) to keep the table sorted as commands are incremently added in the
 future, rather than having to retrofit the bsearch when it becomes a
 significant problem.

Ok, makes sense. I'll assume the same comment applies to the register 
whitelists and
make similar changes there.

Brad

 -Chris
 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 20/22] drm/i915: Fix MI_STORE_DWORD_IMM parser defintion

2013-11-26 Thread Volkin, Bradley D

On Tue, Nov 26, 2013 at 10:08:48AM -0800, Chris Wilson wrote:
 On Tue, Nov 26, 2013 at 08:51:37AM -0800, bradley.d.vol...@intel.com wrote:
  From: Brad Volkin bradley.d.vol...@intel.com
  
  The length mask is different for each ring and the size can vary,
  so we should duplicate the definition with the correct encoding
  for each ring.
 
 Jumping in here since this highlights the most confusing aspect of this
 series - the meta patching. Please implement the command parsing
 infrastructure upfront and in a very small number of patches (most
 preferably one) that avoids having to add fixes late in the series.

Sure. As this is my first contribution, I'm still working on how to best
split up a series, so I'm happy to clean up that aspect. It sounds like
the series should look more like:
- All parser infrastructure and implementation (basically squash current 1-9,
  plus the bsearch changes, plus REJECT changes)
- N patches to set commands for register whitelist and bitmask checking
- Enable the parser

Correct?

 
 I think using
 s/S/ALLOW/
 s/R/REJECT/
 s/B/BLACKLIST/
 s/W/WHITELIST/
 makes the action much more clear, and would rather that every unsafe
 command starts off as REJECT. (With the whitelist/blacklisting being
 added as separate patches with justification as they are now.)

I had split out the REJECTs to make the justification easier to find
in the commit message, but I can reject them from the start too.

For 'B', the term 'blacklist' to me implies a set of things that are
unconditionally not allowed, whereas the 'B' commands are conditionally
allowed based on the bitmask checks. Are you just asking for a readability
change in expanding the action names used in the table, or are you looking
for any semantic changes to what the parser checks? Or am I overthinking
this comment?

Brad

 
 Since we do disable the security, I would also reject all
 unknown/unmatched commands and make ALLOW explicit.
 -Chris
 
 -- 
 Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote:
 This is one of the nice pieces that I've never ported from the old
 make based test runner. Note that we only use the result of the check
 when actually running the testcases so that enumerating tests still
 works as non-root on arbitrary machines.
 

So does this bail on the first test, or does this run for all tests?
BTW, Damien said he had also implemented this.

 Cc: Ben Widawsky b...@bwidawsk.net
 Requested-by: Ben Widawsky b...@bwidawsk.net
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 ---
  tests/igt.tests | 32 +++-
  1 file changed, 31 insertions(+), 1 deletion(-)
 
 diff --git a/tests/igt.tests b/tests/igt.tests
 index f3884925deaa..8d02c1a60255 100644
 --- a/tests/igt.tests
 +++ b/tests/igt.tests
 @@ -28,7 +28,7 @@ import sys
  import subprocess
  
  from os import path
 -from framework.core import testBinDir, TestProfile
 +from framework.core import testBinDir, TestProfile, TestResult
  from framework.exectest import ExecTest
  
  #
 @@ -39,6 +39,24 @@ from framework.exectest import ExecTest
  # automatically add all tests into the 'igt' category.
  #
  
 +def checkEnvironment():
 +debugfs_path = /sys/kernel/debug/dri
 +if os.getuid() != 0:
 +print Test Environment check: not root!
 +return False
 +if not os.path.isdir(debugfs_path):
 +print Test Environment check: debugfs not mounted properly!
 +return False
 +for subdir in os.listdir(debugfs_path):
 +clients = open(os.path.join(debugfs_path, subdir, clients), 'r')
 +lines = clients.readlines()
 +if len(lines)  2:
 +print Test Environment check: other drm clients running!
 +return False
 +
 +print Test Environment check: Succeeded.
 +return True
 +
  if not os.path.exists(os.path.join(testBinDir, 'igt')):
  print igt symlink not found!
  sys.exit(0)
 @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')):
  # Chase the piglit/bin/igt symlink to find where the tests really live.
  igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests')
  
 +igtEnvironmentOk = checkEnvironment()
 +
  profile = TestProfile()
  
  class IGTTest(ExecTest):
 @@ -54,6 +74,9 @@ class IGTTest(ExecTest):
  self.timeout = 60*20 # 20 minutes deadline by default
  
  def interpretResult(self, out, returncode, results, dmesg):
 +if not igtEnvironmentOk:
 +return out
 +
  if returncode == 0:
  results['result'] = 'dmesg-warn' if dmesg != '' else 'pass'
  elif returncode == 77:
 @@ -63,6 +86,13 @@ class IGTTest(ExecTest):
  return out
  def run(self, env):
  env.dmesg = True
 +if not igtEnvironmentOk:
 +results = TestResult()
 +results['result'] = 'skip'
 +results['info'] = unicode(Test Environment isn't OK)
 +
 +return results
 +
  return ExecTest.run(self, env)
  
  def listTests(listname):
 -- 
 1.8.1.4
 

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v3 0/8] support for multiple power wells

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 04:46:54PM -0200, Paulo Zanoni wrote:
 2013/11/25 Imre Deak imre.d...@intel.com:
  v2: rebase on latest kernel + debug info on domain use counts
  v3: address review comments from Paulo
 
 All my comments got implemented or got some nice explanations, so, for
 the series:
 
 Reviewed-by: Paulo Zanoni paulo.r.zan...@intel.com

Slurped them all in, thanks for patchesreview.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: drop DRM_ERROR in intel_fbdev init

2013-11-26 Thread Jesse Barnes

This should just be a debug.  Add another debug msg to the inherit path
while we're at it.

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/gpu/drm/i915/intel_fbdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c 
b/drivers/gpu/drm/i915/intel_fbdev.c
index fdb6dc9..284c3eb 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -127,11 +127,12 @@ static int intelfb_create(struct drm_fb_helper *helper,
mutex_lock(dev-struct_mutex);
 
if (!intel_fb-obj) {
-   DRM_ERROR(no BIOS fb, allocating a new one\n);
+   DRM_DEBUG_KMS(no BIOS fb, allocating a new one\n);
ret = intelfb_alloc(helper, sizes);
if (ret)
goto out_unlock;
} else {
+   DRM_DEBUG_KMS(re-using BIOS fb\n);
sizes-fb_width = intel_fb-base.width;
sizes-fb_height = intel_fb-base.height;
}
-- 
1.8.4.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

2013-11-26 Thread Daniel Vetter

Hi Brad,

On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote:
 From: Brad Volkin bradley.d.vol...@intel.com
 
 Certain OpenGL features (e.g. transform feedback, performance monitoring)
 require userspace code to submit batches containing commands such as
 MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
 generations of the hardware will noop these commands in unsecure batches
 (which includes all userspace batches submitted via i915) even though the
 commands may be safe and represent the intended programming model of the 
 device.
 
 This series introduces a software command parser similar in operation to the
 command parsing done in hardware for unsecure batches. However, the software
 parser allows some operations that would be noop'd by hardware, if the parser
 determines the operation is safe, and submits the batch as secure to prevent
 hardware parsing. Currently the series implements this on IVB and HSW.
 
 The series is divided into several phases:
 
 patches 01-09: These implement infrastructure and the command parsing 
 algorithm,
all behind a module parameter. I expect some discussion and
  rework, but hopefully there's nothing too controversial.
 patches 10-17: These define the checks performed by the parser.
I expect much discussion :)
 patches 18-20: In a final pass over the command checks, I found some issues 
 with
the definitions. They looked painful to rebase in, so I've 
 added
  them here.
 patches 21-22: These enable the parser by default. It runs on all batches 
 except
those that set the I915_EXEC_SECURE flag in the execbuffer2 
 call.

I think long-term we should even scan secure batches. We'd need to allow
some registers which only the drm master (i.e. owner of the display
hardware) is allowed to do, e.g. for scanline waits. But once we have that
we should be able to port all current users of secure batches over to
scanned batches and so enforce this everywhere by default.

The other issue is that igt tests assume to be able to run some evil
tests, so maybe we don't actually want this.

 There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very
 basic and do not test all of the commands used by the parser on the assumption
 that I'm likely to make the same mistakes in both the parser and the test.

Yeah, I agree that just checking whether commands all go through (or not)
as expected adds very little value on top of the few tests you have done.
I think we should take a look at some corner cases which might trip up
your checker a bit though:
- I think we should check batchbuffer chaining and make sure it works on
  the vcs ring and not anywhere else (we can't ever break shipping libva
  which uses this).
- Some tests to trip up your parser should be done, like 3D commands that
  fall off the end of the batch bo. Or commands that span page boundaries.
  The later isn't an issue atm since you use vmap, but we should switch to
  per-page kmap since the vmap overhead is fairly horrible.

 I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few
 days ago), and generally used an Ubuntu 13.10 IVB system with the parser
 running. Aside from a failure described below, I don't think there are any
 regressions. That is, piglit claims some regressions, but from manually 
 running
 the tests I think these are false positives. However, I could use help in
 getting broader testing, particularly around performance. In general, I see 
 less
 than 3% performance impact on HSW, with more like 10% impact for pathological
 batch sizes. But we'll certainly want to run relevant benchmarks beyond what
 I've done.

Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes
through the checker and bypassing it (with EXEC_SECURE) would be really
good. Maybe even some variable-sized commands (all the state setup stuff
should be useful for that) to keep things interesting. Some variation is
also important to have some good cache thrasing going on (since your check
tables are fairly large I think).

 At this point there are a couple of known issues and potential improvements.
 
 1) VLV. The parser is currently disabled for VLV. One type of check performed 
 by
the parser is that commands which access memory do so via PPGTT. VLV does 
 not
have PPGTT enabled at this time. I chose to implement the PPGTT checks via
generic bit checking infrastructure in the parser, so they are not easily
disabled for VLV. For now, I'm disabling parsing altogether in the hope 
 that
PPGTT can be enabled for VLV in the near future.

We need ppgtt for the parser anyway, since otherwise userspace can submit
a self-modifying batch. Checking for that is impossible, so allowing sw
checked batches without the ppgtt/ggtt split would be a decent security
hole.

 2) Coherency. I've found two types of coherency issues when reading the

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 11:05:57AM -0800, Ben Widawsky wrote:
 On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote:
  This is one of the nice pieces that I've never ported from the old
  make based test runner. Note that we only use the result of the check
  when actually running the testcases so that enumerating tests still
  works as non-root on arbitrary machines.
  
 
 So does this bail on the first test, or does this run for all tests?

It bails each test individually. Since otherwise enumerating them wouldn't
work any more. I've hoped the commit message was clear about it.

 BTW, Damien said he had also implemented this.

Only in the make target he created, not in piglit itself. Imo we should
have all the testrunner logic in one place, i.e. in the piglit sources.
-Daniel

 
  Cc: Ben Widawsky b...@bwidawsk.net
  Requested-by: Ben Widawsky b...@bwidawsk.net
  Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
  ---
   tests/igt.tests | 32 +++-
   1 file changed, 31 insertions(+), 1 deletion(-)
  
  diff --git a/tests/igt.tests b/tests/igt.tests
  index f3884925deaa..8d02c1a60255 100644
  --- a/tests/igt.tests
  +++ b/tests/igt.tests
  @@ -28,7 +28,7 @@ import sys
   import subprocess
   
   from os import path
  -from framework.core import testBinDir, TestProfile
  +from framework.core import testBinDir, TestProfile, TestResult
   from framework.exectest import ExecTest
   
   
  #
  @@ -39,6 +39,24 @@ from framework.exectest import ExecTest
   # automatically add all tests into the 'igt' category.
   
  #
   
  +def checkEnvironment():
  +debugfs_path = /sys/kernel/debug/dri
  +if os.getuid() != 0:
  +print Test Environment check: not root!
  +return False
  +if not os.path.isdir(debugfs_path):
  +print Test Environment check: debugfs not mounted properly!
  +return False
  +for subdir in os.listdir(debugfs_path):
  +clients = open(os.path.join(debugfs_path, subdir, clients), 'r')
  +lines = clients.readlines()
  +if len(lines)  2:
  +print Test Environment check: other drm clients running!
  +return False
  +
  +print Test Environment check: Succeeded.
  +return True
  +
   if not os.path.exists(os.path.join(testBinDir, 'igt')):
   print igt symlink not found!
   sys.exit(0)
  @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')):
   # Chase the piglit/bin/igt symlink to find where the tests really live.
   igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 
  'tests')
   
  +igtEnvironmentOk = checkEnvironment()
  +
   profile = TestProfile()
   
   class IGTTest(ExecTest):
  @@ -54,6 +74,9 @@ class IGTTest(ExecTest):
   self.timeout = 60*20 # 20 minutes deadline by default
   
   def interpretResult(self, out, returncode, results, dmesg):
  +if not igtEnvironmentOk:
  +return out
  +
   if returncode == 0:
   results['result'] = 'dmesg-warn' if dmesg != '' else 'pass'
   elif returncode == 77:
  @@ -63,6 +86,13 @@ class IGTTest(ExecTest):
   return out
   def run(self, env):
   env.dmesg = True
  +if not igtEnvironmentOk:
  +results = TestResult()
  +results['result'] = 'skip'
  +results['info'] = unicode(Test Environment isn't OK)
  +
  +return results
  +
   return ExecTest.run(self, env)
   
   def listTests(listname):
  -- 
  1.8.1.4
  
 
 -- 
 Ben Widawsky, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 08:46:19PM +0100, Daniel Vetter wrote:
 On Tue, Nov 26, 2013 at 11:05:57AM -0800, Ben Widawsky wrote:
  On Tue, Nov 26, 2013 at 09:09:04AM +0100, Daniel Vetter wrote:
   This is one of the nice pieces that I've never ported from the old
   make based test runner. Note that we only use the result of the check
   when actually running the testcases so that enumerating tests still
   works as non-root on arbitrary machines.
   
  
  So does this bail on the first test, or does this run for all tests?
 
 It bails each test individually. Since otherwise enumerating them wouldn't
 work any more. I've hoped the commit message was clear about it.

It wasn't clear to me, though it seemed to be the case from the diff.
I'd prefer an early bail, and just check once, and ideally, not
generating a JSON file at all.

 
  BTW, Damien said he had also implemented this.
 
 Only in the make target he created, not in piglit itself. Imo we should
 have all the testrunner logic in one place, i.e. in the piglit sources.
 -Daniel

Damien, can you comment? I could have sworn you said something different
on IRC. It sounded like exactly what I wanted.

 
  
   Cc: Ben Widawsky b...@bwidawsk.net
   Requested-by: Ben Widawsky b...@bwidawsk.net
   Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
   ---
tests/igt.tests | 32 +++-
1 file changed, 31 insertions(+), 1 deletion(-)
   
   diff --git a/tests/igt.tests b/tests/igt.tests
   index f3884925deaa..8d02c1a60255 100644
   --- a/tests/igt.tests
   +++ b/tests/igt.tests
   @@ -28,7 +28,7 @@ import sys
import subprocess

from os import path
   -from framework.core import testBinDir, TestProfile
   +from framework.core import testBinDir, TestProfile, TestResult
from framework.exectest import ExecTest


   #
   @@ -39,6 +39,24 @@ from framework.exectest import ExecTest
# automatically add all tests into the 'igt' category.

   #

   +def checkEnvironment():
   +debugfs_path = /sys/kernel/debug/dri
   +if os.getuid() != 0:
   +print Test Environment check: not root!
   +return False
   +if not os.path.isdir(debugfs_path):
   +print Test Environment check: debugfs not mounted properly!
   +return False
   +for subdir in os.listdir(debugfs_path):
   +clients = open(os.path.join(debugfs_path, subdir, clients), 
   'r')
   +lines = clients.readlines()
   +if len(lines)  2:
   +print Test Environment check: other drm clients running!
   +return False
   +
   +print Test Environment check: Succeeded.
   +return True
   +
if not os.path.exists(os.path.join(testBinDir, 'igt')):
print igt symlink not found!
sys.exit(0)
   @@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')):
# Chase the piglit/bin/igt symlink to find where the tests really live.
igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 
   'tests')

   +igtEnvironmentOk = checkEnvironment()
   +
profile = TestProfile()

class IGTTest(ExecTest):
   @@ -54,6 +74,9 @@ class IGTTest(ExecTest):
self.timeout = 60*20 # 20 minutes deadline by default

def interpretResult(self, out, returncode, results, dmesg):
   +if not igtEnvironmentOk:
   +return out
   +
if returncode == 0:
results['result'] = 'dmesg-warn' if dmesg != '' else 'pass'
elif returncode == 77:
   @@ -63,6 +86,13 @@ class IGTTest(ExecTest):
return out
def run(self, env):
env.dmesg = True
   +if not igtEnvironmentOk:
   +results = TestResult()
   +results['result'] = 'skip'
   +results['info'] = unicode(Test Environment isn't OK)
   +
   +return results
   +
return ExecTest.run(self, env)

def listTests(listname):
   -- 
   1.8.1.4
   
  
  -- 
  Ben Widawsky, Intel Open Source Technology Center
 
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Damien Lespiau

On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote:
  Only in the make target he created, not in piglit itself. Imo we should
  have all the testrunner logic in one place, i.e. in the piglit sources.
  -Daniel
 
 Damien, can you comment? I could have sworn you said something different
 on IRC. It sounded like exactly what I wanted.

What Daniel says is correct, the check is part of the runner wrapper,
not piglit itself.

I'd rather have a environement check up-front and I don't mind where it
lives (igt Vs piglit).

-- 
Damien
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 8:53 PM, Damien Lespiau
damien.lesp...@intel.com wrote:
 On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote:
  Only in the make target he created, not in piglit itself. Imo we should
  have all the testrunner logic in one place, i.e. in the piglit sources.
  -Daniel

 Damien, can you comment? I could have sworn you said something different
 on IRC. It sounded like exactly what I wanted.

 What Daniel says is correct, the check is part of the runner wrapper,
 not piglit itself.

 I'd rather have a environement check up-front and I don't mind where it
 lives (igt Vs piglit).

The problem is that generating the testlist (or printing the commands)
is a feature QA actually relies on. I also use it occasionally to
quickly test igt library changes. So we can't bail that early. My
patch bails fairly late, but I didn't see a better spot.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 0/4] xf86-video-intel DRI3 and Present patch series

2013-11-26 Thread Keith Packard

Chris Wilson ch...@chris-wilson.co.uk writes:

  [PATCH 1/4] Support 64-bit MSC values. Handle kernel vageries about

 Some spurious assignments that appear to intentially drop the error code
 could be clarified,

I can't find any dropped error codes in this patch to add comments to,
please provide patch excerpts for this review.

 and intel_crtc_msc_to_sequence() is always called
 with a derived current_msc already to hand. The latter present path
 obfuscates its derived current_msc.

Present always computes absolute MSC values and provides those to the
driver, instead of expecting every driver to duplicate that logic.

  [PATCH 2/4] Restructure DRM event handling.

 This won't compile against older Xorg due to xorg_list in the common
 code.

Can switch to intel_list, but that would need list_for_each_entry_safe
added. How many versions back is this supposed to compile against?

  [PATCH 3/4] Add DRI3 and miSyncShm support

 O_CLOEXEC needs protecting, also would appear to be candidate for a
 render-node.

Yes, obviously this wants to use render-node. I haven't had complaints
about O_CLOEXEC from BSD or Solaris developers for libxshmfence; what
systems do not have support for this?

 The imported and exported DRI3 pixmaps need to be pinned
 to prevent the driver using BO exchanges on that pixmap.

I don't understand this comment.

 DRI3 doesn't respect the xorg.conf Option for disabling.

Ok, it should check intel-directRenderingType == DRI_DISABLED.

 A fence is only tied to a
 screen and no XID or Client in particular?

DRI3 fences are screen-specific (otherwise you'd have no way of hooking
the fence to a specific driver).

 So it is a global operation
 akin to intel_flush_callback() which would be called before the Sync
 reply was sent.

Yes, the hardware queue is to be flushed before the Sync event is sent
(and before the xshmfence object is triggered, of course). Note that
this is just the mi version of sync fences, which use libxshmfence; the
driver is free to use different code there. If we find that the code for
handling these xshmfence objects is common across drivers, we can move
that into the X server to share.

  [PATCH 4/4] Add Present extension support

 Yikes. The patch is itself fairly innoculous, but only because the Present
 extension in the server appears to be repeating the worst of DRI2,
 including its original bugs.

Please provide more specific comments here.

 The fallback/non-fullscreen case is not
 synchronised to screen refresh (if the Client so desired), and should
 be passed through to the driver.

The fallback case is synchronized as the Present code triggers the
CopyArea call from the vblank hook. In practice, this has proven
sufficient to get images onto the screen without tearing and without
requiring a huge amount of driver and kernel infrastructure.

 The whole driver interface seems to be too low a level, baking in many
 assumptions, rather the usual approach of providing a set of mi
 routines that the driver can plug into or not as the case may be.

Patches to the X server to change the API for better hardware support
are welcome, of course.

 That the WindowPixmap no longer points to the actual bo leads
 to a few problems, such as the CRTC misconfiguration and GetImage being
 broken after a PresentFlip.

A patch for the X server to fix that has been posted.

 After a vblank_event, Present must check that
 the flip is still valid before execution.

The flip proc may return FALSE to indicate failure of any kind. Present
will then fall-back to a simple blt.

 In the backend it is not clear whether the RRCrtc should be the
 primary CRTC or the only CRTC to flip.

There is only one screen pixmap, so of course every CRTC must flip
together. The CRTC provided indicates which one the MSC is from.

 Damage is processed after the fallback but not the Flip path, the lack
 of Damage notification would upset Prime amongst others.

Sounds easy to fix in the X server.

Thanks for your review!

-- 
keith.pack...@intel.com


pgpVceKq_Q16e.pgp
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 08:55:40PM +0100, Daniel Vetter wrote:
 On Tue, Nov 26, 2013 at 8:53 PM, Damien Lespiau
 damien.lesp...@intel.com wrote:
  On Tue, Nov 26, 2013 at 11:49:43AM -0800, Ben Widawsky wrote:
   Only in the make target he created, not in piglit itself. Imo we should
   have all the testrunner logic in one place, i.e. in the piglit sources.
   -Daniel
 
  Damien, can you comment? I could have sworn you said something different
  on IRC. It sounded like exactly what I wanted.
 
  What Daniel says is correct, the check is part of the runner wrapper,
  not piglit itself.
 
  I'd rather have a environement check up-front and I don't mind where it
  lives (igt Vs piglit).
 
 The problem is that generating the testlist (or printing the commands)
 is a feature QA actually relies on. I also use it occasionally to
 quickly test igt library changes. So we can't bail that early. My
 patch bails fairly late, but I didn't see a better spot.
 -Daniel

I'm confused then about how this really improves my current situation.

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

2013-11-26 Thread Volkin, Bradley D

On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote:
 Hi Brad,
 
 On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote:
  From: Brad Volkin bradley.d.vol...@intel.com
  
  Certain OpenGL features (e.g. transform feedback, performance monitoring)
  require userspace code to submit batches containing commands such as
  MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
  generations of the hardware will noop these commands in unsecure batches
  (which includes all userspace batches submitted via i915) even though the
  commands may be safe and represent the intended programming model of the 
  device.
  
  This series introduces a software command parser similar in operation to the
  command parsing done in hardware for unsecure batches. However, the software
  parser allows some operations that would be noop'd by hardware, if the 
  parser
  determines the operation is safe, and submits the batch as secure to 
  prevent
  hardware parsing. Currently the series implements this on IVB and HSW.
  
  The series is divided into several phases:
  
  patches 01-09: These implement infrastructure and the command parsing 
  algorithm,
 all behind a module parameter. I expect some discussion and
 rework, but hopefully there's nothing too controversial.
  patches 10-17: These define the checks performed by the parser.
 I expect much discussion :)
  patches 18-20: In a final pass over the command checks, I found some issues 
  with
 the definitions. They looked painful to rebase in, so I've 
  added
 them here.
  patches 21-22: These enable the parser by default. It runs on all batches 
  except
 those that set the I915_EXEC_SECURE flag in the execbuffer2 
  call.
 
 I think long-term we should even scan secure batches. We'd need to allow
 some registers which only the drm master (i.e. owner of the display
 hardware) is allowed to do, e.g. for scanline waits. But once we have that
 we should be able to port all current users of secure batches over to
 scanned batches and so enforce this everywhere by default.
 
 The other issue is that igt tests assume to be able to run some evil
 tests, so maybe we don't actually want this.

Agreed. I thought we could handle this as a follow-up task once the basic stuff 
is
in place, particularly given that we'd want to modify at least some users to 
test.
I also wasn't sure if we would want the check to be root  master, as in the 
current
secure flag, or just master.

W.r.t. the tests, I suppose we can just turn checking on for secure batches and 
see
what happens.

 
  There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very
  basic and do not test all of the commands used by the parser on the 
  assumption
  that I'm likely to make the same mistakes in both the parser and the test.
 
 Yeah, I agree that just checking whether commands all go through (or not)
 as expected adds very little value on top of the few tests you have done.
 I think we should take a look at some corner cases which might trip up
 your checker a bit though:
 - I think we should check batchbuffer chaining and make sure it works on
   the vcs ring and not anywhere else (we can't ever break shipping libva
   which uses this).
 - Some tests to trip up your parser should be done, like 3D commands that
   fall off the end of the batch bo. Or commands that span page boundaries.
   The later isn't an issue atm since you use vmap, but we should switch to
   per-page kmap since the vmap overhead is fairly horrible.

Good suggestions. I'll look into these.

 
  I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a 
  few
  days ago), and generally used an Ubuntu 13.10 IVB system with the parser
  running. Aside from a failure described below, I don't think there are any
  regressions. That is, piglit claims some regressions, but from manually 
  running
  the tests I think these are false positives. However, I could use help in
  getting broader testing, particularly around performance. In general, I see 
  less
  than 3% performance impact on HSW, with more like 10% impact for 
  pathological
  batch sizes. But we'll certainly want to run relevant benchmarks beyond what
  I've done.
 
 Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes
 through the checker and bypassing it (with EXEC_SECURE) would be really
 good. Maybe even some variable-sized commands (all the state setup stuff
 should be useful for that) to keep things interesting. Some variation is
 also important to have some good cache thrasing going on (since your check
 tables are fairly large I think).

Ok. I'd be interested in some comment from the mesa and libva guys here for 
real world
workloads, but a microbenchmark would be a good start.

Which state setup stuff are you referring to? Something specific in i-g-t or 
something
more general?

 
  At this point

[Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Daniel Vetter

This is one of the nice pieces that I've never ported from the old
make based test runner. Note that we only use the result of the check
when actually running the testcases so that enumerating tests still
works as non-root on arbitrary machines.

v2: Fail the tests harder.

Cc: Ben Widawsky b...@bwidawsk.net
Requested-by: Ben Widawsky b...@bwidawsk.net
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 tests/igt.tests | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/tests/igt.tests b/tests/igt.tests
index f3884925deaa..df747e3fac78 100644
--- a/tests/igt.tests
+++ b/tests/igt.tests
@@ -28,7 +28,7 @@ import sys
 import subprocess
 
 from os import path
-from framework.core import testBinDir, TestProfile
+from framework.core import testBinDir, TestProfile, TestResult
 from framework.exectest import ExecTest
 
 #
@@ -39,6 +39,24 @@ from framework.exectest import ExecTest
 # automatically add all tests into the 'igt' category.
 #
 
+def checkEnvironment():
+debugfs_path = /sys/kernel/debug/dri
+if os.getuid() != 0:
+print Test Environment check: not root!
+return False
+if not os.path.isdir(debugfs_path):
+print Test Environment check: debugfs not mounted properly!
+return False
+for subdir in os.listdir(debugfs_path):
+clients = open(os.path.join(debugfs_path, subdir, clients), 'r')
+lines = clients.readlines()
+if len(lines)  2:
+print Test Environment check: other drm clients running!
+return False
+
+print Test Environment check: Succeeded.
+return True
+
 if not os.path.exists(os.path.join(testBinDir, 'igt')):
 print igt symlink not found!
 sys.exit(0)
@@ -46,6 +64,8 @@ if not os.path.exists(os.path.join(testBinDir, 'igt')):
 # Chase the piglit/bin/igt symlink to find where the tests really live.
 igtTestRoot = path.join(path.realpath(path.join(testBinDir, 'igt')), 'tests')
 
+igtEnvironmentOk = checkEnvironment()
+
 profile = TestProfile()
 
 class IGTTest(ExecTest):
@@ -54,6 +74,9 @@ class IGTTest(ExecTest):
 self.timeout = 60*20 # 20 minutes deadline by default
 
 def interpretResult(self, out, returncode, results, dmesg):
+if not igtEnvironmentOk:
+return out
+
 if returncode == 0:
 results['result'] = 'dmesg-warn' if dmesg != '' else 'pass'
 elif returncode == 77:
@@ -63,6 +86,13 @@ class IGTTest(ExecTest):
 return out
 def run(self, env):
 env.dmesg = True
+if not igtEnvironmentOk:
+results = TestResult()
+results['result'] = 'fail'
+results['info'] = unicode(Test Environment isn't OK)
+
+return results
+
 return ExecTest.run(self, env)
 
 def listTests(listname):
-- 
1.8.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] tests/igt: Add runtime environment checks

2013-11-26 Thread Daniel Vetter

On Tue, Nov 26, 2013 at 9:02 PM, Ben Widawsky b...@bwidawsk.net wrote:
 The problem is that generating the testlist (or printing the commands)
 is a feature QA actually relies on. I also use it occasionally to
 quickly test igt library changes. So we can't bail that early. My
 patch bails fairly late, but I didn't see a better spot.
 -Daniel

 I'm confused then about how this really improves my current situation.

Quick recap of our irc discussion: I guess I'm trying to solve a
slightly different problem, assuming that we always want some kind of
test result. Hence everything fails (as of v2) and the result json
contains the reason. I think that's the right thing to do for
regression testing in a lab-like setting.

For developers I guess a little script or so some option to yell
louder and faster than this here could still be useful. Otoh there'll
always be test-specific failure modes (like runtime pm not configured
properly) and imo it doesn't make much sense to filter for all of them
beforehand. And otherwise I kinda expect that people will cook their
own scripts anyway to kill X, Wayland and a bunch of deamons that get
in the way and then run piglit with a few default options.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric

2013-11-26 Thread Jesse Barnes

On Tue, 26 Nov 2013 15:10:22 -0800
Jesse Barnes jbar...@virtuousgeek.org wrote:

 On Mon, 11 Nov 2013 10:23:24 +0100
 Daniel Vetter dan...@ffwll.ch wrote:
 
  On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote:
   On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote:
vlv_dpio_read/write should be describe more in PHY centric instead of
display controller centric.
Create a enum dpio_channel for channel index and enum dpio_phy for PHY
index.  This should better to gather for upcoming platform.

v2: Rebase the code based on
drm/i915/vlv: Fix typo in the DPIO register define.

v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro
DPIO_PHY, and remove unrelated change. (Ville)

Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com
Signed-off-by: Chon Ming Lee chon.ming@intel.com
   
   Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Queued for -next, thanks for the patch.
 
 Looks like this one gives me bogus DPIO values at least some of the
 time.  Reverting to using 0x12 as the port ID seems to get me valid
 values back...

Ah looks like the init_dpio happens too late for the mode state
readout.  I'll post a patch to move it up.

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric

2013-11-26 Thread Jesse Barnes

On Mon, 11 Nov 2013 10:23:24 +0100
Daniel Vetter dan...@ffwll.ch wrote:

 On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote:
  On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote:
   vlv_dpio_read/write should be describe more in PHY centric instead of
   display controller centric.
   Create a enum dpio_channel for channel index and enum dpio_phy for PHY
   index.  This should better to gather for upcoming platform.
   
   v2: Rebase the code based on
   drm/i915/vlv: Fix typo in the DPIO register define.
   
   v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro
   DPIO_PHY, and remove unrelated change. (Ville)
   
   Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com
   Signed-off-by: Chon Ming Lee chon.ming@intel.com
  
  Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
 
 Queued for -next, thanks for the patch.

Looks like this one gives me bogus DPIO values at least some of the
time.  Reverting to using 0x12 as the port ID seems to get me valid
values back...

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Ben Widawsky

If we end up calling the shrinker, which in turn requires the OOM
killer, we may end up infinitely waiting for a process to die if the OOM
chooses. The case that this prevents occurs in execbuf. The forked
variants of gem_evict_everything is a good way to hit it. This is
exacerbated by Daniel's recent patch to give OOM precedence to the GEM
tests.

It's a twisted form of a deadlock.

What occurs is the following (assume just 2 procs)
1. proc A gets to execbuf while out of memory, gets struct_mutex.
2. OOM killer comes in and chooses proc B
3. proc B closes it's fds, which requires struct mutex, blocks
4, OOM killer waits for B to die before killing another process (this
part is speculative)

Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Chris Wilson ch...@chris-wilson.co.uk
Signed-off-by: Ben Widawsky b...@bwidawsk.net
---
 drivers/gpu/drm/i915/i915_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fb2d548..a60894d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct 
drm_i915_gem_object *obj)
BUG_ON(obj-base.read_domains  I915_GEM_GPU_DOMAINS);
BUG_ON(obj-base.write_domain  I915_GEM_GPU_DOMAINS);
 
-   st = kmalloc(sizeof(*st), GFP_KERNEL);
+   st = kmalloc(sizeof(*st), GFP_NOWAIT);
if (st == NULL)
return -ENOMEM;
 
page_count = obj-base.size / PAGE_SIZE;
-   if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
+   if (sg_alloc_table(st, page_count, GFP_NOWAIT)) {
kfree(st);
return -ENOMEM;
}
-- 
1.8.4.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
 If we end up calling the shrinker, which in turn requires the OOM
 killer, we may end up infinitely waiting for a process to die if the OOM
 chooses. The case that this prevents occurs in execbuf. The forked
 variants of gem_evict_everything is a good way to hit it. This is
 exacerbated by Daniel's recent patch to give OOM precedence to the GEM
 tests.
 
 It's a twisted form of a deadlock.
 
 What occurs is the following (assume just 2 procs)
 1. proc A gets to execbuf while out of memory, gets struct_mutex.
 2. OOM killer comes in and chooses proc B
 3. proc B closes it's fds, which requires struct mutex, blocks
 4, OOM killer waits for B to die before killing another process (this
 part is speculative)
 

It appears that by itself, this patch is insufficient to prevent the
failure when run in piglit. Before I go on a wild goose chase of finding
all allocations, maybe I'll give people a chance to chime in. The
symptom is the same always, OOM, procs can't die because struct_mutex is
held.

 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Chris Wilson ch...@chris-wilson.co.uk
 Signed-off-by: Ben Widawsky b...@bwidawsk.net
 ---
  drivers/gpu/drm/i915/i915_gem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
 index fb2d548..a60894d 100644
 --- a/drivers/gpu/drm/i915/i915_gem.c
 +++ b/drivers/gpu/drm/i915/i915_gem.c
 @@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct 
 drm_i915_gem_object *obj)
   BUG_ON(obj-base.read_domains  I915_GEM_GPU_DOMAINS);
   BUG_ON(obj-base.write_domain  I915_GEM_GPU_DOMAINS);
  
 - st = kmalloc(sizeof(*st), GFP_KERNEL);
 + st = kmalloc(sizeof(*st), GFP_NOWAIT);
   if (st == NULL)
   return -ENOMEM;
  
   page_count = obj-base.size / PAGE_SIZE;
 - if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 + if (sg_alloc_table(st, page_count, GFP_NOWAIT)) {
   kfree(st);
   return -ENOMEM;
   }
 -- 
 1.8.4.2
 

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 05:10:38PM -0800, Ben Widawsky wrote:
 On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
  If we end up calling the shrinker, which in turn requires the OOM
  killer, we may end up infinitely waiting for a process to die if the OOM
  chooses. The case that this prevents occurs in execbuf. The forked
  variants of gem_evict_everything is a good way to hit it. This is
  exacerbated by Daniel's recent patch to give OOM precedence to the GEM
  tests.
  
  It's a twisted form of a deadlock.
  
  What occurs is the following (assume just 2 procs)
  1. proc A gets to execbuf while out of memory, gets struct_mutex.
  2. OOM killer comes in and chooses proc B
  3. proc B closes it's fds, which requires struct mutex, blocks
  4, OOM killer waits for B to die before killing another process (this
  part is speculative)
  
 
 It appears that by itself, this patch is insufficient to prevent the
 failure when run in piglit. Before I go on a wild goose chase of finding
 all allocations, maybe I'll give people a chance to chime in. The
 symptom is the same always, OOM, procs can't die because struct_mutex is
 held.
 

too late on the goose chase. vma allocation was the missing bit.

  Cc: Daniel Vetter daniel.vet...@ffwll.ch
  Cc: Chris Wilson ch...@chris-wilson.co.uk
  Signed-off-by: Ben Widawsky b...@bwidawsk.net
  ---
   drivers/gpu/drm/i915/i915_gem.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_gem.c 
  b/drivers/gpu/drm/i915/i915_gem.c
  index fb2d548..a60894d 100644
  --- a/drivers/gpu/drm/i915/i915_gem.c
  +++ b/drivers/gpu/drm/i915/i915_gem.c
  @@ -1842,12 +1842,12 @@ i915_gem_object_get_pages_gtt(struct 
  drm_i915_gem_object *obj)
  BUG_ON(obj-base.read_domains  I915_GEM_GPU_DOMAINS);
  BUG_ON(obj-base.write_domain  I915_GEM_GPU_DOMAINS);
   
  -   st = kmalloc(sizeof(*st), GFP_KERNEL);
  +   st = kmalloc(sizeof(*st), GFP_NOWAIT);
  if (st == NULL)
  return -ENOMEM;
   
  page_count = obj-base.size / PAGE_SIZE;
  -   if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
  +   if (sg_alloc_table(st, page_count, GFP_NOWAIT)) {
  kfree(st);
  return -ENOMEM;
  }
  -- 
  1.8.4.2
  
 
 -- 
 Ben Widawsky, Intel Open Source Technology Center
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

2013-11-26 Thread Xiang, Haihao

On Tue, 2013-11-26 at 20:35 +0100, Daniel Vetter wrote: 
 Hi Brad,
 
 On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote:
  From: Brad Volkin bradley.d.vol...@intel.com
  
  Certain OpenGL features (e.g. transform feedback, performance monitoring)
  require userspace code to submit batches containing commands such as
  MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
  generations of the hardware will noop these commands in unsecure batches
  (which includes all userspace batches submitted via i915) even though the
  commands may be safe and represent the intended programming model of the 
  device.
  
  This series introduces a software command parser similar in operation to the
  command parsing done in hardware for unsecure batches. However, the software
  parser allows some operations that would be noop'd by hardware, if the 
  parser
  determines the operation is safe, and submits the batch as secure to 
  prevent
  hardware parsing. Currently the series implements this on IVB and HSW.
  
  The series is divided into several phases:
  
  patches 01-09: These implement infrastructure and the command parsing 
  algorithm,
 all behind a module parameter. I expect some discussion and
 rework, but hopefully there's nothing too controversial.
  patches 10-17: These define the checks performed by the parser.
 I expect much discussion :)
  patches 18-20: In a final pass over the command checks, I found some issues 
  with
 the definitions. They looked painful to rebase in, so I've 
  added
 them here.
  patches 21-22: These enable the parser by default. It runs on all batches 
  except
 those that set the I915_EXEC_SECURE flag in the execbuffer2 
  call.
 
 I think long-term we should even scan secure batches. We'd need to allow
 some registers which only the drm master (i.e. owner of the display
 hardware) is allowed to do, e.g. for scanline waits. But once we have that
 we should be able to port all current users of secure batches over to
 scanned batches and so enforce this everywhere by default.
 
 The other issue is that igt tests assume to be able to run some evil
 tests, so maybe we don't actually want this.
 
  There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very
  basic and do not test all of the commands used by the parser on the 
  assumption
  that I'm likely to make the same mistakes in both the parser and the test.
 
 Yeah, I agree that just checking whether commands all go through (or not)
 as expected adds very little value on top of the few tests you have done.
 I think we should take a look at some corner cases which might trip up
 your checker a bit though:
 - I think we should check batchbuffer chaining and make sure it works on
   the vcs ring and not anywhere else (we can't ever break shipping libva
   which uses this).

Besides the vcs ring, we also use batchbuffer chaining on the render
ring for video post processing, video motion estimation and motion
compensation(on ILK),  A fixed length batch buffer isn't suitable for
those operations as those operations are based on a macroblock instead
of a frame. It would be better to make sure batchbuffer chaining works
on the render ring too.


 - Some tests to trip up your parser should be done, like 3D commands that
   fall off the end of the batch bo. Or commands that span page boundaries.
   The later isn't an issue atm since you use vmap, but we should switch to
   per-page kmap since the vmap overhead is fairly horrible.
 
  I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a 
  few
  days ago), and generally used an Ubuntu 13.10 IVB system with the parser
  running. Aside from a failure described below, I don't think there are any
  regressions. That is, piglit claims some regressions, but from manually 
  running
  the tests I think these are false positives. However, I could use help in
  getting broader testing, particularly around performance. In general, I see 
  less
  than 3% performance impact on HSW, with more like 10% impact for 
  pathological
  batch sizes. But we'll certainly want to run relevant benchmarks beyond what
  I've done.
 
 Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes
 through the checker and bypassing it (with EXEC_SECURE) would be really
 good. Maybe even some variable-sized commands (all the state setup stuff
 should be useful for that) to keep things interesting. Some variation is
 also important to have some good cache thrasing going on (since your check
 tables are fairly large I think).
 
  At this point there are a couple of known issues and potential improvements.
  
  1) VLV. The parser is currently disabled for VLV. One type of check 
  performed by
 the parser is that commands which access memory do so via PPGTT. VLV 
  does not
 have PPGTT enabled at this time. I chose to implement the

Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

2013-11-26 Thread ykzhao

On Tue, 2013-11-26 at 13:24 -0700, Volkin, Bradley D wrote:
 On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote:
  Hi Brad,
  
  On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote:
   From: Brad Volkin bradley.d.vol...@intel.com
   
   Certain OpenGL features (e.g. transform feedback, performance monitoring)
   require userspace code to submit batches containing commands such as
   MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
   generations of the hardware will noop these commands in unsecure batches
   (which includes all userspace batches submitted via i915) even though the
   commands may be safe and represent the intended programming model of the 
   device.
   
   This series introduces a software command parser similar in operation to 
   the
   command parsing done in hardware for unsecure batches. However, the 
   software
   parser allows some operations that would be noop'd by hardware, if the 
   parser
   determines the operation is safe, and submits the batch as secure to 
   prevent
   hardware parsing. Currently the series implements this on IVB and HSW.
   
   The series is divided into several phases:
   
   patches 01-09: These implement infrastructure and the command parsing 
   algorithm,
  all behind a module parameter. I expect some discussion and
rework, but hopefully there's nothing too controversial.
   patches 10-17: These define the checks performed by the parser.
  I expect much discussion :)
   patches 18-20: In a final pass over the command checks, I found some 
   issues with
  the definitions. They looked painful to rebase in, so I've 
   added
them here.
   patches 21-22: These enable the parser by default. It runs on all batches 
   except
  those that set the I915_EXEC_SECURE flag in the 
   execbuffer2 call.
  
  I think long-term we should even scan secure batches. We'd need to allow
  some registers which only the drm master (i.e. owner of the display
  hardware) is allowed to do, e.g. for scanline waits. But once we have that
  we should be able to port all current users of secure batches over to
  scanned batches and so enforce this everywhere by default.
  
  The other issue is that igt tests assume to be able to run some evil
  tests, so maybe we don't actually want this.
 
 Agreed. I thought we could handle this as a follow-up task once the basic 
 stuff is
 in place, particularly given that we'd want to modify at least some users to 
 test.
 I also wasn't sure if we would want the check to be root  master, as in the 
 current
 secure flag, or just master.
 
 W.r.t. the tests, I suppose we can just turn checking on for secure batches 
 and see
 what happens.
 
  
   There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are 
   very
   basic and do not test all of the commands used by the parser on the 
   assumption
   that I'm likely to make the same mistakes in both the parser and the test.
  
  Yeah, I agree that just checking whether commands all go through (or not)
  as expected adds very little value on top of the few tests you have done.
  I think we should take a look at some corner cases which might trip up
  your checker a bit though:
  - I think we should check batchbuffer chaining and make sure it works on
the vcs ring and not anywhere else (we can't ever break shipping libva
which uses this).
  - Some tests to trip up your parser should be done, like 3D commands that
fall off the end of the batch bo. Or commands that span page boundaries.
The later isn't an issue atm since you use vmap, but we should switch to
per-page kmap since the vmap overhead is fairly horrible.
 
 Good suggestions. I'll look into these.
Hi, Brad
  More inputs from libva about the batchbuffer chaining.

  Now the batchbuffer chaining is widely used in libva driver. This
is related with how the libva driver processes the image. For the
encoding purpose, it needs to be handled based on macroblock(16x16).And
every macroblock needs a group of GPU commands. So the GPU commands for
all the macroblocks will be constructed in the second-level batchbuffer.
The mode of batchbuffer chaining will bring the following benefits:
  a. The size of second-level batch buffer can be allocated based on
the size of handled image. For example: 1080p/720p/480p can use the
different size.
  b. The gpu commands in second-level batchbuffer can be constructed
by using GPU instead of CPU, which is helpful to improve the
performance. 

  At the same time both VCS and Render Ring are used in libva
driver. For example: The encoding will use VCS and RCS ring. Firstly the
RCS ring is used to execute GPU command for the motion vector/mode
prediction. And then the VCS Ring is used to execute the GPU command for
generating the bit-stream. So not only VCS ring uses the mode of
batchbuffer chaining, but also the Render

Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
 If we end up calling the shrinker, which in turn requires the OOM
 killer, we may end up infinitely waiting for a process to die if the OOM
 chooses. The case that this prevents occurs in execbuf. The forked
 variants of gem_evict_everything is a good way to hit it. This is
 exacerbated by Daniel's recent patch to give OOM precedence to the GEM
 tests.
 
 It's a twisted form of a deadlock.
 
 What occurs is the following (assume just 2 procs)
 1. proc A gets to execbuf while out of memory, gets struct_mutex.
 2. OOM killer comes in and chooses proc B
 3. proc B closes it's fds, which requires struct mutex, blocks
 4, OOM killer waits for B to die before killing another process (this
 part is speculative)
 
 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Chris Wilson ch...@chris-wilson.co.uk
 Signed-off-by: Ben Widawsky b...@bwidawsk.net

I'd still like to know if I am crazy, but I'm now trying to defer the
stuff we do on file close without using any allocs. Just an update...

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Daniel Vetter

On Wed, Nov 27, 2013 at 5:23 AM, Ben Widawsky b...@bwidawsk.net wrote:
 On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
 If we end up calling the shrinker, which in turn requires the OOM
 killer, we may end up infinitely waiting for a process to die if the OOM
 chooses. The case that this prevents occurs in execbuf. The forked
 variants of gem_evict_everything is a good way to hit it. This is
 exacerbated by Daniel's recent patch to give OOM precedence to the GEM
 tests.

 It's a twisted form of a deadlock.

 What occurs is the following (assume just 2 procs)
 1. proc A gets to execbuf while out of memory, gets struct_mutex.
 2. OOM killer comes in and chooses proc B
 3. proc B closes it's fds, which requires struct mutex, blocks
 4, OOM killer waits for B to die before killing another process (this
 part is speculative)

 Cc: Daniel Vetter daniel.vet...@ffwll.ch
 Cc: Chris Wilson ch...@chris-wilson.co.uk
 Signed-off-by: Ben Widawsky b...@bwidawsk.net

 I'd still like to know if I am crazy, but I'm now trying to defer the
 stuff we do on file close without using any allocs. Just an update...

Sound's intrigueing, but tbh I don't really have clue about things.
What about adding the relevant stuck task backtraces to the patch and
submitting this to a wider audience (lkml, mm-devel) as an akpm-probe?
The more botched the patch, the better the probe usually.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Allocate atomically in execbuf path

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 08:23:46PM -0800, Ben Widawsky wrote:
 On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
  If we end up calling the shrinker, which in turn requires the OOM
  killer, we may end up infinitely waiting for a process to die if the OOM
  chooses. The case that this prevents occurs in execbuf. The forked
  variants of gem_evict_everything is a good way to hit it. This is
  exacerbated by Daniel's recent patch to give OOM precedence to the GEM
  tests.
  
  It's a twisted form of a deadlock.
  
  What occurs is the following (assume just 2 procs)
  1. proc A gets to execbuf while out of memory, gets struct_mutex.
  2. OOM killer comes in and chooses proc B
  3. proc B closes it's fds, which requires struct mutex, blocks
  4, OOM killer waits for B to die before killing another process (this
  part is speculative)
  
  Cc: Daniel Vetter daniel.vet...@ffwll.ch
  Cc: Chris Wilson ch...@chris-wilson.co.uk
  Signed-off-by: Ben Widawsky b...@bwidawsk.net
 
 I'd still like to know if I am crazy, but I'm now trying to defer the
 stuff we do on file close without using any allocs. Just an update...
 

workqueue still has similar problems. It could be because deferring the
context cleanup means we don't actually free much space, and so the OOM
isn't enough, or [more likely] something else is going on.

Maybe it's my bug. I am really out of ideas at the moment. The system
just becomes unresponsive after all threads end up blocked waiting for
struct mutex. I know I'd seen such problems in the past with
gem_evict_everything, but this time around I seem to be the sole cause
(and not all my machines hit it).

Sorry for the noise - just totally burning out on this one.

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/2] drm/i915/vlv: Make the vlv_dpio_read/vlv_dpio_write more PHY centric

2013-11-26 Thread Daniel Vetter

On Wed, Nov 27, 2013 at 12:18 AM, Jesse Barnes jbar...@virtuousgeek.org wrote:
 On Tue, 26 Nov 2013 15:10:22 -0800
 Jesse Barnes jbar...@virtuousgeek.org wrote:

 On Mon, 11 Nov 2013 10:23:24 +0100
 Daniel Vetter dan...@ffwll.ch wrote:

  On Wed, Nov 06, 2013 at 12:51:05PM +0200, Ville Syrjälä wrote:
   On Wed, Nov 06, 2013 at 02:36:35PM +0800, Chon Ming Lee wrote:
vlv_dpio_read/write should be describe more in PHY centric instead of
display controller centric.
Create a enum dpio_channel for channel index and enum dpio_phy for PHY
index.  This should better to gather for upcoming platform.
   
v2: Rebase the code based on
drm/i915/vlv: Fix typo in the DPIO register define.
   
v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro
DPIO_PHY, and remove unrelated change. (Ville)
   
Suggested-by: Ville Syrjälä ville.syrj...@linux.intel.com
Signed-off-by: Chon Ming Lee chon.ming@intel.com
  
   Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com
 
  Queued for -next, thanks for the patch.

 Looks like this one gives me bogus DPIO values at least some of the
 time.  Reverting to using 0x12 as the port ID seems to get me valid
 values back...

 Ah looks like the init_dpio happens too late for the mode state
 readout.  I'll post a patch to move it up.

Isn't that just because your fb reconstruction patches moves it up by
a lot? If so can you please extract just that from your patches? I was
wondering whether we should do that due to the usual init ordering fun
anyway. I'd prefer to let that one soak a few days in dinq or so
before pulling in the hairy stuff.

Aside: The s/intelfbdev.ifb/intelfbdev-fb/ conversion would also be
neater split out.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: Optionally defer context closing

2013-11-26 Thread Ben Widawsky

This patch doesn't seem to be the panacea that I had hoped, although
I've yet to thoroughly test if it actually improves some of the other
tests which caused trouble.

What occurs is the following (assume just 2 procs)
1. proc A gets to execbuf while out of memory, gets struct_mutex.
2. OOM killer comes in and chooses proc B
3. proc B closes it's fds, which requires struct mutex, blocks
4, OOM killer waits for B to die before killing another process (this
part is speculative)

In order to let the OOM complete, we defer processing the context
destruction parts which are those that require struct_mutex.

This patch has not really been cleaned up, I am only posting it for
posterity. (Several of the hunks are only relevant to full PPGTT patches
which are not merged).

Signed-off-by: Ben Widawsky b...@bwidawsk.net
---
 drivers/gpu/drm/i915/i915_debugfs.c |  6 ++-
 drivers/gpu/drm/i915/i915_drv.h |  9 +++--
 drivers/gpu/drm/i915/i915_gem.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 68 -
 4 files changed, 61 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 835a43e..e6b5f3e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1717,11 +1717,13 @@ static void gen6_ppgtt_info(struct seq_file *m, struct 
drm_device *dev)
struct drm_i915_file_private *file_priv = file-driver_priv;
struct i915_hw_ppgtt *pvt_ppgtt;
 
-   pvt_ppgtt = ctx_to_ppgtt(file_priv-private_default_ctx);
+   pvt_ppgtt =
+   ctx_to_ppgtt(file_priv-ctx_info-private_default_ctx);
seq_printf(m, proc: %s\n,
   get_pid_task(file-pid, PIDTYPE_PID)-comm);
seq_puts(m,   default context:\n);
-   idr_for_each(file_priv-context_idr, per_file_ctx, m);
+   idr_for_each(file_priv-ctx_info-context_idr, per_file_ctx,
+m);
}
seq_printf(m, ECOCHK: 0x%08x\n, I915_READ(GAM_ECOCHK));
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6fdab40..931fc42 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1735,15 +1735,18 @@ struct drm_i915_gem_request {
 
 struct drm_i915_file_private {
struct drm_i915_private *dev_priv;
-
struct {
spinlock_t lock;
struct list_head request_list;
struct delayed_work idle_work;
} mm;
-   struct idr context_idr;
+   struct i915_gem_ctx_info {
+   struct drm_i915_private *dev_priv;
+   struct idr context_idr;
+   struct work_struct close_work;
+   struct i915_hw_context *private_default_ctx;
+   } *ctx_info;
 
-   struct i915_hw_context *private_default_ctx;
atomic_t rps_wait_boost;
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6312d61..a32f6df 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2326,7 +2326,7 @@ static void i915_set_reset_status(struct 
intel_ring_buffer *ring,
if (request-ctx  request-ctx-id != DEFAULT_CONTEXT_ID)
hs = request-ctx-hang_stats;
else if (request-file_priv)
-   hs = request-file_priv-private_default_ctx-hang_stats;
+   hs = 
request-file_priv-ctx_info-private_default_ctx-hang_stats;
 
if (hs) {
if (guilty) {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index f55f0a9..770b394 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -209,8 +209,8 @@ __create_hw_context(struct drm_device *dev,
if (file_priv == NULL)
return ctx;
 
-   ret = idr_alloc(file_priv-context_idr, ctx, DEFAULT_CONTEXT_ID, 0,
-   GFP_KERNEL);
+   ret = idr_alloc(file_priv-ctx_info-context_idr,
+   ctx, DEFAULT_CONTEXT_ID, 0, GFP_KERNEL);
if (ret  0)
goto err_out;
 
@@ -481,29 +481,54 @@ static int context_idr_cleanup(int id, void *p, void 
*data)
return 0;
 }
 
+static void i915_gem_context_close_work_handler(struct work_struct *work)
+{
+   struct i915_gem_ctx_info *ctx_info = container_of(work,
+ struct 
i915_gem_ctx_info,
+ close_work);
+   struct drm_i915_private *dev_priv = ctx_info-dev_priv;
+
+   mutex_lock(dev_priv-dev-struct_mutex);
+   idr_for_each(ctx_info-context_idr, context_idr_cleanup, NULL);
+   i915_gem_context_unreference(ctx_info-private_default_ctx);
+   idr_destroy(ctx_info-context_idr);
+   mutex_unlock(dev_priv-dev-struct_mutex);
+
+

Re: [Intel-gfx] [PATCH] drm/i915: Pin relocations for the duration of constructing the execbuffer

2013-11-26 Thread Ben Widawsky

On Tue, Nov 26, 2013 at 11:23:15AM +, Chris Wilson wrote:
 As the execbuffer dispatch grows ever more complex and involves multiple
 stages of moving objects into the aperture, we need to take greater care
 that we do not evict our execbuffer objects prior to dispatch. This is
 relatively simple as we can just keep the objects pinned for not just
 the relocation but until we are finished.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Ben Widawsky benjamin.widaw...@intel.com
 Cc: Daniel Vetter dan...@ffwll.ch
 Cc: sta...@vger.kernel.org

To be honest, I don't quite see the actual issue, but the problem
description, and solution make sense to me. I've also been running with
the patch quite a bit on HSW.

Acked-by: Ben Widawsky b...@bwidawsk.net
Tested-by: Ben Widawsky b...@bwidawsk.net

 ---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 60 
 --
  1 file changed, 32 insertions(+), 28 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
 b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 index 885d595e0e02..b7e787fb4649 100644
 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
 @@ -33,6 +33,9 @@
  #include intel_drv.h
  #include linux/dma_remapping.h
  
 +#define  __EXEC_OBJECT_HAS_PIN (131)
 +#define  __EXEC_OBJECT_HAS_FENCE (130)
 +
  struct eb_vmas {
   struct list_head vmas;
   int and;
 @@ -187,7 +190,28 @@ static struct i915_vma *eb_get_vma(struct eb_vmas *eb, 
 unsigned long handle)
   }
  }
  
 -static void eb_destroy(struct eb_vmas *eb) {
 +static void
 +i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
 +{
 + struct drm_i915_gem_exec_object2 *entry;
 + struct drm_i915_gem_object *obj = vma-obj;
 +
 + if (!drm_mm_node_allocated(vma-node))
 + return;
 +
 + entry = vma-exec_entry;
 +
 + if (entry-flags  __EXEC_OBJECT_HAS_FENCE)
 + i915_gem_object_unpin_fence(obj);
 +
 + if (entry-flags  __EXEC_OBJECT_HAS_PIN)
 + i915_gem_object_unpin(obj);
 +
 + entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
 +}
 +
 +static void eb_destroy(struct eb_vmas *eb)
 +{
   while (!list_empty(eb-vmas)) {
   struct i915_vma *vma;
  
 @@ -195,6 +219,7 @@ static void eb_destroy(struct eb_vmas *eb) {
  struct i915_vma,
  exec_list);
   list_del_init(vma-exec_list);
 + i915_gem_execbuffer_unreserve_vma(vma);
   drm_gem_object_unreference(vma-obj-base);
   }
   kfree(eb);
 @@ -478,9 +503,6 @@ i915_gem_execbuffer_relocate(struct eb_vmas *eb,
   return ret;
  }
  
 -#define  __EXEC_OBJECT_HAS_PIN (131)
 -#define  __EXEC_OBJECT_HAS_FENCE (130)
 -
  static int
  need_reloc_mappable(struct i915_vma *vma)
  {
 @@ -552,26 +574,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
   return 0;
  }
  
 -static void
 -i915_gem_execbuffer_unreserve_vma(struct i915_vma *vma)
 -{
 - struct drm_i915_gem_exec_object2 *entry;
 - struct drm_i915_gem_object *obj = vma-obj;
 -
 - if (!drm_mm_node_allocated(vma-node))
 - return;
 -
 - entry = vma-exec_entry;
 -
 - if (entry-flags  __EXEC_OBJECT_HAS_FENCE)
 - i915_gem_object_unpin_fence(obj);
 -
 - if (entry-flags  __EXEC_OBJECT_HAS_PIN)
 - i915_gem_object_unpin(obj);
 -
 - entry-flags = ~(__EXEC_OBJECT_HAS_FENCE | __EXEC_OBJECT_HAS_PIN);
 -}
 -
  static int
  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
   struct list_head *vmas,
 @@ -670,13 +672,14 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer 
 *ring,
   goto err;
   }
  
 -err: /* Decrement pin count for bound objects */
 - list_for_each_entry(vma, vmas, exec_list)
 - i915_gem_execbuffer_unreserve_vma(vma);
 -
 +err:
   if (ret != -ENOSPC || retry++)
   return ret;
  
 + /* Decrement pin count for bound objects */
 + list_for_each_entry(vma, vmas, exec_list)
 + i915_gem_execbuffer_unreserve_vma(vma);
 +
   ret = i915_gem_evict_vm(vm, true);
   if (ret)
   return ret;
 @@ -708,6 +711,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
   while (!list_empty(eb-vmas)) {
   vma = list_first_entry(eb-vmas, struct i915_vma, exec_list);
   list_del_init(vma-exec_list);
 + i915_gem_execbuffer_unreserve_vma(vma);
   drm_gem_object_unreference(vma-obj-base);
   }
  
 -- 
 1.8.4.4
 

-- 
Ben Widawsky, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

89 matches

Mail list logo