[Intel-gfx] [PATCH] drm/i915: error_buffer->ring should be signed

2012-02-16 Thread Daniel Vetter
gcc seems to get uber-anal recently about these things.

Reported-by: Dan Carpenter dan.carpen...@oracle.com
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_drv.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b839728..35833fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -199,7 +199,7 @@ struct drm_i915_error_state {
u32 tiling:2;
u32 dirty:1;
u32 purgeable:1;
-   u32 ring:4;
+   s32 ring:4;
u32 cache_level:2;
} *active_bo, *pinned_bo;
u32 active_bo_count, pinned_bo_count;
-- 
1.7.9

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: error_buffer->ring should be signed

2012-02-16 Thread Paul Menzel
On Thursday, 2012-02-16, at 11:03 +0100, Daniel Vetter wrote:
 gcc seems to get uber-anal recently about these things.

which was introduced by the following commit.

96154f2faba5: drm/i915: switch ring->id to be a real id

 Reported-by: Dan Carpenter dan.carpen...@oracle.com

The URL of the report is the following.

http://lists.freedesktop.org/archives/dri-devel/2012-February/019183.html

 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch

Acked-by: Paul Menzel paulepan...@users.sourceforge.net

 ---
  drivers/gpu/drm/i915/i915_drv.h |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index b839728..35833fc 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -199,7 +199,7 @@ struct drm_i915_error_state {
   u32 tiling:2;
   u32 dirty:1;
   u32 purgeable:1;
 - u32 ring:4;
 + s32 ring:4;
   u32 cache_level:2;
   } *active_bo, *pinned_bo;
   u32 active_bo_count, pinned_bo_count;


Thanks,

Paul




Re: [Intel-gfx] [PATCH 06/11] drm/i915/context: ringbuffer context switch code

2012-02-16 Thread Ben Widawsky
On Wed, 15 Feb 2012 12:11:58 -0800
Eric Anholt e...@anholt.net wrote:

 On Tue, 14 Feb 2012 22:09:13 +0100, Ben Widawsky b...@bwidawsk.net wrote:
  This is the HW dependent context switch code.
  
  Signed-off-by: Ben Widawsky b...@bwidawsk.net
  ---
   drivers/gpu/drm/i915/i915_drv.h |3 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  117 +++
   drivers/gpu/drm/i915/intel_ringbuffer.h |6 ++-
   3 files changed, 125 insertions(+), 1 deletions(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
  index 34e6f4f..4175929 100644
  --- a/drivers/gpu/drm/i915/i915_drv.h
  +++ b/drivers/gpu/drm/i915/i915_drv.h
  @@ -965,6 +965,9 @@ struct drm_i915_gem_context {
  bool is_initialized;
   };
   
  +#define I915_CONTEXT_NORMAL_SWITCH (1 << 0)
  +#define I915_CONTEXT_SAVE_ONLY (1 << 1)
  +#define I915_CONTEXT_FORCED_SWITCH (1 << 2)
   #define INTEL_INFO(dev) (((struct drm_i915_private *) (dev)->dev_private)->info)
   
   #define IS_I830(dev)   ((dev)->pci_device == 0x3577)
  diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
  index e71e7fc..dcdc80e 100644
  --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
  +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
  @@ -942,6 +942,122 @@ render_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
  return 0;
   }
   
  +static int do_ring_switch(struct intel_ring_buffer *ring,
  + struct drm_i915_gem_context *new_context,
  + u32 hw_flags)
 
 Can we call this function do_mi_set_context()?  It doesn't look like it
 has to do with switching rings.

Sure.

 
  +{
   +   struct drm_device *dev = ring->dev;
   +   int ret = 0;
   +
   +   if (!new_context->is_initialized) {
   +   ret = ring->flush(ring, 0, 0);
   +   if (ret)
   +   return ret;
   +
   +   ret = intel_ring_begin(ring, 2);
   +   if (ret)
   +   return ret;
   +
   +   intel_ring_emit(ring, MI_NOOP | (1 << 22) | new_context->id);
  +   intel_ring_emit(ring, MI_NOOP);
  +   intel_ring_advance(ring);
  +   }
 
 Not sure what this block is doing, nor have the docs enlightened me.
 Comment?

This incantation came from a document which I can no longer find. I'll
try to remove it and see if anything breaks. It's likely this was from
an old ILK doc if you're curious enough to look (mine doesn't appear old
enough)

 
   +   if (IS_GEN6(dev) && new_context->is_initialized &&
   +   ring->itlb_before_ctx_switch) {
  +   /* w/a: If Flush TLB Invalidation Mode is enabled, driver must
  +* do a TLB invalidation prior to MI_SET_CONTEXT
  +*/
  +   gen6_render_ring_flush(ring, 0, 0);
  +   }
  +
  +   ret = intel_ring_begin(ring, 6);
  +   if (ret)
  +   return ret;
  +
  +   intel_ring_emit(ring, MI_NOOP);
  +
   +   switch (INTEL_INFO(dev)->gen) {
  +   case 5:
  +   intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
  +   break;
  +   case 6:
  +   intel_ring_emit(ring, MI_NOOP);
  +   break;
  +   case 7:
  +   intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
  +   break;
  +   case 4:
  +   default:
  +   BUG();
  +   }
 
 I can't see what this MI_ARB_ON_OFF is about.  We don't use run lists,
 so preemption can't occur, right?  And if it's needed on gen7, why isn't
 it needed on the previous chipsets, where the command apparently exists
 as well?

Just following the docs to the letter on the ARB_ON_OFF thing. We may
need the TLB flush on gen7 as well as gen6. I choose the BSPEC
workaround list as the ultimate guide. As best I can tell, the logic is
correct according to that.

 
 Also, MI_SUSPEND_FLUSH?  (also exists on all chispets) It Blocks MMIO
 sync flush or any flushes related to VT-d while enabled.  We don't use
 sync flushes, and presumably if we do VT-d related flushes, we still
 want them to work, right?  Why do hardware render contexts change that?

I can't find this one either anymore. This I very clearly recall as a
required workaround for ILK. Since I'm not exposing this currently for
less than GEN6 this is a don't care. My current ILK docs are not the
same as the ones I used when I wrote this. If you look at
ironlake_enable_rc6() you can also see this workaround used.

 
  +   intel_ring_emit(ring, MI_SET_CONTEXT);
   +   intel_ring_emit(ring, new_context->obj->gtt_offset |
  +   MI_MM_SPACE_GTT |
  +   MI_SAVE_EXT_STATE_EN |
  +   MI_RESTORE_EXT_STATE_EN |
  +   hw_flags);
 
 
  +static struct drm_i915_gem_context *
  +render_ring_context_switch(struct intel_ring_buffer *ring,
  +  struct drm_i915_gem_context *new_context,
  +  u32 flags)
  +{
   +   struct drm_device *dev = ring->dev;
  +   bool force = (flags  

[Intel-gfx] i915 SNB: hotplug events when charging causing poor interactivity

2012-02-16 Thread Daniel J Blueman
Keith, Eric,

When charging Dell E5420 laptops, I see the Sandy Bridge South Display
Engine port C hotplug interrupt fire consistently at around 20Hz. Each
scan on the connectors results in ~100ms hold time for the mode_config
mutex which blocks eg the cursor set ioctl (due to the I2C read
timeouts across the connectors), resulting in terrible GUI
interactivity.

Notably, when only using battery or only using AC, the port C
interrupts don't fire. It feels like the platform is using South
Display Port C for GPIO/I2C; setting the port C pulse duration from
2ms to 100ms doesn't change the behaviour. I'll dump off the GPIO
settings, but what else is good to debug this?

Also, have you come across this kind of pattern before, eg a platform
using these GPIO ports for something (in this case, it feels like the
EC/battery)? Judging by the lack of any quirks, I'd say not. Perhaps
also I should dump off the SDE port C IIR mask register before we
reset it, in case the BIOS intentionally masks out port C hotplug
events.

Thanks,
 Daniel
--
Daniel J Blueman


[Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-16 Thread Daniel Vetter
drm/i915 wants to read/write more than one page in its fastpath
and hence needs to prefault more than PAGE_SIZE bytes.

I've checked the callsites and they all already clamp size when
calling fault_in_pages_* to the same as for the subsequent
__copy_to|from_user and hence don't rely on the implicit clamping
to PAGE_SIZE.

Also kill a copypasted spurious space in both functions while at it.

Cc: linux...@kvack.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 include/linux/pagemap.h |   28 ++--
 1 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index cfaaa69..689527d 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter);
 static inline int fault_in_pages_writeable(char __user *uaddr, int size)
 {
int ret;
+   char __user *end = uaddr + size - 1;
 
if (unlikely(size == 0))
return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
 * Writing zeroes into userspace here is OK, because we know that if
 * the zero gets there, we'll be overwriting it.
 */
-   ret = __put_user(0, uaddr);
+   while (uaddr <= end) {
+   ret = __put_user(0, uaddr);
+   if (ret != 0)
+   return ret;
+   uaddr += PAGE_SIZE;
+   }
if (ret == 0) {
-   char __user *end = uaddr + size - 1;
-
/*
 * If the page was already mapped, this will get a cache miss
 * for sure, so try to avoid doing it.
 */
-   if (((unsigned long)uaddr & PAGE_MASK) !=
+   if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK))
-   ret = __put_user(0, end);
+   ret = __put_user(0, end);
}
return ret;
 }
@@ -435,17 +439,21 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size)
 {
volatile char c;
int ret;
+   const char __user *end = uaddr + size - 1;
 
if (unlikely(size == 0))
return 0;
 
-   ret = __get_user(c, uaddr);
+   while (uaddr <= end) {
+   ret = __get_user(c, uaddr);
+   if (ret != 0)
+   return ret;
+   uaddr += PAGE_SIZE;
+   }
if (ret == 0) {
-   const char __user *end = uaddr + size - 1;
-
-   if (((unsigned long)uaddr & PAGE_MASK) !=
+   if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK)) {
-   ret = __get_user(c, end);
+   ret = __get_user(c, end);
(void)c;
}
}
-- 
1.7.7.5



Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support

2012-02-16 Thread Ben Widawsky
On Wed, 15 Feb 2012 12:33:38 -0800
Eric Anholt e...@anholt.net wrote:

 On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net wrote:
  These patches are a heavily revised version of the patches I wrote over
  a year ago. These patches have passed basic tests on SNB, and IVB, and
  older versions worked on ILK.  In theory, context support should work
  all the way back to Gen4, but I haven't tested it. Also, since I suspect
  ILK may be unstable, the code has it disabled for now.
  
  HW contexts provide a way for the GPU to save and restore certain state
  in between batchbuffer boundaries. Typically, GPU clients must re-emit
  the entire state every time they run because the client does not know
  what has been destroyed since the last time. With these patches the
  driver will emit special instructions to do this on behalf of the client
  if it has registered a context, and included that with the batchbuffer.
 
 These patches look pretty solid.  In particular, the API
 (create/destroy/context id in execbuf) looks like just what we want for
 Mesa.  I'll try to get around to testing it out soon (I'm poking at some
 performance stuff currently where this might become relevant soon).

I've just started noticing GPU hangs with Ken's test mesa branch on
nexuiz with vsync, full 13x7 and max effects.  It seems to work fine 
with variations like windowed, lower detail, etc. Although it looks
weird on IVB, I cannot reproduce the hangs there. Also, I'd never seen 
the hangs before this morning, and I'm not sure what has changed. So 
FYI, you may want to start out with IVB (unless you want to help me 
figure out what is broken on SNB :-)

I've not tried very hard, but so far it only seems to occur when doing
context switches, however MI_SET_CONTEXT is nowhere in the error state.

 
 The couple of patches without a comment from me are:
 
 Reviewed-by: Eric Anholt e...@anholt.net
 



[Intel-gfx] [PATCH 00/14] pwrite/pread rework/retuning

2012-02-16 Thread Daniel Vetter
Hi all,

So here we go with the scary patches ;-)

I've been beating on this up and down and we also now have a fairly
nice set of i-g-t tests to check corner-cases of the clflushing and
similar. The mm prefault helper patch is included here for context,
I've submitted that for inclusion through -mm.

Review, comments, flames highly welcome.

Cheers, Daniel

Daniel Vetter (14):
  drm/i915: merge shmem_pwrite slow&fast-path
  drm/i915: merge shmem_pread slow&fast-path
  drm: add helper to clflush a virtual address range
  drm/i915: move clflushing into shmem_pread
  drm/i915: kill ranged cpu read domain support
  drm/i915: don't use gtt_pwrite on LLC cached objects
  drm/i915: don't call shmem_read_mapping unnecessarily
  mm: extend prefault helpers to fault in more than PAGE_SIZE
  drm/i915: drop gtt slowpath
  drm/i915: don't clobber userspace memory before committing to the
pread
  drm/i915: implement inline clflush for pwrite
  drm/i915: fall back to shmem pwrite when the buffer is not accessible
  drm/i915: use uncached writes in pwrite
  drm/i915: extract copy helpers from shmem_pread|pwrite

 drivers/gpu/drm/drm_cache.c |   23 ++
 drivers/gpu/drm/i915/i915_drv.h |7 -
 drivers/gpu/drm/i915/i915_gem.c |  731 ++-
 include/drm/drmP.h  |1 +
 include/linux/pagemap.h |   28 +-
 5 files changed, 304 insertions(+), 486 deletions(-)

-- 
1.7.7.5



[Intel-gfx] [PATCH 01/14] drm/i915: merge shmem_pwrite slow&fast-path

2012-02-16 Thread Daniel Vetter
With the previous rewrite, they've become essentially identical.

v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris
Wilson.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |  126 ++
 1 files changed, 33 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 19a06c2..535630c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -711,84 +711,11 @@ out_unpin_pages:
return ret;
 }
 
-/**
- * This is the fast shmem pwrite path, which attempts to directly
- * copy_from_user into the kmapped pages backing the object.
- */
-static int
-i915_gem_shmem_pwrite_fast(struct drm_device *dev,
-  struct drm_i915_gem_object *obj,
-  struct drm_i915_gem_pwrite *args,
-  struct drm_file *file)
-{
-   struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-   ssize_t remain;
-   loff_t offset;
-   char __user *user_data;
-   int page_offset, page_length;
-
-   user_data = (char __user *) (uintptr_t) args->data_ptr;
-   remain = args->size;
-
-   offset = args->offset;
-   obj->dirty = 1;
-
-   while (remain > 0) {
-   struct page *page;
-   char *vaddr;
-   int ret;
-
-   /* Operation in this page
-*
-* page_offset = offset within page
-* page_length = bytes to copy for this page
-*/
-   page_offset = offset_in_page(offset);
-   page_length = remain;
-   if ((page_offset + remain) > PAGE_SIZE)
-   page_length = PAGE_SIZE - page_offset;
-
-   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-   if (IS_ERR(page))
-   return PTR_ERR(page);
-
-   vaddr = kmap_atomic(page);
-   ret = __copy_from_user_inatomic(vaddr + page_offset,
-   user_data,
-   page_length);
-   kunmap_atomic(vaddr);
-
-   set_page_dirty(page);
-   mark_page_accessed(page);
-   page_cache_release(page);
-
-   /* If we get a fault while copying data, then (presumably) our
-* source page isn't available.  Return the error and we'll
-* retry in the slow path.
-*/
-   if (ret)
-   return -EFAULT;
-
-   remain -= page_length;
-   user_data += page_length;
-   offset += page_length;
-   }
-
-   return 0;
-}
-
-/**
- * This is the fallback shmem pwrite path, which uses get_user_pages to pin
- * the memory and maps it using kmap_atomic for copying.
- *
- * This avoids taking mmap_sem for faulting on the user's address while the
- * struct_mutex is held.
- */
 static int
-i915_gem_shmem_pwrite_slow(struct drm_device *dev,
-  struct drm_i915_gem_object *obj,
-  struct drm_i915_gem_pwrite *args,
-  struct drm_file *file)
+i915_gem_shmem_pwrite(struct drm_device *dev,
+ struct drm_i915_gem_object *obj,
+ struct drm_i915_gem_pwrite *args,
+ struct drm_file *file)
 {
struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
ssize_t remain;
@@ -796,6 +723,7 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
char __user *user_data;
int shmem_page_offset, page_length, ret;
int obj_do_bit17_swizzling, page_do_bit17_swizzling;
+   int hit_slowpath = 0;
 
user_data = (char __user *) (uintptr_t) args->data_ptr;
remain = args->size;
@@ -805,8 +733,6 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
offset = args->offset;
obj->dirty = 1;
 
-   mutex_unlock(&dev->struct_mutex);
-
while (remain > 0) {
struct page *page;
char *vaddr;
@@ -831,6 +757,21 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
page_do_bit17_swizzling = obj_do_bit17_swizzling &&
(page_to_phys(page) & (1 << 17)) != 0;
 
+   if (!page_do_bit17_swizzling) {
+   vaddr = kmap_atomic(page);
+   ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
+   user_data,
+   page_length);
+   kunmap_atomic(vaddr);
+
+   if (ret == 0)
+   goto next_page;
+   }
+
+   hit_slowpath = 1;
+
+   mutex_unlock(&dev->struct_mutex);
+
  

[Intel-gfx] [PATCH 02/14] drm/i915: merge shmem_pread slow&fast-path

2012-02-16 Thread Daniel Vetter
With the previous rewrite, they've become essentially identical.

v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris
Wilson.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |  108 ++-
 1 files changed, 27 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 535630c..faff00b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -259,66 +259,6 @@ static int i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
obj->tiling_mode != I915_TILING_NONE;
 }
 
-/**
- * This is the fast shmem pread path, which attempts to copy_from_user directly
- * from the backing pages of the object to the user's address space.  On a
- * fault, it fails so we can fall back to i915_gem_shmem_pwrite_slow().
- */
-static int
-i915_gem_shmem_pread_fast(struct drm_device *dev,
- struct drm_i915_gem_object *obj,
- struct drm_i915_gem_pread *args,
- struct drm_file *file)
-{
-   struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-   ssize_t remain;
-   loff_t offset;
-   char __user *user_data;
-   int page_offset, page_length;
-
-   user_data = (char __user *) (uintptr_t) args->data_ptr;
-   remain = args->size;
-
-   offset = args->offset;
-
-   while (remain > 0) {
-   struct page *page;
-   char *vaddr;
-   int ret;
-
-   /* Operation in this page
-*
-* page_offset = offset within page
-* page_length = bytes to copy for this page
-*/
-   page_offset = offset_in_page(offset);
-   page_length = remain;
-   if ((page_offset + remain) > PAGE_SIZE)
-   page_length = PAGE_SIZE - page_offset;
-
-   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-   if (IS_ERR(page))
-   return PTR_ERR(page);
-
-   vaddr = kmap_atomic(page);
-   ret = __copy_to_user_inatomic(user_data,
- vaddr + page_offset,
- page_length);
-   kunmap_atomic(vaddr);
-
-   mark_page_accessed(page);
-   page_cache_release(page);
-   if (ret)
-   return -EFAULT;
-
-   remain -= page_length;
-   user_data += page_length;
-   offset += page_length;
-   }
-
-   return 0;
-}
-
 static inline int
 __copy_to_user_swizzled(char __user *cpu_vaddr,
const char *gpu_vaddr, int gpu_offset,
@@ -371,17 +311,11 @@ __copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
return 0;
 }
 
-/**
- * This is the fallback shmem pread path, which allocates temporary storage
- * in kernel space to copy_to_user into outside of the struct_mutex, so we
- * can copy out of the object's backing pages while holding the struct mutex
- * and not take page faults.
- */
 static int
-i915_gem_shmem_pread_slow(struct drm_device *dev,
- struct drm_i915_gem_object *obj,
- struct drm_i915_gem_pread *args,
- struct drm_file *file)
+i915_gem_shmem_pread(struct drm_device *dev,
+struct drm_i915_gem_object *obj,
+struct drm_i915_gem_pread *args,
+struct drm_file *file)
 {
struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
char __user *user_data;
@@ -389,6 +323,7 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
loff_t offset;
int shmem_page_offset, page_length, ret;
int obj_do_bit17_swizzling, page_do_bit17_swizzling;
+   int hit_slowpath = 0;
 
user_data = (char __user *) (uintptr_t) args->data_ptr;
remain = args->size;
@@ -397,8 +332,6 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 
offset = args->offset;
 
-   mutex_unlock(&dev->struct_mutex);
-
while (remain > 0) {
struct page *page;
char *vaddr;
@@ -422,6 +355,20 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
page_do_bit17_swizzling = obj_do_bit17_swizzling &&
(page_to_phys(page) & (1 << 17)) != 0;
 
+   if (!page_do_bit17_swizzling) {
+   vaddr = kmap_atomic(page);
+   ret = __copy_to_user_inatomic(user_data,
+ vaddr + shmem_page_offset,
+ page_length);
+   kunmap_atomic(vaddr);
+   if (ret == 0) 
+   

[Intel-gfx] [PATCH 03/14] drm: add helper to clflush a virtual address range

2012-02-16 Thread Daniel Vetter
Useful when the page is already mapped to copy data in/out.

For -stable because the next patch (fixing phys obj pwrite) needs this
little helper function.

Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/drm_cache.c |   23 +++
 include/drm/drmP.h  |1 +
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 5928653..c7c8f6b 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -98,3 +98,26 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
 #endif
 }
 EXPORT_SYMBOL(drm_clflush_pages);
+
+void
+drm_clflush_virt_range(char *addr, unsigned long length)
+{
+#if defined(CONFIG_X86)
+   if (cpu_has_clflush) {
+   char *end = addr + length;
+   mb();
+   for (; addr < end; addr += boot_cpu_data.x86_clflush_size)
+   clflush(addr);
+   clflush(end - 1);
+   mb();
+   return;
+   }
+
+   if (on_each_cpu(drm_clflush_ipi_handler, NULL, 1) != 0)
+   printk(KERN_ERR "Timed out waiting for cache flush.\n");
+#else
+   printk(KERN_ERR "Architecture has no drm_cache.c support\n");
+   WARN_ON_ONCE(1);
+#endif
+}
+EXPORT_SYMBOL(drm_clflush_virt_range);
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 92f0981..d33597b 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1332,6 +1332,7 @@ extern int drm_remove_magic(struct drm_master *master, drm_magic_t magic);
 
 /* Cache management (drm_cache.c) */
 void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
+void drm_clflush_virt_range(char *addr, unsigned long length);
 
/* Locking IOCTL support (drm_lock.h) */
 extern int drm_lock(struct drm_device *dev, void *data,
-- 
1.7.7.5



[Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support

2012-02-16 Thread Daniel Vetter
No longer needed.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_drv.h |7 --
 drivers/gpu/drm/i915/i915_gem.c |  117 ---
 2 files changed, 0 insertions(+), 124 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b839728..46f9382 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -911,13 +911,6 @@ struct drm_i915_gem_object {
/** Record of address bit 17 of each page at last unbind. */
unsigned long *bit_17;
 
-
-   /**
-* If present, while GEM_DOMAIN_CPU is in the read domain this array
-* flags which individual pages are valid.
-*/
-   uint8_t *page_cpu_valid;
-
/** User space pin count and filp owning the pin */
uint32_t user_pin_count;
struct drm_file *pin_filp;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c9a8098..d23d324 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,10 +41,6 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
 static __must_check int i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj,
							  bool write);
-static __must_check int i915_gem_object_set_cpu_read_domain_range(struct drm_i915_gem_object *obj,
-								  uint64_t offset,
-								  uint64_t size);
-static void i915_gem_object_set_to_full_cpu_read_domain(struct drm_i915_gem_object *obj);
 static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
						    unsigned alignment,
						    bool map_and_fenceable);
@@ -3000,11 +2996,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 
i915_gem_object_flush_gtt_write_domain(obj);
 
-   /* If we have a partially-valid cache of the object in the CPU,
-* finish invalidating it and free the per-page flags.
-*/
-   i915_gem_object_set_to_full_cpu_read_domain(obj);
-
old_write_domain = obj->base.write_domain;
old_read_domains = obj->base.read_domains;
 
@@ -3035,113 +3026,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
return 0;
 }
 
-/**
- * Moves the object from a partially CPU read to a full one.
- *
- * Note that this only resolves i915_gem_object_set_cpu_read_domain_range(),
- * and doesn't handle transitioning from !(read_domains & I915_GEM_DOMAIN_CPU).
- */
-static void
-i915_gem_object_set_to_full_cpu_read_domain(struct drm_i915_gem_object *obj)
-{
-   if (!obj->page_cpu_valid)
-   return;
-
-   /* If we're partially in the CPU read domain, finish moving it in.
-*/
-   if (obj->base.read_domains & I915_GEM_DOMAIN_CPU) {
-   int i;
-
-   for (i = 0; i <= (obj->base.size - 1) / PAGE_SIZE; i++) {
-   if (obj->page_cpu_valid[i])
-   continue;
-   drm_clflush_pages(obj->pages + i, 1);
-   }
-   }
-
-   /* Free the page_cpu_valid mappings which are now stale, whether
-* or not we've got I915_GEM_DOMAIN_CPU.
-*/
-   kfree(obj->page_cpu_valid);
-   obj->page_cpu_valid = NULL;
-}
-
-/**
- * Set the CPU read domain on a range of the object.
- *
- * The object ends up with I915_GEM_DOMAIN_CPU in its read flags although it's
- * not entirely valid.  The page_cpu_valid member of the object flags which
- * pages have been flushed, and will be respected by
- * i915_gem_object_set_to_cpu_domain() if it's called on to get a valid mapping
- * of the whole object.
- *
- * This function returns when the move is complete, including waiting on
- * flushes to occur.
- */
-static int
-i915_gem_object_set_cpu_read_domain_range(struct drm_i915_gem_object *obj,
- uint64_t offset, uint64_t size)
-{
-   uint32_t old_read_domains;
-   int i, ret;
-
-   if (offset == 0 && size == obj->base.size)
-   return i915_gem_object_set_to_cpu_domain(obj, 0);
-
-   ret = i915_gem_object_flush_gpu_write_domain(obj);
-   if (ret)
-   return ret;
-
-   ret = i915_gem_object_wait_rendering(obj);
-   if (ret)
-   return ret;
-
-   i915_gem_object_flush_gtt_write_domain(obj);
-
-   /* If we're already fully in the CPU read domain, we're done. */
-   if (obj->page_cpu_valid == NULL &&
-   (obj->base.read_domains & I915_GEM_DOMAIN_CPU) != 0)
-   return 0;
-
-   /* Otherwise, create/clear the per-page CPU read 

[Intel-gfx] [PATCH 06/14] drm/i915: don't use gtt_pwrite on LLC cached objects

2012-02-16 Thread Daniel Vetter
~120 µs instead of ~210 µs to write 1mb on my snb. I like this.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d23d324..0446c4c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -828,6 +828,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
}
 
if (obj->gtt_space &&
+   obj->cache_level == I915_CACHE_NONE &&
obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
ret = i915_gem_object_pin(obj, 0, true);
if (ret)
-- 
1.7.7.5



[Intel-gfx] [PATCH 07/14] drm/i915: don't call shmem_read_mapping unnecessarily

2012-02-16 Thread Daniel Vetter
This speeds up pwrite and pread from ~120 µs to ~100 µs for
reading/writing 1mb on my snb (if the backing storage pages
are already pinned, of course).

v2: Chris Wilson pointed out a glaring page reference bug - I've
unconditionally dropped the reference. With that fixed (and the
associated reduction of dirt in dmesg) it's now even a notch faster.

v3: Unconditionally grab a page reference when dropping
dev->struct_mutex to simplify the code-flow.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |   42 +++---
 1 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0446c4c..0487889 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -321,6 +321,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
int obj_do_bit17_swizzling, page_do_bit17_swizzling;
int hit_slowpath = 0;
int needs_clflush = 0;
+   int release_page;
 
user_data = (char __user *) (uintptr_t) args-data_ptr;
remain = args-size;
@@ -355,10 +356,16 @@ i915_gem_shmem_pread(struct drm_device *dev,
if ((shmem_page_offset + page_length) > PAGE_SIZE)
page_length = PAGE_SIZE - shmem_page_offset;
 
-   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-   if (IS_ERR(page)) {
-   ret = PTR_ERR(page);
-   goto out;
+   if (obj->pages) {
+   page = obj->pages[offset >> PAGE_SHIFT];
+   release_page = 0;
+   } else {
+   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
+   if (IS_ERR(page)) {
+   ret = PTR_ERR(page);
+   goto out;
+   }
+   release_page = 1;
}
 
page_do_bit17_swizzling = obj_do_bit17_swizzling &&
@@ -378,7 +385,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
}
 
hit_slowpath = 1;
-
+   page_cache_get(page);
mutex_unlock(&dev->struct_mutex);
 
vaddr = kmap(page);
@@ -397,9 +404,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
kunmap(page);
 
mutex_lock(&dev->struct_mutex);
+   page_cache_release(page);
 next_page:
mark_page_accessed(page);
-   page_cache_release(page);
+   if (release_page)
+   page_cache_release(page);
 
if (ret) {
ret = -EFAULT;
@@ -680,6 +689,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
int shmem_page_offset, page_length, ret;
int obj_do_bit17_swizzling, page_do_bit17_swizzling;
int hit_slowpath = 0;
+   int release_page;
 
user_data = (char __user *) (uintptr_t) args-data_ptr;
remain = args-size;
@@ -704,10 +714,16 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
if ((shmem_page_offset + page_length) > PAGE_SIZE)
page_length = PAGE_SIZE - shmem_page_offset;
 
-   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-   if (IS_ERR(page)) {
-   ret = PTR_ERR(page);
-   goto out;
+   if (obj->pages) {
+   page = obj->pages[offset >> PAGE_SHIFT];
+   release_page = 0;
+   } else {
+   page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
+   if (IS_ERR(page)) {
+   ret = PTR_ERR(page);
+   goto out;
+   }
+   release_page = 1;
}
 
page_do_bit17_swizzling = obj_do_bit17_swizzling &&
@@ -725,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
}
 
hit_slowpath = 1;
-
+   page_cache_get(page);
mutex_unlock(dev->struct_mutex);
 
vaddr = kmap(page);
@@ -740,10 +756,12 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
kunmap(page);
 
mutex_lock(dev->struct_mutex);
+   page_cache_release(page);
 next_page:
set_page_dirty(page);
mark_page_accessed(page);
-   page_cache_release(page);
+   if (release_page)
+   page_cache_release(page);
 
if (ret) {
ret = -EFAULT;
-- 
1.7.7.5

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 13/14] drm/i915: use uncached writes in pwrite

2012-02-16 Thread Daniel Vetter
It's around 20% faster.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 48bef0b..9f49421 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -659,9 +659,9 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
if (partial_cacheline_write)
drm_clflush_virt_range(vaddr + shmem_page_offset,
   page_length);
-   ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
-   user_data,
-   page_length);
+   ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
+   user_data,
+   page_length);
if (needs_clflush)
drm_clflush_virt_range(vaddr + shmem_page_offset,
   page_length);
-- 
1.7.7.5



[Intel-gfx] [PATCH 14/14] drm/i915: extract copy helpers from shmem_pread|pwrite

2012-02-16 Thread Daniel Vetter
While moving things around, these two functions slowly grew out of any
sane bounds. So extract a few helpers that do the copying and
clflushing. Also add a few comments to explain what's going on.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |  192 +++
 1 files changed, 132 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9f49421..0328cb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -307,6 +307,60 @@ __copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
return 0;
 }
 
+/* Per-page copy function for the shmem pread fastpath.
+ * Flushes invalid cachelines before reading the target if
+ * needs_clflush is set. */
+static int
+shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
+char __user *user_data,
+bool page_do_bit17_swizzling, bool needs_clflush)
+{
+   char *vaddr;
+   int ret;
+
+   if (page_do_bit17_swizzling)
+   return -EINVAL;
+
+   vaddr = kmap_atomic(page);
+   if (needs_clflush)
+   drm_clflush_virt_range(vaddr + shmem_page_offset,
+  page_length);
+   ret = __copy_to_user_inatomic(user_data,
+ vaddr + shmem_page_offset,
+ page_length);
+   kunmap_atomic(vaddr);
+
+   return ret;
+}
+
+/* Only difference to the fast-path function is that this can handle bit17
+ * and uses non-atomic copy and kmap functions. */
+static int
+shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
+char __user *user_data,
+bool page_do_bit17_swizzling, bool needs_clflush)
+{
+   char *vaddr;
+   int ret;
+
+   vaddr = kmap(page);
+   if (needs_clflush)
+   drm_clflush_virt_range(vaddr + shmem_page_offset,
+  page_length);
+
+   if (page_do_bit17_swizzling)
+   ret = __copy_to_user_swizzled(user_data,
+ vaddr, shmem_page_offset,
+ page_length);
+   else
+   ret = __copy_to_user(user_data,
+vaddr + shmem_page_offset,
+page_length);
+   kunmap(page);
+
+   return ret;
+}
+
 static int
 i915_gem_shmem_pread(struct drm_device *dev,
 struct drm_i915_gem_object *obj,
@@ -345,7 +399,6 @@ i915_gem_shmem_pread(struct drm_device *dev,
 
while (remain > 0) {
struct page *page;
-   char *vaddr;
 
/* Operation in this page
 *
@@ -372,18 +425,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
page_do_bit17_swizzling = obj_do_bit17_swizzling &&
(page_to_phys(page) & (1 << 17)) != 0;
 
-   if (!page_do_bit17_swizzling) {
-   vaddr = kmap_atomic(page);
-   if (needs_clflush)
-   drm_clflush_virt_range(vaddr + shmem_page_offset,
-  page_length);
-   ret = __copy_to_user_inatomic(user_data,
- vaddr + shmem_page_offset,
- page_length);
-   kunmap_atomic(vaddr);
-   if (ret == 0) 
-   goto next_page;
-   }
+   ret = shmem_pread_fast(page, shmem_page_offset, page_length,
+  user_data, page_do_bit17_swizzling,
+  needs_clflush);
+   if (ret == 0)
+   goto next_page;
 
hit_slowpath = 1;
page_cache_get(page);
@@ -399,20 +445,9 @@ i915_gem_shmem_pread(struct drm_device *dev,
prefaulted = 1;
}
 
-   vaddr = kmap(page);
-   if (needs_clflush)
-   drm_clflush_virt_range(vaddr + shmem_page_offset,
-  page_length);
-
-   if (page_do_bit17_swizzling)
-   ret = __copy_to_user_swizzled(user_data,
- vaddr, shmem_page_offset,
- page_length);
-   else
-   ret = __copy_to_user(user_data,
-vaddr + shmem_page_offset,
-page_length);
-   kunmap(page);
+   ret = shmem_pread_slow(page, shmem_page_offset, page_length,
+   

Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support

2012-02-16 Thread Ben Widawsky
On Thu, 16 Feb 2012 13:04:14 +0100
Ben Widawsky b...@bwidawsk.net wrote:

 On Wed, 15 Feb 2012 12:33:38 -0800
 Eric Anholt e...@anholt.net wrote:
 
  On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net wrote:
   These patches are a heavily revised version of the patches I wrote over
   a year ago. These patches have passed basic tests on SNB, and IVB, and
   older versions worked on ILK.  In theory, context support should work
   all the way back to Gen4, but I haven't tested it. Also, since I suspect
   ILK may be unstable, the code has it disabled for now.
   
   HW contexts provide a way for the GPU to save and restore certain state
   in between batchbuffer boundaries. Typically, GPU clients must re-emit
   the entire state every time they run because the client does not know
   what has been destroyed since the last time. With these patches the
   driver will emit special instructions to do this on behalf of the client
   if it has registered a context, and included that with the batchbuffer.
  
  These patches look pretty solid.  In particular, the API
  (create/destroy/context id in execbuf) looks like just what we want for
  Mesa.  I'll try to get around to testing it out soon (I'm poking at some
  performance stuff currently where this might become relevant soon).
 
 I've just started noticing GPU hangs with Ken's test mesa branch on
 nexuiz with vsync, full 13x7 and max effects.  It seems to work fine 
 with variations like windowed, lower detail, etc. Although it looks
 weird on IVB, I cannot reproduce the hangs there. Also, I'd never seen 
 the hangs before this morning, and I'm not sure what has changed. So 
 FYI, you may want to start out with IVB (unless you want to help me 
 figure out what is broken on SNB :-)
 
 I've not tried very hard, but so far it only seems to occur when doing
 context switches, however MI_SET_CONTEXT is nowhere in the error state.

Seems like sandybridge + fullscreen nexuiz is the exact fail combo.

 
  
  The couple of patches without a comment from me are:
  
  Reviewed-by: Eric Anholt e...@anholt.net
  
 



Re: [Intel-gfx] i915 SNB: hotplug events when charging causing poor interactivity

2012-02-16 Thread Daniel Vetter
On Thu, Feb 16, 2012 at 11:44:46AM +, Daniel J Blueman wrote:
 When charging Dell E5420 laptops, I see the Sandy Bridge South Display
 Engine port C hotplug interrupt fire consistently at around 20Hz. Each
 scan on the connectors results in ~100ms hold time for the mode_config
 mutex, which blocks e.g. the cursor set ioctl (due to the I2C read
 timeouts across the connectors), resulting in terrible GUI
 interactivity.

We know about this locking issue - it's much worse on platforms where we
need to do load-detect hotplug detection or if for whatever reasons it
takes ages to grab the edid from your screen. The Great Plan (tm) is to
add a per-crtc mutex so that cursor updates and pageflips can continue
while someone else is holding the config_mutex to do background stuff like
hotplug handling. Unfortunately there's tons of other important stuff on
my todo :(

 Notably, when only using battery or only using AC, the port C
 interrupts don't fire. It feels like the platform is using South
 Display Port C for GPIO/I2C; setting the port C pulse duration from
 2ms to 100ms doesn't change the behaviour. I'll dump off the GPIO
 settings, but what else is good to debug this?

Yep, dumping the register state (intel_reg_dumper from intel-gpu-tools is
handy for that) in the different situations sounds useful.

 Also, have you come across this kind of pattern before, eg a platform
 using these GPIO ports for the something (in this case, feels like the
 EC/battery)? Judging by the lack of a quirks, I'd say not. Perhaps
 also I should dump off the SDE port C IIR mask register before we
 reset it, in case the BIOS intentionally masks out port C hotplug
 events.

If this is indeed the bios we need to quirk this away. But I think we
should check first whether we don't butcher something else accidentally.
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support

2012-02-16 Thread Ben Widawsky
On Thu, 16 Feb 2012 13:21:42 +0100
Ben Widawsky b...@bwidawsk.net wrote:

 On Thu, 16 Feb 2012 13:04:14 +0100
 Ben Widawsky b...@bwidawsk.net wrote:
 
  On Wed, 15 Feb 2012 12:33:38 -0800
  Eric Anholt e...@anholt.net wrote:
  
   On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net 
   wrote:
These patches are a heavily revised version of the patches I wrote over
a year ago. These patches have passed basic tests on SNB, and IVB, and
older versions worked on ILK.  In theory, context support should work
all the way back to Gen4, but I haven't tested it. Also, since I suspect
ILK may be unstable, the code has it disabled for now.

HW contexts provide a way for the GPU to save and restore certain state
in between batchbuffer boundaries. Typically, GPU clients must re-emit
the entire state every time they run because the client does not know
what has been destroyed since the last time. With these patches the
driver will emit special instructions to do this on behalf of the client
if it has registered a context, and included that with the batchbuffer.
   
   These patches look pretty solid.  In particular, the API
   (create/destroy/context id in execbuf) looks like just what we want for
   Mesa.  I'll try to get around to testing it out soon (I'm poking at some
   performance stuff currently where this might become relevant soon).
  
  I've just started noticing GPU hangs with Ken's test mesa branch on
  nexuiz with vsync, full 13x7 and max effects.  It seems to work fine 
  with variations like windowed, lower detail, etc. Although it looks
  weird on IVB, I cannot reproduce the hangs there. Also, I'd never seen 
  the hangs before this morning, and I'm not sure what has changed. So 
  FYI, you may want to start out with IVB (unless you want to help me 
  figure out what is broken on SNB :-)
  
  I've not tried very hard, but so far it only seems to occur when doing
  context switches, however MI_SET_CONTEXT is nowhere in the error state.
 
 Seems like sandybridge + fullscreen nexuiz is the exact fail combo.

So here was part of Ken's patch
-   brw->state.dirty.brw |= BRW_NEW_CONTEXT | BRW_NEW_BATCH;
+   if (intel->hw_ctx == NULL)
+  brw->state.dirty.brw |= BRW_NEW_CONTEXT;
+
+   brw->state.dirty.brw |= BRW_NEW_BATCH;


If I go back to normal and use contexts, but also set BRW_NEW_CONTEXT, I
don't get hangs. So it sounds like some part of the state isn't being
saved or restored properly. Still trying to figure out what changed to
make it break.

As a side note, IVB has weird artifacts which I thought were just
because I was using an experimental mesa, but now think it's likely the
same cause. Maybe you can figure out what isn't being saved or restored
based on seeing IVB?



Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-16 Thread Daniel Vetter
On Thu, Feb 16, 2012 at 09:32:08PM +0800, Hillf Danton wrote:
 On Thu, Feb 16, 2012 at 8:01 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
  @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
          * Writing zeroes into userspace here is OK, because we know that if
          * the zero gets there, we'll be overwriting it.
          */
  -       ret = __put_user(0, uaddr);
  +       while (uaddr <= end) {
  +               ret = __put_user(0, uaddr);
  +               if (ret != 0)
  +                       return ret;
  +               uaddr += PAGE_SIZE;
  +       }
 
 What if
  uaddr & ~PAGE_MASK == PAGE_SIZE - 3 &&
 end & ~PAGE_MASK == 2

I don't quite follow - can you elaborate upon which issue you're seeing?
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


[Intel-gfx] updated -next

2012-02-16 Thread Daniel Vetter
Hi all,

Updated -next and -testing trees. I haven't merged in any of the patches
Jesse queued up because he hasn't yet pushed out his latest -fixes tree.
No cookies for Jesse today!

Highlights:
- interlaced support for i915. Again thanks a lot to all the ppl who help
  out with testing, patches and doc-crawling.
- aliasing ppgtt support for snb/ivb. Because ppgtt ptes are
  gpu-cacheable, this can also speed things up a bit.
- swizzling support for snb/ivb, again a slight perf improvements on some
  things.
- more error_state work - we're slowly reaching a level of paranoia
  suitable for dealing with gpus.
- outstanding_lazy_request fix and the autoreport patches from Chris: I'm
  pretty hopeful that these two squash a lot of the semaphores=1 issues
  we've seen on snb, please retest if you've had issues.
- the usual pile of minor patches, one noteworthy one is to use the lvds
  presence ping on pch_split chips. I expect a few new quirks due to this
  ...

Go forth and test!

Cheers, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support

2012-02-16 Thread Daniel Vetter
On Thu, Feb 16, 2012 at 08:48:07AM -0800, Eric Anholt wrote:
 On Thu, 16 Feb 2012 13:11:31 +0100, Daniel Vetter daniel.vet...@ffwll.ch 
 wrote:
  No longer needed.
 
 What this code was for: Before gtt mapping, we were doing software
 fallbacks in Mesa with pread/write on pages at a time (or worse) of the
 framebuffer.  It would regularly result in hitting the same page again,
 since I was only caching the last page I'd pulled out, instead of
 keeping a whole copy of the framebuffer during the fallback.

Urgh, that's not really efficient ;-) I think for s/w fallbacks and
readbacks we can presume decent damage tracking (like sna does) on the
userspace sides.

 Since we've been doing gtt mapping for years at this point, I'm happy to
 see the code die.
 
 I'm not sure about the rest of the code.  In particular, for the code
 that's switching between gtt and cpu mappings to handle a read/write,
 I'm concerned about whether the behavior matches for tiled objects.  I
 haven't reviewed enough to be sure.

Behaviour should match old code if you read/write entire pages (we should
have decent set of tests for that). If you do non-cacheline-aligned
reads/writes on tiled objects, we might hit an issue, but they should be
solvable (I've simply been too lazy to write testcases for this).
Non-cacheline-aligned reads/writes on untiled objects also work; I've created a
set of tests to exercise issues there (and tested the tests by omitting some of
the clflushes, i.e. all the clflushes now left _are_ required).

If old mesa depends on sub-page reads/writes to tiled objects I need to
create the respective tests and double-check the code, otherwise I think
we're covered. Do you want me to adapt the tests to check correctness for
sub-page reads/writes to tiled objects?

Thanks, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support

2012-02-16 Thread Chris Wilson
On Thu, 16 Feb 2012 18:38:08 +0100, Daniel Vetter dan...@ffwll.ch wrote:
 If old mesa depends on sub-page reads/writes to tiled objects I need to
 create the respective tests and double-check the code, otherwise I think
 we're covered. Do you want me to adapt the tests to check correctness for
 sub-page reads/writes to tiled objects?

I think that you've cast enough doubt in our minds that we would like to
see the corresponding sub-page tests. ;-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] linux driver support GVA3650 ?

2012-02-16 Thread Jin, Gordon
 -Original Message-
 From: intel-gfx-bounces+gordon.jin=intel@lists.freedesktop.org
 [mailto:intel-gfx-bounces+gordon.jin=intel@lists.freedesktop.org] On
 Behalf Of Daniel Vetter
 Sent: Wednesday, February 15, 2012 8:54 PM
 To: Kevin Kuei
 Cc: intel-gfx@lists.freedesktop.org
 Subject: Re: [Intel-gfx] linux driver support GVA3650 ?
 
 On Wed, Feb 15, 2012 at 05:26:51PM +0800, Kevin Kuei wrote:
  Hello all,
 
  We are planing to use Intel D2700 CPU (Cedar Trail) as the solution of our
  product, but I can NOT find the linux graphic driver.
  I searched it in:
  http://intellinuxgraphics.org/documentation.html
  and the mailling list archieve.

It's not an Intel GPU and is not supported by the project here (including the
website and mailing list).

  Is there anyone can kindly tell me where can I get the driver? or is Intel
  have the plan to develop the linux driver??

Cedar Trail driver was just released last week, into MeeGo 1.2 for Netbook: 
https://meego.com/downloads/releases/1.2/meego-v1.2-netbooks

It consists of a kernel driver and a (closed-source) user space driver.

 cedar trail contains a pvr chip as the gpu core and is very much not
 supported by the open source intel linux graphics team. You're on your
 own, essentially :(

