[Intel-gfx] [PATCH] drm/i915: error_buffer->ring should be signed
gcc seems to get uber-anal recently about these things.

Reported-by: Dan Carpenter dan.carpen...@oracle.com
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b839728..35833fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -199,7 +199,7 @@ struct drm_i915_error_state {
 			u32 tiling:2;
 			u32 dirty:1;
 			u32 purgeable:1;
-			u32 ring:4;
+			s32 ring:4;
 			u32 cache_level:2;
 		} *active_bo, *pinned_bo;
 		u32 active_bo_count, pinned_bo_count;
--
1.7.9
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: error_buffer->ring should be signed
On Thursday, 2012-02-16 at 11:03 +0100, Daniel Vetter wrote:

> gcc seems to get uber-anal recently about these things.

… which was introduced by the following commit.

96154f2faba5: drm/i915: switch ring->id to be a real id

> Reported-by: Dan Carpenter dan.carpen...@oracle.com

The URL of the report is the following.

http://lists.freedesktop.org/archives/dri-devel/2012-February/019183.html

> Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch

Acked-by: Paul Menzel paulepan...@users.sourceforge.net

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b839728..35833fc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -199,7 +199,7 @@ struct drm_i915_error_state {
>  			u32 tiling:2;
>  			u32 dirty:1;
>  			u32 purgeable:1;
> -			u32 ring:4;
> +			s32 ring:4;
>  			u32 cache_level:2;
>  		} *active_bo, *pinned_bo;
>  		u32 active_bo_count, pinned_bo_count;

Thanks,

Paul

signature.asc
Description: This is a digitally signed message part
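The failure mode behind this one-character fix can be demonstrated in a self-contained sketch. The struct names below are hypothetical stand-ins, not the i915 ones; only the signedness of the 4-bit `ring` member matters. The error-state code stores -1 in `ring` as a "no ring" sentinel, and in an unsigned 4-bit field that -1 reads back as 15, so a later `!= -1` check never matches:

```c
#include <assert.h>

/* Hypothetical stand-ins for the error-state bitfield; only the
 * signedness of the 4-bit `ring` member matters here. */
struct err_bo_unsigned { unsigned int ring:4; };
struct err_bo_signed   { signed int   ring:4; };

/* With `u32 ring:4`, assigning -1 stores the value modulo 16,
 * so it reads back as 15 rather than -1. */
int unsigned_ring_sentinel(void)
{
	struct err_bo_unsigned e;
	e.ring = -1;
	return e.ring;
}

/* With `s32 ring:4`, the -1 sentinel survives the round-trip
 * (on the usual two's-complement targets gcc supports). */
int signed_ring_sentinel(void)
{
	struct err_bo_signed e;
	e.ring = -1;
	return e.ring;
}
```

Plain `int ring:4` would not be enough, since the signedness of a plain-`int` bitfield is implementation-defined; the patch's explicit `s32` avoids that ambiguity.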
Re: [Intel-gfx] [PATCH 06/11] drm/i915/context: ringbuffer context switch code
On Wed, 15 Feb 2012 12:11:58 -0800, Eric Anholt e...@anholt.net wrote:
> On Tue, 14 Feb 2012 22:09:13 +0100, Ben Widawsky b...@bwidawsk.net wrote:
> > This is the HW dependent context switch code.
> >
> > Signed-off-by: Ben Widawsky b...@bwidawsk.net
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         |   3 +
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 117 +++
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |   6 ++-
> >  3 files changed, 125 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 34e6f4f..4175929 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -965,6 +965,9 @@ struct drm_i915_gem_context {
> >  	bool is_initialized;
> >  };
> >
> > +#define I915_CONTEXT_NORMAL_SWITCH	(1 << 0)
> > +#define I915_CONTEXT_SAVE_ONLY		(1 << 1)
> > +#define I915_CONTEXT_FORCED_SWITCH	(1 << 2)
> >
> >  #define INTEL_INFO(dev)	(((struct drm_i915_private *) (dev)->dev_private)->info)
> >
> >  #define IS_I830(dev)		((dev)->pci_device == 0x3577)
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index e71e7fc..dcdc80e 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -942,6 +942,122 @@ render_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> >  	return 0;
> >  }
> >
> > +static int do_ring_switch(struct intel_ring_buffer *ring,
> > +			  struct drm_i915_gem_context *new_context,
> > +			  u32 hw_flags)
>
> Can we call this function do_mi_set_context()? It doesn't look like it
> has to do with switching rings.

Sure.

> > +{
> > +	struct drm_device *dev = ring->dev;
> > +	int ret = 0;
> > +
> > +	if (!new_context->is_initialized) {
> > +		ret = ring->flush(ring, 0, 0);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = intel_ring_begin(ring, 2);
> > +		if (ret)
> > +			return ret;
> > +
> > +		intel_ring_emit(ring, MI_NOOP | (1 << 22) | new_context->id);
> > +		intel_ring_emit(ring, MI_NOOP);
> > +		intel_ring_advance(ring);
> > +	}
>
> Not sure what this block is doing, nor have the docs enlightened me.
> Comment?

This incantation came from a document which I can no longer find.
I'll try to remove it and see if anything breaks. It's likely this was
from an old ILK doc if you're curious enough to look (mine doesn't
appear old enough).

> > +	if (IS_GEN6(dev) && new_context->is_initialized &&
> > +	    ring->itlb_before_ctx_switch) {
> > +		/* w/a: If Flush TLB Invalidation Mode is enabled, driver must
> > +		 * do a TLB invalidation prior to MI_SET_CONTEXT
> > +		 */
> > +		gen6_render_ring_flush(ring, 0, 0);
> > +	}
> > +
> > +	ret = intel_ring_begin(ring, 6);
> > +	if (ret)
> > +		return ret;
> > +
> > +	intel_ring_emit(ring, MI_NOOP);
> > +
> > +	switch (INTEL_INFO(dev)->gen) {
> > +	case 5:
> > +		intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
> > +		break;
> > +	case 6:
> > +		intel_ring_emit(ring, MI_NOOP);
> > +		break;
> > +	case 7:
> > +		intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_DISABLE);
> > +		break;
> > +	case 4:
> > +	default:
> > +		BUG();
> > +	}
>
> I can't see what this MI_ARB_ON_OFF is about. We don't use run lists,
> so preemption can't occur, right? And if it's needed on gen7, why isn't
> it needed on the previous chipsets, where the command apparently exists
> as well?

Just following the docs to the letter on the ARB_ON_OFF thing. We may
need the TLB flush on gen7 as well as gen6. I chose the BSpec workaround
list as the ultimate guide. As best I can tell, the logic is correct
according to that.

> Also, MI_SUSPEND_FLUSH? (also exists on all chipsets) It blocks MMIO
> sync flush or any flushes related to VT-d while enabled. We don't use
> sync flushes, and presumably if we do VT-d related flushes, we still
> want them to work, right? Why do hardware render contexts change that?

I can't find this one either anymore. This I very clearly recall as a
required workaround for ILK. Since I'm not exposing this currently for
less than GEN6 this is a don't care. My current ILK docs are not the
same as the ones I used when I wrote this. If you look at
ironlake_enable_rc6() you can also see this workaround used.
> > +	intel_ring_emit(ring, MI_SET_CONTEXT);
> > +	intel_ring_emit(ring, new_context->obj->gtt_offset |
> > +			MI_MM_SPACE_GTT |
> > +			MI_SAVE_EXT_STATE_EN |
> > +			MI_RESTORE_EXT_STATE_EN |
> > +			hw_flags);
> > +static struct drm_i915_gem_context *
> > +render_ring_context_switch(struct intel_ring_buffer *ring,
> > +			   struct drm_i915_gem_context *new_context,
> > +			   u32 flags)
> > +{
> > +	struct drm_device *dev = ring->dev;
> > +	bool force = (flags
[Intel-gfx] i915 SNB: hotplug events when charging causing poor interactivity
Keith, Eric,

When charging Dell E5420 laptops, I see the Sandy Bridge South Display
Engine port C hotplug interrupt fire consistently at around 20Hz. Each
scan of the connectors results in ~100ms hold time for the mode_config
mutex, which blocks eg the cursor set ioctl (due to the I2C read
timeouts across the connectors), resulting in terrible GUI
interactivity.

Notably, when only using battery or only using AC, the port C
interrupts don't fire.

It feels like the platform is using South Display Port C for GPIO/I2C;
setting the port C pulse duration from 2ms to 100ms doesn't change the
behaviour. I'll dump off the GPIO settings, but what else is good to
debug this?

Also, have you come across this kind of pattern before, eg a platform
using these GPIO ports for something (in this case, it feels like the
EC/battery)? Judging by the lack of quirks, I'd say not.

Perhaps also I should dump off the SDE port C IIR mask register before
we reset it, in case the BIOS intentionally masks out port C hotplug
events.

Thanks,
Daniel
--
Daniel J Blueman
[Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
drm/i915 wants to read/write more than one page in its fastpath and
hence needs to prefault more than PAGE_SIZE bytes.

I've checked the callsites and they all already clamp size when calling
fault_in_pages_* to the same as for the subsequent __copy_to|from_user
and hence don't rely on the implicit clamping to PAGE_SIZE.

Also kill a copypasted spurious space in both functions while at it.

Cc: linux...@kvack.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 include/linux/pagemap.h | 28 ++--
 1 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index cfaaa69..689527d 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter);
 static inline int fault_in_pages_writeable(char __user *uaddr, int size)
 {
 	int ret;
+	char __user *end = uaddr + size - 1;
 
 	if (unlikely(size == 0))
 		return 0;
@@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
 	 * Writing zeroes into userspace here is OK, because we know that if
 	 * the zero gets there, we'll be overwriting it.
 	 */
-	ret = __put_user(0, uaddr);
+	while (uaddr <= end) {
+		ret = __put_user(0, uaddr);
+		if (ret != 0)
+			return ret;
+		uaddr += PAGE_SIZE;
+	}
 	if (ret == 0) {
-		char __user *end = uaddr + size - 1;
-
 		/*
 		 * If the page was already mapped, this will get a cache miss
 		 * for sure, so try to avoid doing it.
 		 */
-		if (((unsigned long)uaddr & PAGE_MASK) !=
+		if (((unsigned long)uaddr & PAGE_MASK) ==
 				((unsigned long)end & PAGE_MASK))
-			ret = __put_user(0, end);
+			ret = __put_user(0, end);
 	}
 	return ret;
 }
@@ -435,17 +439,21 @@ static inline int fault_in_pages_readable(const char __user *uaddr, int size)
 {
 	volatile char c;
 	int ret;
+	const char __user *end = uaddr + size - 1;
 
 	if (unlikely(size == 0))
 		return 0;
 
-	ret = __get_user(c, uaddr);
+	while (uaddr <= end) {
+		ret = __get_user(c, uaddr);
+		if (ret != 0)
+			return ret;
+		uaddr += PAGE_SIZE;
+	}
 	if (ret == 0) {
-		const char __user *end = uaddr + size - 1;
-
-		if (((unsigned long)uaddr & PAGE_MASK) !=
+		if (((unsigned long)uaddr & PAGE_MASK) ==
 		    ((unsigned long)end & PAGE_MASK)) {
-			ret = __get_user(c, end);
+			ret = __get_user(c, end);
 			(void)c;
 		}
 	}
--
1.7.7.5
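The address arithmetic of the extended prefault loop can be checked with a userspace model. This is not kernel code: `MODEL_PAGE_SIZE` and the probe counting are local stand-ins, and no real faulting happens; it only mirrors the patched loop's structure (one probe per page step, plus a probe of the last byte when it landed on a page the loop stepped over):

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))

/* Model of the patched fault_in_pages_readable(): one probe per
 * MODEL_PAGE_SIZE step from uaddr through the last byte, then a probe
 * of end itself when the loop's final increment landed in the same
 * page as end, i.e. the tail page was not visited by the loop.
 * Returns how many __get_user()-style probes would be issued. */
unsigned long count_prefault_probes(unsigned long uaddr, unsigned long size)
{
	unsigned long end = uaddr + size - 1;
	unsigned long probes = 0;

	if (size == 0)
		return 0;

	while (uaddr <= end) {
		probes++;
		uaddr += MODEL_PAGE_SIZE;
	}
	if ((uaddr & MODEL_PAGE_MASK) == (end & MODEL_PAGE_MASK))
		probes++;	/* tail byte sits on an unvisited page */
	return probes;
}
```

For example, a 200-byte range starting 96 bytes before a page boundary spans two pages and gets two probes, even though the loop itself only iterates once.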
Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support
On Wed, 15 Feb 2012 12:33:38 -0800, Eric Anholt e...@anholt.net wrote:
> On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net wrote:
> > These patches are a heavily revised version of the patches I wrote
> > over a year ago. These patches have passed basic tests on SNB and
> > IVB, and older versions worked on ILK. In theory, context support
> > should work all the way back to Gen4, but I haven't tested it. Also,
> > since I suspect ILK may be unstable, the code has it disabled for
> > now.
> >
> > HW contexts provide a way for the GPU to save and restore certain
> > state in between batchbuffer boundaries. Typically, GPU clients must
> > re-emit the entire state every time they run because the client does
> > not know what has been destroyed since the last time. With these
> > patches the driver will emit special instructions to do this on
> > behalf of the client if it has registered a context, and included
> > that with the batchbuffer.
>
> These patches look pretty solid. In particular, the API
> (create/destroy/context id in execbuf) looks like just what we want
> for Mesa. I'll try to get around to testing it out soon (I'm poking
> at some performance stuff currently where this might become relevant
> soon).

I've just started noticing GPU hangs with Ken's test mesa branch on
nexuiz with vsync, full 13x7 and max effects. It seems to work fine
with variations like windowed, lower detail, etc. Although it looks
weird on IVB, I cannot reproduce the hangs there. Also, I'd never seen
the hangs before this morning, and I'm not sure what has changed. So
FYI, you may want to start out with IVB (unless you want to help me
figure out what is broken on SNB :-)

I've not tried very hard, but so far it only seems to occur when doing
context switches, however MI_SET_CONTEXT is nowhere in the error state.

> The couple of patches without a comment from me are:
>
> Reviewed-by: Eric Anholt e...@anholt.net
[Intel-gfx] [PATCH 00/14] pwrite/pread rework/retuning
Hi all,

So here we go with the scary patches ;-) I've been beating on this up
and down, and we also now have a fairly nice set of i-g-t tests to
check corner-cases of the clflushing and similar.

The mm prefault helper patch is included here for context; I've
submitted that for inclusion through -mm.

Review, comments, flames highly welcome.

Cheers, Daniel

Daniel Vetter (14):
  drm/i915: merge shmem_pwrite slow&fast-path
  drm/i915: merge shmem_pread slow&fast-path
  drm: add helper to clflush a virtual address range
  drm/i915: move clflushing into shmem_pread
  drm/i915: kill ranged cpu read domain support
  drm/i915: don't use gtt_pwrite on LLC cached objects
  drm/i915: don't call shmem_read_mapping unnecessarily
  mm: extend prefault helpers to fault in more than PAGE_SIZE
  drm/i915: drop gtt slowpath
  drm/i915: don't clobber userspace memory before commiting to the pread
  drm/i915: implement inline clflush for pwrite
  drm/i915: fall back to shmem pwrite when the buffer is not accessible
  drm/i915: use uncached writes in pwrite
  drm/i915: extract copy helpers from shmem_pread|pwrite

 drivers/gpu/drm/drm_cache.c     |  23 ++
 drivers/gpu/drm/i915/i915_drv.h |   7 -
 drivers/gpu/drm/i915/i915_gem.c | 731 ++-
 include/drm/drmP.h              |   1 +
 include/linux/pagemap.h         |  28 +-
 5 files changed, 304 insertions(+), 486 deletions(-)

--
1.7.7.5
[Intel-gfx] [PATCH 01/14] drm/i915: merge shmem_pwrite slow&fast-path
With the previous rewrite, they've become essentially identical.

v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris
Wilson.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c | 126 ++
 1 files changed, 33 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 19a06c2..535630c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -711,84 +711,11 @@ out_unpin_pages:
 	return ret;
 }
 
-/**
- * This is the fast shmem pwrite path, which attempts to directly
- * copy_from_user into the kmapped pages backing the object.
- */
-static int
-i915_gem_shmem_pwrite_fast(struct drm_device *dev,
-			   struct drm_i915_gem_object *obj,
-			   struct drm_i915_gem_pwrite *args,
-			   struct drm_file *file)
-{
-	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-	ssize_t remain;
-	loff_t offset;
-	char __user *user_data;
-	int page_offset, page_length;
-
-	user_data = (char __user *) (uintptr_t) args->data_ptr;
-	remain = args->size;
-
-	offset = args->offset;
-	obj->dirty = 1;
-
-	while (remain > 0) {
-		struct page *page;
-		char *vaddr;
-		int ret;
-
-		/* Operation in this page
-		 *
-		 * page_offset = offset within page
-		 * page_length = bytes to copy for this page
-		 */
-		page_offset = offset_in_page(offset);
-		page_length = remain;
-		if ((page_offset + remain) > PAGE_SIZE)
-			page_length = PAGE_SIZE - page_offset;
-
-		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-		if (IS_ERR(page))
-			return PTR_ERR(page);
-
-		vaddr = kmap_atomic(page);
-		ret = __copy_from_user_inatomic(vaddr + page_offset,
-						user_data,
-						page_length);
-		kunmap_atomic(vaddr);
-
-		set_page_dirty(page);
-		mark_page_accessed(page);
-		page_cache_release(page);
-
-		/* If we get a fault while copying data, then (presumably) our
-		 * source page isn't available. Return the error and we'll
-		 * retry in the slow path.
-		 */
-		if (ret)
-			return -EFAULT;
-
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
-	}
-
-	return 0;
-}
-
-/**
- * This is the fallback shmem pwrite path, which uses get_user_pages to pin
- * the memory and maps it using kmap_atomic for copying.
- *
- * This avoids taking mmap_sem for faulting on the user's address while the
- * struct_mutex is held.
- */
 static int
-i915_gem_shmem_pwrite_slow(struct drm_device *dev,
-			   struct drm_i915_gem_object *obj,
-			   struct drm_i915_gem_pwrite *args,
-			   struct drm_file *file)
+i915_gem_shmem_pwrite(struct drm_device *dev,
+		      struct drm_i915_gem_object *obj,
+		      struct drm_i915_gem_pwrite *args,
+		      struct drm_file *file)
 {
 	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	ssize_t remain;
@@ -796,6 +723,7 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
 	char __user *user_data;
 	int shmem_page_offset, page_length, ret;
 	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
+	int hit_slowpath = 0;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -805,8 +733,6 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
 	offset = args->offset;
 	obj->dirty = 1;
 
-	mutex_unlock(&dev->struct_mutex);
-
 	while (remain > 0) {
 		struct page *page;
 		char *vaddr;
@@ -831,6 +757,21 @@ i915_gem_shmem_pwrite_slow(struct drm_device *dev,
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
+		if (!page_do_bit17_swizzling) {
+			vaddr = kmap_atomic(page);
+			ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
+							user_data,
+							page_length);
+			kunmap_atomic(vaddr);
+
+			if (ret == 0)
+				goto next_page;
+		}
+
+		hit_slowpath = 1;
+
+		mutex_unlock(&dev->struct_mutex);
+
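The control flow of the merged function can be sketched in plain userspace C. This is a model, not driver code: `fake_fast_copy()` stands in for `__copy_from_user_inatomic()` (which fails rather than faulting), and the `struct_mutex` handling is reduced to comments. The point is the shape of the loop: try the atomic copy first, and record in `hit_slowpath` whether any page needed the sleeping fallback:

```c
#include <assert.h>
#include <string.h>

/* Stand-in for __copy_from_user_inatomic(): returns nonzero instead of
 * copying when told the source would fault, like the real helper. */
static int fake_fast_copy(char *dst, const char *src, size_t n, int faults)
{
	if (faults)
		return -1;
	memcpy(dst, src, n);
	return 0;
}

/* Shape of the merged shmem_pwrite loop for a single page: fast path
 * first, slow path only on failure. Returns hit_slowpath, which the
 * merged function uses afterwards to know that struct_mutex was
 * dropped and object state must be re-checked. */
int merged_pwrite_page(char *dst, const char *src, size_t n, int fast_faults)
{
	int hit_slowpath = 0;

	if (fake_fast_copy(dst, src, n, fast_faults) == 0)
		return hit_slowpath;

	hit_slowpath = 1;
	/* slow path: the driver drops struct_mutex here and uses the
	 * sleeping kmap() + __copy_from_user() variants instead */
	memcpy(dst, src, n);
	return hit_slowpath;
}
```

Either way the data lands in the destination; the merge removes the duplicated per-page bookkeeping that the separate fast/slow functions had to keep in sync.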
[Intel-gfx] [PATCH 02/14] drm/i915: merge shmem_pread slow&fast-path
With the previous rewrite, they've become essentially identical.

v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris
Wilson.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c | 108 ++-
 1 files changed, 27 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 535630c..faff00b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -259,66 +259,6 @@ static int i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
 		obj->tiling_mode != I915_TILING_NONE;
 }
 
-/**
- * This is the fast shmem pread path, which attempts to copy_from_user directly
- * from the backing pages of the object to the user's address space. On a
- * fault, it fails so we can fall back to i915_gem_shmem_pwrite_slow().
- */
-static int
-i915_gem_shmem_pread_fast(struct drm_device *dev,
-			  struct drm_i915_gem_object *obj,
-			  struct drm_i915_gem_pread *args,
-			  struct drm_file *file)
-{
-	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
-	ssize_t remain;
-	loff_t offset;
-	char __user *user_data;
-	int page_offset, page_length;
-
-	user_data = (char __user *) (uintptr_t) args->data_ptr;
-	remain = args->size;
-
-	offset = args->offset;
-
-	while (remain > 0) {
-		struct page *page;
-		char *vaddr;
-		int ret;
-
-		/* Operation in this page
-		 *
-		 * page_offset = offset within page
-		 * page_length = bytes to copy for this page
-		 */
-		page_offset = offset_in_page(offset);
-		page_length = remain;
-		if ((page_offset + remain) > PAGE_SIZE)
-			page_length = PAGE_SIZE - page_offset;
-
-		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-		if (IS_ERR(page))
-			return PTR_ERR(page);
-
-		vaddr = kmap_atomic(page);
-		ret = __copy_to_user_inatomic(user_data,
-					      vaddr + page_offset,
-					      page_length);
-		kunmap_atomic(vaddr);
-
-		mark_page_accessed(page);
-		page_cache_release(page);
-		if (ret)
-			return -EFAULT;
-
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
-	}
-
-	return 0;
-}
-
 static inline int
 __copy_to_user_swizzled(char __user *cpu_vaddr,
 			const char *gpu_vaddr, int gpu_offset,
@@ -371,17 +311,11 @@ __copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
 	return 0;
 }
 
-/**
- * This is the fallback shmem pread path, which allocates temporary storage
- * in kernel space to copy_to_user into outside of the struct_mutex, so we
- * can copy out of the object's backing pages while holding the struct mutex
- * and not take page faults.
- */
 static int
-i915_gem_shmem_pread_slow(struct drm_device *dev,
-			  struct drm_i915_gem_object *obj,
-			  struct drm_i915_gem_pread *args,
-			  struct drm_file *file)
+i915_gem_shmem_pread(struct drm_device *dev,
+		     struct drm_i915_gem_object *obj,
+		     struct drm_i915_gem_pread *args,
+		     struct drm_file *file)
 {
 	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	char __user *user_data;
@@ -389,6 +323,7 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 	loff_t offset;
 	int shmem_page_offset, page_length, ret;
 	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
+	int hit_slowpath = 0;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -397,8 +332,6 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 
 	offset = args->offset;
 
-	mutex_unlock(&dev->struct_mutex);
-
 	while (remain > 0) {
 		struct page *page;
 		char *vaddr;
@@ -422,6 +355,20 @@ i915_gem_shmem_pread_slow(struct drm_device *dev,
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
+		if (!page_do_bit17_swizzling) {
+			vaddr = kmap_atomic(page);
+			ret = __copy_to_user_inatomic(user_data,
+						      vaddr + shmem_page_offset,
+						      page_length);
+			kunmap_atomic(vaddr);
+			if (ret == 0)
+
[Intel-gfx] [PATCH 03/14] drm: add helper to clflush a virtual address range
Useful when the page is already mapped to copy data in/out.

For -stable because the next patch (fixing phys obj pwrite) needs this
little helper function.

Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/drm_cache.c | 23 +++
 include/drm/drmP.h          |  1 +
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 5928653..c7c8f6b 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -98,3 +98,26 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
 #endif
 }
 EXPORT_SYMBOL(drm_clflush_pages);
+
+void
+drm_clflush_virt_range(char *addr, unsigned long length)
+{
+#if defined(CONFIG_X86)
+	if (cpu_has_clflush) {
+		char *end = addr + length;
+		mb();
+		for (; addr < end; addr += boot_cpu_data.x86_clflush_size)
+			clflush(addr);
+		clflush(end - 1);
+		mb();
+		return;
+	}
+
+	if (on_each_cpu(drm_clflush_ipi_handler, NULL, 1) != 0)
+		printk(KERN_ERR "Timed out waiting for cache flush.\n");
+#else
+	printk(KERN_ERR "Architecture has no drm_cache.c support\n");
+	WARN_ON_ONCE(1);
+#endif
+}
+EXPORT_SYMBOL(drm_clflush_virt_range);
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 92f0981..d33597b 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1332,6 +1332,7 @@ extern int drm_remove_magic(struct drm_master *master, drm_magic_t magic);
 
 /* Cache management (drm_cache.c) */
 void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
+void drm_clflush_virt_range(char *addr, unsigned long length);
 
 /* Locking IOCTL support (drm_lock.h) */
 extern int drm_lock(struct drm_device *dev, void *data,
--
1.7.7.5
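The coverage of the flush loop can be checked with a small model. The cache-line size is assumed to be 64 bytes here purely for illustration; the real helper reads `boot_cpu_data.x86_clflush_size` at runtime:

```c
#include <assert.h>

#define MODEL_CLFLUSH_SIZE 64UL

/* Model of drm_clflush_virt_range()'s stride: one clflush per cache
 * line starting at addr, then an extra clflush on the last byte so a
 * range whose tail crosses into another line is still fully flushed.
 * Returns the number of clflush operations issued; the final flush may
 * hit an already-flushed line, which is harmless (just redundant). */
unsigned long count_clflushes(unsigned long addr, unsigned long length)
{
	unsigned long end = addr + length;
	unsigned long n = 0;

	for (; addr < end; addr += MODEL_CLFLUSH_SIZE)
		n++;
	n++;			/* clflush(end - 1) */
	return n;
}
```

The trailing `clflush(end - 1)` is what makes an unaligned start address safe: stepping by the line size from an unaligned `addr` can leave the final line untouched, and flushing the last byte covers exactly that case.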
[Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support
No longer needed.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_drv.h |   7 -
 drivers/gpu/drm/i915/i915_gem.c | 117 ---
 2 files changed, 0 insertions(+), 124 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b839728..46f9382 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -911,13 +911,6 @@ struct drm_i915_gem_object {
 
 	/** Record of address bit 17 of each page at last unbind. */
 	unsigned long *bit_17;
-
-	/**
-	 * If present, while GEM_DOMAIN_CPU is in the read domain this array
-	 * flags which individual pages are valid.
-	 */
-	uint8_t *page_cpu_valid;
-
 	/** User space pin count and filp owning the pin */
 	uint32_t user_pin_count;
 	struct drm_file *pin_filp;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c9a8098..d23d324 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,10 +41,6 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
 static __must_check int i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj,
 							  bool write);
-static __must_check int i915_gem_object_set_cpu_read_domain_range(struct drm_i915_gem_object *obj,
-								  uint64_t offset,
-								  uint64_t size);
-static void i915_gem_object_set_to_full_cpu_read_domain(struct drm_i915_gem_object *obj);
 static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 						    unsigned alignment,
 						    bool map_and_fenceable);
@@ -3000,11 +2996,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 
 	i915_gem_object_flush_gtt_write_domain(obj);
 
-	/* If we have a partially-valid cache of the object in the CPU,
-	 * finish invalidating it and free the per-page flags.
-	 */
-	i915_gem_object_set_to_full_cpu_read_domain(obj);
-
 	old_write_domain = obj->base.write_domain;
 	old_read_domains = obj->base.read_domains;
 
@@ -3035,113 +3026,6 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	return 0;
 }
 
-/**
- * Moves the object from a partially CPU read to a full one.
- *
- * Note that this only resolves i915_gem_object_set_cpu_read_domain_range(),
- * and doesn't handle transitioning from !(read_domains & I915_GEM_DOMAIN_CPU).
- */
-static void
-i915_gem_object_set_to_full_cpu_read_domain(struct drm_i915_gem_object *obj)
-{
-	if (!obj->page_cpu_valid)
-		return;
-
-	/* If we're partially in the CPU read domain, finish moving it in.
-	 */
-	if (obj->base.read_domains & I915_GEM_DOMAIN_CPU) {
-		int i;
-
-		for (i = 0; i <= (obj->base.size - 1) / PAGE_SIZE; i++) {
-			if (obj->page_cpu_valid[i])
-				continue;
-			drm_clflush_pages(obj->pages + i, 1);
-		}
-	}
-
-	/* Free the page_cpu_valid mappings which are now stale, whether
-	 * or not we've got I915_GEM_DOMAIN_CPU.
-	 */
-	kfree(obj->page_cpu_valid);
-	obj->page_cpu_valid = NULL;
-}
-
-/**
- * Set the CPU read domain on a range of the object.
- *
- * The object ends up with I915_GEM_DOMAIN_CPU in its read flags although it's
- * not entirely valid. The page_cpu_valid member of the object flags which
- * pages have been flushed, and will be respected by
- * i915_gem_object_set_to_cpu_domain() if it's called on to get a valid mapping
- * of the whole object.
- *
- * This function returns when the move is complete, including waiting on
- * flushes to occur.
- */
-static int
-i915_gem_object_set_cpu_read_domain_range(struct drm_i915_gem_object *obj,
					  uint64_t offset, uint64_t size)
-{
-	uint32_t old_read_domains;
-	int i, ret;
-
-	if (offset == 0 && size == obj->base.size)
-		return i915_gem_object_set_to_cpu_domain(obj, 0);
-
-	ret = i915_gem_object_flush_gpu_write_domain(obj);
-	if (ret)
-		return ret;
-
-	ret = i915_gem_object_wait_rendering(obj);
-	if (ret)
-		return ret;
-
-	i915_gem_object_flush_gtt_write_domain(obj);
-
-	/* If we're already fully in the CPU read domain, we're done. */
-	if (obj->page_cpu_valid == NULL &&
-	    (obj->base.read_domains & I915_GEM_DOMAIN_CPU) != 0)
-		return 0;
-
-	/* Otherwise, create/clear the per-page CPU read
[Intel-gfx] [PATCH 06/14] drm/i915: don't use gtt_pwrite on LLC cached objects
~120 µs instead of ~210 µs to write 1mb on my snb. I like this.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d23d324..0446c4c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -828,6 +828,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->gtt_space &&
+	    obj->cache_level == I915_CACHE_NONE &&
 	    obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
 		ret = i915_gem_object_pin(obj, 0, true);
 		if (ret)
--
1.7.7.5
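With the added clause, the path selection reduces to a three-way predicate. A sketch, with the flag names as local stand-ins for the object state the ioctl actually checks:

```c
#include <assert.h>

/* Model of the patched i915_gem_pwrite_ioctl() path selection: the GTT
 * write path is only taken for objects that are bound in the GTT, are
 * uncached (I915_CACHE_NONE), and are not already in the CPU write
 * domain. LLC-cached objects now fall through to the shmem path, where
 * cached CPU writes beat uncached write-combined GTT writes. */
int use_gtt_pwrite(int bound_in_gtt, int cache_is_none, int cpu_write_domain)
{
	return bound_in_gtt && cache_is_none && !cpu_write_domain;
}
```

Before the patch the middle condition was missing, so an LLC-cached, GTT-bound object would take the slower GTT path.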
[Intel-gfx] [PATCH 07/14] drm/i915: don't call shmem_read_mapping unnecessarily
This speeds up pwrite and pread from ~120 µs to ~100 µs for
reading/writing 1mb on my snb (if the backing storage pages are already
pinned, of course).

v2: Chris Wilson pointed out a glaring page reference bug - I've
unconditionally dropped the reference. With that fixed (and the
associated reduction of dirt in dmesg) it's now even a notch faster.

v3: Unconditionally grab a page reference when dropping
dev->struct_mutex to simplify the code-flow.

Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c | 42 +++---
 1 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0446c4c..0487889 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -321,6 +321,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
 	int hit_slowpath = 0;
 	int needs_clflush = 0;
+	int release_page;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -355,10 +356,16 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
 
-		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-		if (IS_ERR(page)) {
-			ret = PTR_ERR(page);
-			goto out;
+		if (obj->pages) {
+			page = obj->pages[offset >> PAGE_SHIFT];
+			release_page = 0;
+		} else {
+			page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
+			if (IS_ERR(page)) {
+				ret = PTR_ERR(page);
+				goto out;
+			}
+			release_page = 1;
 		}
 
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
@@ -378,7 +385,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		}
 
 		hit_slowpath = 1;
-
+		page_cache_get(page);
 		mutex_unlock(&dev->struct_mutex);
 
 		vaddr = kmap(page);
@@ -397,9 +404,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		kunmap(page);
 
 		mutex_lock(&dev->struct_mutex);
+		page_cache_release(page);
 next_page:
 		mark_page_accessed(page);
-		page_cache_release(page);
+		if (release_page)
+			page_cache_release(page);
 
 		if (ret) {
 			ret = -EFAULT;
@@ -680,6 +689,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	int shmem_page_offset, page_length, ret;
 	int obj_do_bit17_swizzling, page_do_bit17_swizzling;
 	int hit_slowpath = 0;
+	int release_page;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -704,10 +714,16 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
 
-		page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-		if (IS_ERR(page)) {
-			ret = PTR_ERR(page);
-			goto out;
+		if (obj->pages) {
+			page = obj->pages[offset >> PAGE_SHIFT];
+			release_page = 0;
+		} else {
+			page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
+			if (IS_ERR(page)) {
+				ret = PTR_ERR(page);
+				goto out;
+			}
+			release_page = 1;
 		}
 
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
@@ -725,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		}
 
 		hit_slowpath = 1;
-
+		page_cache_get(page);
 		mutex_unlock(&dev->struct_mutex);
 
 		vaddr = kmap(page);
@@ -740,10 +756,12 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		kunmap(page);
 
 		mutex_lock(&dev->struct_mutex);
+		page_cache_release(page);
 next_page:
 		set_page_dirty(page);
 		mark_page_accessed(page);
-		page_cache_release(page);
+		if (release_page)
+			page_cache_release(page);
 
 		if (ret) {
 			ret = -EFAULT;
--
1.7.7.5
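The v3 reference discipline can be sanity-checked with a little model (names are illustrative, not the driver's): whichever way a page was obtained, and whether or not the slow path ran, the net reference-count change seen by the page must come out to zero:

```c
#include <assert.h>

/* Model of the v3 page-reference flow in shmem_pread/pwrite: pinned
 * pages are borrowed from obj->pages[] without taking a reference,
 * unpinned pages come from shmem_read_mapping_page() holding one; the
 * slow path always takes its own reference before dropping
 * struct_mutex and releases it after relocking. Returns the net
 * refcount delta, which must be zero in every combination. */
int page_ref_delta(int pages_pinned, int hit_slowpath)
{
	int delta = 0;
	int release_page;

	if (pages_pinned) {
		release_page = 0;	/* borrowed from obj->pages */
	} else {
		delta++;		/* shmem_read_mapping_page() */
		release_page = 1;
	}

	if (hit_slowpath) {
		delta++;		/* page_cache_get() before unlock */
		delta--;		/* page_cache_release() after relock */
	}

	if (release_page)
		delta--;		/* final page_cache_release() */

	return delta;
}
```

The v2 bug mentioned above corresponds to one of these branches dropping a reference it never took; making the slow-path get/put unconditional is what keeps all four combinations balanced.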
[Intel-gfx] [PATCH 13/14] drm/i915: use uncached writes in pwrite
It's around 20% faster.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 48bef0b..9f49421 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -659,9 +659,9 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		if (partial_cacheline_write)
 			drm_clflush_virt_range(vaddr + shmem_page_offset,
 					       page_length);
-		ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
-						user_data,
-						page_length);
+		ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
+							user_data,
+							page_length);
 		if (needs_clflush)
 			drm_clflush_virt_range(vaddr + shmem_page_offset,
 					       page_length);
--
1.7.7.5
[Intel-gfx] [PATCH 14/14] drm/i915: extract copy helpers from shmem_pread|pwrite
While moving things around, these two functions slowly grew out of any
sane bounds. So extract a few lines that do the copying and clflushing.
Also add a few comments to explain what's going on.

Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem.c |  192 +++
 1 files changed, 132 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9f49421..0328cb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -307,6 +307,60 @@ __copy_from_user_swizzled(char __user *gpu_vaddr, int gpu_offset,
 	return 0;
 }

+/* Per-page copy function for the shmem pread fastpath.
+ * Flushes invalid cachelines before reading the target if
+ * needs_clflush is set. */
+static int
+shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
+		 char __user *user_data,
+		 bool page_do_bit17_swizzling, bool needs_clflush)
+{
+	char *vaddr;
+	int ret;
+
+	if (page_do_bit17_swizzling)
+		return -EINVAL;
+
+	vaddr = kmap_atomic(page);
+	if (needs_clflush)
+		drm_clflush_virt_range(vaddr + shmem_page_offset,
+				       page_length);
+	ret = __copy_to_user_inatomic(user_data,
+				      vaddr + shmem_page_offset,
+				      page_length);
+	kunmap_atomic(vaddr);
+
+	return ret;
+}
+
+/* Only difference to the fast-path function is that this can handle bit17
+ * and uses non-atomic copy and kmap functions. */
+static int
+shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
+		 char __user *user_data,
+		 bool page_do_bit17_swizzling, bool needs_clflush)
+{
+	char *vaddr;
+	int ret;
+
+	vaddr = kmap(page);
+	if (needs_clflush)
+		drm_clflush_virt_range(vaddr + shmem_page_offset,
+				       page_length);
+
+	if (page_do_bit17_swizzling)
+		ret = __copy_to_user_swizzled(user_data,
+					      vaddr, shmem_page_offset,
+					      page_length);
+	else
+		ret = __copy_to_user(user_data,
+				     vaddr + shmem_page_offset,
+				     page_length);
+	kunmap(page);
+
+	return ret;
+}
+
 static int
 i915_gem_shmem_pread(struct drm_device *dev,
 		     struct drm_i915_gem_object *obj,
@@ -345,7 +399,6 @@ i915_gem_shmem_pread(struct drm_device *dev,
 	while (remain > 0) {
 		struct page *page;
-		char *vaddr;

 		/* Operation in this page
 		 *
@@ -372,18 +425,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;

-		if (!page_do_bit17_swizzling) {
-			vaddr = kmap_atomic(page);
-			if (needs_clflush)
-				drm_clflush_virt_range(vaddr + shmem_page_offset,
-						       page_length);
-			ret = __copy_to_user_inatomic(user_data,
-						      vaddr + shmem_page_offset,
-						      page_length);
-			kunmap_atomic(vaddr);
-			if (ret == 0)
-				goto next_page;
-		}
+		ret = shmem_pread_fast(page, shmem_page_offset, page_length,
+				       user_data, page_do_bit17_swizzling,
+				       needs_clflush);
+		if (ret == 0)
+			goto next_page;

 		hit_slowpath = 1;
 		page_cache_get(page);
@@ -399,20 +445,9 @@ i915_gem_shmem_pread(struct drm_device *dev,
 			prefaulted = 1;
 		}

-		vaddr = kmap(page);
-		if (needs_clflush)
-			drm_clflush_virt_range(vaddr + shmem_page_offset,
-					       page_length);
-
-		if (page_do_bit17_swizzling)
-			ret = __copy_to_user_swizzled(user_data,
-						      vaddr, shmem_page_offset,
-						      page_length);
-		else
-			ret = __copy_to_user(user_data,
-					     vaddr + shmem_page_offset,
-					     page_length);
-		kunmap(page);
+		ret = shmem_pread_slow(page, shmem_page_offset, page_length,
Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support
On Thu, 16 Feb 2012 13:04:14 +0100 Ben Widawsky b...@bwidawsk.net wrote:
> On Wed, 15 Feb 2012 12:33:38 -0800 Eric Anholt e...@anholt.net wrote:
> > On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net wrote:
> > > These patches are a heavily revised version of the patches I wrote over
> > > a year ago. These patches have passed basic tests on SNB and IVB, and
> > > older versions worked on ILK. In theory, context support should work all
> > > the way back to Gen4, but I haven't tested it. Also, since I suspect ILK
> > > may be unstable, the code has it disabled for now.
> > >
> > > HW contexts provide a way for the GPU to save and restore certain state
> > > in between batchbuffer boundaries. Typically, GPU clients must re-emit
> > > the entire state every time they run because the client does not know
> > > what has been destroyed since the last time. With these patches the
> > > driver will emit special instructions to do this on behalf of the
> > > client if it has registered a context, and included that with the
> > > batchbuffer.
> >
> > These patches look pretty solid. In particular, the API
> > (create/destroy/context id in execbuf) looks like just what we want for
> > Mesa. I'll try to get around to testing it out soon (I'm poking at some
> > performance stuff currently where this might become relevant soon).
>
> I've just started noticing GPU hangs with Ken's test mesa branch on
> nexuiz with vsync, full 13x7 and max effects. It seems to work fine with
> variations like windowed, lower detail, etc. Although it looks weird on
> IVB, I cannot reproduce the hangs there. Also, I'd never seen the hangs
> before this morning, and I'm not sure what has changed. So FYI, you may
> want to start out with IVB (unless you want to help me figure out what
> is broken on SNB :-)
>
> I've not tried very hard, but so far it only seems to occur when doing
> context switches, however MI_SET_CONTEXT is nowhere in the error state.
> Seems like sandybridge + fullscreen nexuiz is the exact fail combo.
The couple of patches without a comment from me are: Reviewed-by: Eric Anholt e...@anholt.net ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] i915 SNB: hotplug events when charging causing poor interactivity
On Thu, Feb 16, 2012 at 11:44:46AM +, Daniel J Blueman wrote:
> When charging Dell E5420 laptops, I see the Sandy Bridge South Display
> Engine port C hotplug interrupt fire consistently at around 20Hz. Each
> scan on the connectors results in ~100ms hold time for the mode_config
> mutex, which blocks e.g. the cursor set ioctl (due to the I2C read
> timeouts across the connectors), resulting in terrible GUI
> interactivity.

We know about this locking issue - it's much worse on platforms where we
need to do load-detect hotplug detection or if for whatever reason it
takes ages to grab the edid from your screen. The Great Plan (tm) is to
add a per-crtc mutex so that cursor updates and pageflips can continue
while someone else is holding the config_mutex to do background stuff
like hotplug handling. Unfortunately there's tons of other important
stuff on my todo :(

> Notably, when only using battery or only using AC, the port C interrupts
> don't fire. It feels like the platform is using South Display Port C for
> GPIO/I2C; setting the port C pulse duration from 2ms to 100ms doesn't
> change the behaviour. I'll dump off the GPIO settings, but what else is
> good to debug this?

Yep, dumping the register state (intel_reg_dumper from intel-gpu-tools is
handy for that) in the different situations sounds useful.

> Also, have you come across this kind of pattern before, e.g. a platform
> using these GPIO ports for something (in this case, it feels like the
> EC/battery)?

Judging by the lack of quirks, I'd say not.

> Perhaps also I should dump off the SDE port C IIR mask register before
> we reset it, in case the BIOS intentionally masks out port C hotplug
> events.

If this is indeed the bios we need to quirk this away. But I think we
should check first whether we don't butcher something else accidentally.
-Daniel
--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 00/11] i915 HW context support
On Thu, 16 Feb 2012 13:21:42 +0100 Ben Widawsky b...@bwidawsk.net wrote:
> On Thu, 16 Feb 2012 13:04:14 +0100 Ben Widawsky b...@bwidawsk.net wrote:
> > On Wed, 15 Feb 2012 12:33:38 -0800 Eric Anholt e...@anholt.net wrote:
> > > On Tue, 14 Feb 2012 22:09:07 +0100, Ben Widawsky b...@bwidawsk.net wrote:
> > > > These patches are a heavily revised version of the patches I wrote
> > > > over a year ago. These patches have passed basic tests on SNB and
> > > > IVB, and older versions worked on ILK. In theory, context support
> > > > should work all the way back to Gen4, but I haven't tested it. Also,
> > > > since I suspect ILK may be unstable, the code has it disabled for
> > > > now.
> > > >
> > > > HW contexts provide a way for the GPU to save and restore certain
> > > > state in between batchbuffer boundaries. Typically, GPU clients must
> > > > re-emit the entire state every time they run because the client does
> > > > not know what has been destroyed since the last time. With these
> > > > patches the driver will emit special instructions to do this on
> > > > behalf of the client if it has registered a context, and included
> > > > that with the batchbuffer.
> > >
> > > These patches look pretty solid. In particular, the API
> > > (create/destroy/context id in execbuf) looks like just what we want
> > > for Mesa. I'll try to get around to testing it out soon (I'm poking at
> > > some performance stuff currently where this might become relevant
> > > soon).
> >
> > I've just started noticing GPU hangs with Ken's test mesa branch on
> > nexuiz with vsync, full 13x7 and max effects. It seems to work fine with
> > variations like windowed, lower detail, etc. Although it looks weird on
> > IVB, I cannot reproduce the hangs there. Also, I'd never seen the hangs
> > before this morning, and I'm not sure what has changed. So FYI, you may
> > want to start out with IVB (unless you want to help me figure out what
> > is broken on SNB :-)
>
> I've not tried very hard, but so far it only seems to occur when doing
> context switches, however MI_SET_CONTEXT is nowhere in the error state.
> Seems like sandybridge + fullscreen nexuiz is the exact fail combo.
So here was part of Ken's patch:

-   brw->state.dirty.brw |= BRW_NEW_CONTEXT | BRW_NEW_BATCH;
+   if (intel->hw_ctx == NULL)
+      brw->state.dirty.brw |= BRW_NEW_CONTEXT;
+
+   brw->state.dirty.brw |= BRW_NEW_BATCH;

If I go back to normal and use contexts, but also set BRW_NEW_CONTEXT, I
don't get hangs. So it sounds like some part of the state isn't being
saved or restored properly. Still trying to figure out what changed to
make it break.

As a side note, IVB has weird artifacts which I thought were just because
I was using an experimental mesa, but now I think it's likely the same
cause. Maybe you can figure out what isn't being saved or restored based
on seeing IVB?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
On Thu, Feb 16, 2012 at 09:32:08PM +0800, Hillf Danton wrote:
> On Thu, Feb 16, 2012 at 8:01 PM, Daniel Vetter daniel.vet...@ffwll.ch wrote:
> > @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> >  	 * Writing zeroes into userspace here is OK, because we know that if
> >  	 * the zero gets there, we'll be overwriting it.
> >  	 */
> > -	ret = __put_user(0, uaddr);
> > +	while (uaddr <= end) {
> > +		ret = __put_user(0, uaddr);
> > +		if (ret != 0)
> > +			return ret;
> > +		uaddr += PAGE_SIZE;
> > +	}
>
> What if
> 	uaddr & ~PAGE_MASK == PAGE_SIZE - 3
> and
> 	end & ~PAGE_MASK == 2?

I don't quite follow - can you elaborate upon which issue you're seeing?
-Daniel
--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] updated -next
Hi all,

Updated -next and -testing trees. I haven't merged in any of the patches
Jesse queued up because he hasn't yet pushed out his latest -fixes tree.
No cookies for Jesse today!

Highlights:
- interlaced support for i915. Again, thanks a lot to all the people who
  helped out with testing, patches and doc-crawling.
- aliasing ppgtt support for snb/ivb. Because ppgtt ptes are
  gpu-cacheable, this can also speed things up a bit.
- swizzling support for snb/ivb, again a slight perf improvement on some
  things.
- more error_state work - we're slowly reaching a level of paranoia
  suitable for dealing with gpus.
- outstanding_lazy_request fix and the autoreport patches from Chris: I'm
  pretty hopeful that these two squash a lot of the semaphores=1 issues
  we've seen on snb, please retest if you've had issues.
- the usual pile of minor patches; one noteworthy one is to use the lvds
  presence pin on pch_split chips. I expect a few new quirks due to
  this ...

Go forth and test!

Cheers, Daniel
--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support
On Thu, Feb 16, 2012 at 08:48:07AM -0800, Eric Anholt wrote:
> On Thu, 16 Feb 2012 13:11:31 +0100, Daniel Vetter daniel.vet...@ffwll.ch wrote:
> > No longer needed.
>
> What this code was for: Before gtt mapping, we were doing software
> fallbacks in Mesa with pread/write on pages at a time (or worse) of the
> framebuffer. It would regularly result in hitting the same page again,
> since I was only caching the last page I'd pulled out, instead of
> keeping a whole copy of the framebuffer during the fallback.

Urgh, that's not really efficient ;-) I think for s/w fallbacks and
readbacks we can presume decent damage tracking (like sna does) on the
userspace side.

> Since we've been doing gtt mapping for years at this point, I'm happy to
> see the code die.
>
> I'm not sure about the rest of the code. In particular, for the code
> that's switching between gtt and cpu mappings to handle a read/write,
> I'm concerned about whether the behavior matches for tiled objects. I
> haven't reviewed enough to be sure.

Behaviour should match the old code if you read/write entire pages (we
should have a decent set of tests for that). If you do
non-cacheline-aligned reads/writes on tiled objects, we might hit an
issue, but it should be solvable (I've simply been too lazy to write
testcases for this). Non-cacheline-aligned reads/writes on untiled
objects also work; I've created a set of tests to exercise issues there
(and tested the tests by omitting some of the clflushes, i.e. all the
clflushes now left _are_ required).

If old mesa depends on sub-page reads/writes to tiled objects I need to
create the respective tests and double-check the code; otherwise I think
we're covered. Do you want me to adapt the tests to check correctness for
sub-page reads/writes to tiled objects?

Thanks, Daniel
--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/14] drm/i915: kill ranged cpu read domain support
On Thu, 16 Feb 2012 18:38:08 +0100, Daniel Vetter dan...@ffwll.ch wrote:
> If old mesa depends on sub-page reads/writes to tiled objects I need to
> create the respective tests and double-check the code, otherwise I think
> we're covered. Do you want me to adapt the tests to check correctness
> for sub-page reads/writes to tiled objects?

I think that you've cast enough doubt in our minds that we would like to
see the corresponding sub-page tests. ;-)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] linux driver support GVA3650 ?
> -----Original Message-----
> From: intel-gfx-bounces+gordon.jin=intel@lists.freedesktop.org
> [mailto:intel-gfx-bounces+gordon.jin=intel@lists.freedesktop.org]
> On Behalf Of Daniel Vetter
> Sent: Wednesday, February 15, 2012 8:54 PM
> To: Kevin Kuei
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] linux driver support GVA3650 ?
>
> On Wed, Feb 15, 2012 at 05:26:51PM +0800, Kevin Kuei wrote:
> > Hello all,
> >
> > We are planning to use the Intel D2700 CPU (Cedar Trail) as the
> > solution for our product, but I can NOT find the linux graphics
> > driver. I searched for it in:
> > http://intellinuxgraphics.org/documentation.html
> > and the mailing list archive.

It's not an Intel GPU and not supported by the project here (including
the website and mailing list).

> > Is there anyone who can kindly tell me where I can get the driver? Or
> > does Intel have a plan to develop the linux driver?

Cedar Trail driver was just released last week, into MeeGo 1.2 for
Netbook:
https://meego.com/downloads/releases/1.2/meego-v1.2-netbooks
It consists of a kernel driver and a (closed-source) user space driver.

> cedar trail contains a pvr chip as the gpu core and is very much not
> supported by the open source intel linux graphics team. You're on your
> own, essentially :(
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx