Re: [Intel-gfx] [PATCH] drm/i915: Set guardband clipping workaround bit in the right register.
On Sun, 7 Oct 2012 08:51:07 -0700, Kenneth Graunke kenn...@whitecape.org wrote: A previous patch, namely: commit bf97b276ca04cee9ab65ffd378fa8e6aedd71ff6 Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Apr 11 20:42:41 2012 +0200 drm/i915: implement w/a for incorrect guarband clipping accidentally set bit 5 in 3D_CHICKEN, which has nothing to do with clipping. This patch changes it to be set in 3D_CHICKEN3, where it belongs. The game Dante demonstrates random clipping issues when guardband clipping is enabled and bit 5 of 3D_CHICKEN3 isn't set. So the workaround is actually necessary. Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Oliver McFadden oliver.mcfad...@linux.intel.com Acked-by: Paul Menzel paulepan...@users.sourceforge.net Signed-off-by: Kenneth Graunke kenn...@whitecape.org Reviewed-by: Mika Kuoppala mika.kuopp...@intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 05ff790..c1ef0a8 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2771,9 +2771,22 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, POSTING_READ(FENCE_REG_830_0 + reg * 4); } +inline static bool i915_gem_object_needs_mb(struct drm_i915_gem_object *obj) +{ + return obj obj-base.read_domains I915_GEM_DOMAIN_GTT; +} + static void i915_gem_write_fence(struct drm_device *dev, int reg, struct drm_i915_gem_object *obj) { + struct drm_i915_private *dev_priv = dev-dev_private; + + /* Ensure that all CPU reads are completed before installing a fence +* and all writes before removing the fence. +*/ + if (i915_gem_object_needs_mb(dev_priv-fence_regs[reg].obj)) + mb(); + switch (INTEL_INFO(dev)-gen) { case 7: case 6: sandybridge_write_fence_reg(dev, reg, obj); break; @@ -2783,6 +2796,9 @@ static void i915_gem_write_fence(struct drm_device *dev, int reg, case 2: i830_write_fence_reg(dev, reg, obj); break; default: break; } + + if (i915_gem_object_needs_mb(obj)) + mb(); } static inline int fence_number(struct drm_i915_private *dev_priv, @@ -2812,7 +2828,7 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, } static int -i915_gem_object_flush_fence(struct drm_i915_gem_object *obj) +i915_gem_object_wait_fence(struct drm_i915_gem_object *obj) { if (obj-last_fenced_seqno) { int ret = i915_wait_seqno(obj-ring, obj-last_fenced_seqno); @@ -2822,12 +2838,6 @@ i915_gem_object_flush_fence(struct drm_i915_gem_object *obj) obj-last_fenced_seqno = 0; } - /* Ensure that all CPU reads are completed before installing a fence -* and all writes before removing the fence. -*/ - if (obj-base.read_domains I915_GEM_DOMAIN_GTT) - mb(); - obj-fenced_gpu_access = false; return 0; } @@ -2838,7 +2848,7 @@ i915_gem_object_put_fence(struct drm_i915_gem_object *obj) struct drm_i915_private *dev_priv = obj-base.dev-dev_private; int ret; - ret = i915_gem_object_flush_fence(obj); + ret = i915_gem_object_wait_fence(obj); if (ret) return ret; @@ -2912,7 +2922,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj) * will need to serialise the write to the associated fence register? */ if (obj-fence_dirty) { - ret = i915_gem_object_flush_fence(obj); + ret = i915_gem_object_wait_fence(obj); if (ret) return ret; } @@ -2933,7 +2943,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj) if (reg-obj) { struct drm_i915_gem_object *old = reg-obj; - ret = i915_gem_object_flush_fence(old); + ret = i915_gem_object_wait_fence(old); if (ret) return ret; @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_gem_object_flush_cpu_write_domain(obj); + if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) + mb(); + old_write_domain = obj-base.write_domain; old_read_domains = obj-base.read_domains; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Set guardband clipping workaround bit in the right register.
On Tue, Oct 09, 2012 at 10:43:10AM +0300, Mika Kuoppala wrote: On Sun, 7 Oct 2012 08:51:07 -0700, Kenneth Graunke kenn...@whitecape.org wrote: A previous patch, namely: commit bf97b276ca04cee9ab65ffd378fa8e6aedd71ff6 Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Wed Apr 11 20:42:41 2012 +0200 drm/i915: implement w/a for incorrect guarband clipping accidentally set bit 5 in 3D_CHICKEN, which has nothing to do with clipping. This patch changes it to be set in 3D_CHICKEN3, where it belongs. The game Dante demonstrates random clipping issues when guardband clipping is enabled and bit 5 of 3D_CHICKEN3 isn't set. So the workaround is actually necessary. Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Oliver McFadden oliver.mcfad...@linux.intel.com Acked-by: Paul Menzel paulepan...@users.sourceforge.net Signed-off-by: Kenneth Graunke kenn...@whitecape.org Reviewed-by: Mika Kuoppala mika.kuopp...@intel.com Applied to -fixes, thanks for the patchreview. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Explicitly reset the seqno upon idling
During execbuffer emission we assert that we do not wrap around the seqno used for semaphore breadcrumbs. However: :kernel BUG at drivers/gpu/drm/i915/i915_gem_execbuffer.c:1239! :invalid opcode: [#1] SMP :CPU 0 :Modules linked in: usb_storage usblp bnep lockd sunrpc bluetooth rfkill vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi snd_hda_codec_realtek coretemp mei lpc_ich snd_hda_intel mfd_core i2c_i801 kvm snd_hda_codec snd_hwdep e1000e microcode snd_pcm snd_page_alloc snd_timer snd soundcore serio_raw uinput binfmt_misc crc32c_intel ghash_clmulni_intel wmi i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] :Pid: 962, comm: X Tainted: G C O 3.5.4-1.fc17.x86_64 #1 LENOVO 5032AJ3/ :RIP: 0010:[a008842c] [a008842c] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915] :RSP: 0018:8801214a3c08 EFLAGS: 00010286 :RAX: RBX: RCX: dead00200200 :RDX: 880133561988 RSI: fe0c RDI: 88000248f8b0 :RBP: 8801214a3d28 R08: 88000248f8b0 R09: 000180400032 :R10: 3288eb01 R11: 8801214a3fd8 R12: 8801335618b0 :R13: 880133ebd800 R14: 0001 R15: 8801214a3bf8 :FS: 7f5f0647a8c0() GS:88013e20() knlGS: :CS: 0010 DS: ES: CR0: 80050033 :CR2: 03481000 CR3: 00013417 CR4: 000407f0 :DR0: DR1: DR2: :DR3: DR6: 0ff0 DR7: 0400 :Process X (pid: 962, threadinfo 8801214a2000, task 880123afae20) :Stack: : : 880123851cc0 0001 8801214a3c88 a0082eb4 : 88010002 00c03356 8801214a3c88 880123851ce0 :Call Trace: : [a0082eb4] ? i915_gem_object_set_to_gtt_domain+0x104/0x1a0 [i915] : [a0089031] i915_gem_execbuffer2+0xb1/0x290 [i915] : [a00154f3] drm_ioctl+0x4d3/0x580 [drm] : [a0088f80] ? i915_gem_execbuffer+0x480/0x480 [i915] : [81188d5e] ? do_readv_writev+0x18e/0x1e0 : [81199919] do_vfs_ioctl+0x99/0x580 : [8127973a] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30 : [8127ad17] ? file_has_perm+0x97/0xb0 : [81199e99] sys_ioctl+0x99/0xa0 : [81614e29] system_call_fastpath+0x16/0x1b :Code: 59 fd ff ff 4c 89 ef e8 23 a4 ff ff 85 c0 90 0f 85 ad fc ff ff 4c 89 ef e8 82 9d ff ff 45 8b 4c 24 6c 45 85 c9 0f 84 02 fd ff ff 0f 0b 4c 89 ef e8 fa a3 ff ff 85 c0 0f 85 85 fc ff ff 4c 89 ef :RIP [a008842c] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915] : RSP 8801214a3c08 clearly shows us hitting this supposedly impossible wraparound. The cause here is that after idling, retire-requests only resets the breadcrumbs if there was a request on the ring. To avoid this after idling, we can simply clear the breadcrumbs. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 288d7b8..95c0cd0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2163,6 +2163,15 @@ static int i915_ring_idle(struct intel_ring_buffer *ring) return i915_wait_request(ring, i915_gem_next_request_seqno(ring)); } +static void i915_ring_reset_seqno(struct intel_ring_buffer *ring) +{ + int i; + + for (i = 0; i ARRAY_SIZE(ring-sync_seqno); i++) + if (seqno = ring-sync_seqno[i]) + ring-sync_seqno[i] = 0; +} + int i915_gpu_idle(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev-dev_private; @@ -2178,6 +2187,8 @@ int i915_gpu_idle(struct drm_device *dev) /* Is the device fubar? */ if (WARN_ON(!list_empty(ring-gpu_write_list))) return -EBUSY; + + i915_ring_reset_seqno(ring); } return 0; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Explicitly reset the seqno upon idling
During execbuffer emission we assert that we do not wrap around the seqno used for semaphore breadcrumbs. However: :kernel BUG at drivers/gpu/drm/i915/i915_gem_execbuffer.c:1239! :invalid opcode: [#1] SMP :CPU 0 :Modules linked in: usb_storage usblp bnep lockd sunrpc bluetooth rfkill vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi snd_hda_codec_realtek coretemp mei lpc_ich snd_hda_intel mfd_core i2c_i801 kvm snd_hda_codec snd_hwdep e1000e microcode snd_pcm snd_page_alloc snd_timer snd soundcore serio_raw uinput binfmt_misc crc32c_intel ghash_clmulni_intel wmi i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] :Pid: 962, comm: X Tainted: G C O 3.5.4-1.fc17.x86_64 #1 LENOVO 5032AJ3/ :RIP: 0010:[a008842c] [a008842c] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915] :RSP: 0018:8801214a3c08 EFLAGS: 00010286 :RAX: RBX: RCX: dead00200200 :RDX: 880133561988 RSI: fe0c RDI: 88000248f8b0 :RBP: 8801214a3d28 R08: 88000248f8b0 R09: 000180400032 :R10: 3288eb01 R11: 8801214a3fd8 R12: 8801335618b0 :R13: 880133ebd800 R14: 0001 R15: 8801214a3bf8 :FS: 7f5f0647a8c0() GS:88013e20() knlGS: :CS: 0010 DS: ES: CR0: 80050033 :CR2: 03481000 CR3: 00013417 CR4: 000407f0 :DR0: DR1: DR2: :DR3: DR6: 0ff0 DR7: 0400 :Process X (pid: 962, threadinfo 8801214a2000, task 880123afae20) :Stack: : : 880123851cc0 0001 8801214a3c88 a0082eb4 : 88010002 00c03356 8801214a3c88 880123851ce0 :Call Trace: : [a0082eb4] ? i915_gem_object_set_to_gtt_domain+0x104/0x1a0 [i915] : [a0089031] i915_gem_execbuffer2+0xb1/0x290 [i915] : [a00154f3] drm_ioctl+0x4d3/0x580 [drm] : [a0088f80] ? i915_gem_execbuffer+0x480/0x480 [i915] : [81188d5e] ? do_readv_writev+0x18e/0x1e0 : [81199919] do_vfs_ioctl+0x99/0x580 : [8127973a] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30 : [8127ad17] ? file_has_perm+0x97/0xb0 : [81199e99] sys_ioctl+0x99/0xa0 : [81614e29] system_call_fastpath+0x16/0x1b :Code: 59 fd ff ff 4c 89 ef e8 23 a4 ff ff 85 c0 90 0f 85 ad fc ff ff 4c 89 ef e8 82 9d ff ff 45 8b 4c 24 6c 45 85 c9 0f 84 02 fd ff ff 0f 0b 4c 89 ef e8 fa a3 ff ff 85 c0 0f 85 85 fc ff ff 4c 89 ef :RIP [a008842c] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915] : RSP 8801214a3c08 clearly shows us hitting this supposedly impossible wraparound. The cause here is that after idling, retire-requests only resets the breadcrumbs if there was a request on the ring. To avoid this after idling, we can simply clear the breadcrumbs. v2: Remember to save before sending. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/i915_gem.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 288d7b8..332625f 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2163,6 +2163,14 @@ static int i915_ring_idle(struct intel_ring_buffer *ring) return i915_wait_request(ring, i915_gem_next_request_seqno(ring)); } +static void i915_ring_reset_seqno(struct intel_ring_buffer *ring) +{ + int i; + + for (i = 0; i ARRAY_SIZE(ring-sync_seqno); i++) + ring-sync_seqno[i] = 0; +} + int i915_gpu_idle(struct drm_device *dev) { drm_i915_private_t *dev_priv = dev-dev_private; @@ -2178,6 +2186,8 @@ int i915_gpu_idle(struct drm_device *dev) /* Is the device fubar? */ if (WARN_ON(!list_empty(ring-gpu_write_list))) return -EBUSY; + + i915_ring_reset_seqno(ring); } return 0; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
On Tue, Oct 09, 2012 at 09:38:58AM +0100, Chris Wilson wrote: With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Looks good and finally puts some clear explanation and consistency behind our mb()s. Two minor nitpicks, otherwise. Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) [snip] @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_gem_object_flush_cpu_write_domain(obj); + if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) + mb(); + I think a comment here like we have one for all other gtt related memory barries would be good. Another thing is the flush_gtt_write_domain uses a wmb, whereas here we don't bother with micro-optimizing things. So I think it'd be good to just use a mb() for that, too, if just for consistency. Also, you know the grumpy maintainer drill: Could we exercise these barriers with a minimal i-g-t testcase, please? Since you've managed to kill your machine by removing them, they're no longer just there to keep us happy, hence I'd like to have them exercised ... Another thing that just crossed my mind: Could we lack a set of mb()s for cpu access on llc platforms? For non-coherent platforms the mb() in the clflush paths will do that, but on llc platforms I couldn't find anything. And that lp bugs seems to make an excellent case for them being required ... Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
While looking at barriers, I think we could be a bit more paranoid with the barrier in intel_read_status_page and up it to a full mb() and move it into the !lazy_coherency conditional of the various get_seqno functions. -Daniel On Tue, Oct 09, 2012 at 11:54:12AM +0200, Daniel Vetter wrote: On Tue, Oct 09, 2012 at 09:38:58AM +0100, Chris Wilson wrote: With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Looks good and finally puts some clear explanation and consistency behind our mb()s. Two minor nitpicks, otherwise. Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) [snip] @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_gem_object_flush_cpu_write_domain(obj); + if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) + mb(); + I think a comment here like we have one for all other gtt related memory barries would be good. Another thing is the flush_gtt_write_domain uses a wmb, whereas here we don't bother with micro-optimizing things. So I think it'd be good to just use a mb() for that, too, if just for consistency. Also, you know the grumpy maintainer drill: Could we exercise these barriers with a minimal i-g-t testcase, please? Since you've managed to kill your machine by removing them, they're no longer just there to keep us happy, hence I'd like to have them exercised ... Another thing that just crossed my mind: Could we lack a set of mb()s for cpu access on llc platforms? For non-coherent platforms the mb() in the clflush paths will do that, but on llc platforms I couldn't find anything. And that lp bugs seems to make an excellent case for them being required ... Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 1/4] time: export getnstime_raw_and_real for DRM
On Fri, 2012-10-05 at 12:14 -0400, Kristian Høgsberg wrote: On Fri, Oct 5, 2012 at 9:36 AM, Imre Deak imre.d...@intel.com wrote: Needed by the upcoming DRM raw monotonic timestamp support. I just had a quick look at driver/input/evdev.c, since evdev devices did a similar change recently to allow evdev timestamp from the monotonic clock. They're using a different time API: time_mono = ktime_get(); time_real = ktime_sub(time_mono, ktime_get_monotonic_offset()); and event-time = ktime_to_timeval(client-clkid == CLOCK_MONOTONIC ? mono : real); I'm not really up-to-date on kernel time APIs, but I wonder if that may be better? At least, I suspect we wouldn't need changes outside drm if we use this API. Thanks, I will use this instead of getnstime_raw_and_real. The reason is as discussed on #intel-gfx that this provides a monotonic vs. raw monotonic time and that as you say it won't require a change in the time keeping code. --Imre Kristian Signed-off-by: Imre Deak imre.d...@intel.com --- kernel/time/timekeeping.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index d3b91e7..073d262 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -365,7 +365,7 @@ void ktime_get_ts(struct timespec *ts) } EXPORT_SYMBOL_GPL(ktime_get_ts); -#ifdef CONFIG_NTP_PPS +#if defined(CONFIG_NTP_PPS) || defined(CONFIG_DRM) || defined(CONFIG_DRM_MODULE) /** * getnstime_raw_and_real - get day and raw monotonic time in timespec format -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
On Tue, 9 Oct 2012 11:54:12 +0200, Daniel Vetter dan...@ffwll.ch wrote: On Tue, Oct 09, 2012 at 09:38:58AM +0100, Chris Wilson wrote: @@ -3244,6 +3254,9 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_gem_object_flush_cpu_write_domain(obj); + if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) + mb(); + I think a comment here like we have one for all other gtt related memory barries would be good. Another thing is the flush_gtt_write_domain uses a wmb, whereas here we don't bother with micro-optimizing things. So I think it'd be good to just use a mb() for that, too, if just for consistency. In fact, not only is that the wmb() the easiest to micro-optimise, it is the only one that can be - I think. Around manipulating the fence, we need a read/write barrier in case we have any pending accesses through the fenced region, since the register write may be reordered passed the memory reads since there is no obvious dependency. That might just be heightened paranoia and our memory controller isn't that smart. Yet. So those two need to be mb() so that I can sleep safely at night. For the mb() inside set-to-gtt-domain, I don't have a robust explanation other than that empirically we need a barrier, therefore there is some lingering incoherency when reusing a bo. (The hangs always seem to occur when crossing a page boundary, we see stale data.) You could attempt to insert a read/write barrier depending upon actual usage, but it hardly seems worth the effort. Also, you know the grumpy maintainer drill: Could we exercise these barriers with a minimal i-g-t testcase, please? Since you've managed to kill your machine by removing them, they're no longer just there to keep us happy, hence I'd like to have them exercised ... Still hunting. Another thing that just crossed my mind: Could we lack a set of mb()s for cpu access on llc platforms? For non-coherent platforms the mb() in the clflush paths will do that, but on llc platforms I couldn't find anything. And that lp bugs seems to make an excellent case for them being required ... Yes, with LLC we need to treat the GPU as another core and so put similar SMP-esque memory barriers in place. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
On Tue, Oct 9, 2012 at 1:03 PM, Chris Wilson ch...@chris-wilson.co.uk wrote: In fact, not only is that the wmb() the easiest to micro-optimise, it is the only one that can be - I think. Around manipulating the fence, we need a read/write barrier in case we have any pending accesses through the fenced region, since the register write may be reordered passed the memory reads since there is no obvious dependency. That might just be heightened paranoia and our memory controller isn't that smart. Yet. So those two need to be mb() so that I can sleep safely at night. For the mb() inside set-to-gtt-domain, I don't have a robust explanation other than that empirically we need a barrier, therefore there is some lingering incoherency when reusing a bo. (The hangs always seem to occur when crossing a page boundary, we see stale data.) You could attempt to insert a read/write barrier depending upon actual usage, but it hardly seems worth the effort. Hm, I think we can make a case that the barrier before the fence change can only be a wmb(): Racing reads against fence changes is ill-defined anyway, so we don't need the read barrier for that reason. But we need to flush out any store buffers (especially the wc store buffer) before changing the fencing. The mb() afterwards seems to be required, since we need to sync both subsequent reads and writes against the fence mmio write. One thing I wonder is whether we miss any barrier between the wc writes to the ringbuffer and the tail update. If that's the case I wonder where all the bug reports are ... Last one: Which machines blow up when you drop that mb()? Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm: Add missing break in the command line mode parsing code
From: Damien Lespiau damien.lesp...@intel.com As we parse the string given on the command line one char at a time, it seems that we do want a break at every case. Signed-off-by: Damien Lespiau damien.lesp...@intel.com --- drivers/gpu/drm/drm_modes.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c index 28637c1..50bd5c1 100644 --- a/drivers/gpu/drm/drm_modes.c +++ b/drivers/gpu/drm/drm_modes.c @@ -1054,6 +1054,7 @@ bool drm_mode_parse_command_line_for_connector(const char *mode_option, was_digit = false; } else goto done; + break; case '0' ... '9': was_digit = true; break; -- 1.7.7.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only insert the mb() before updating the fence parameter
On Tue, 9 Oct 2012 13:14:09 +0200, Daniel Vetter dan...@ffwll.ch wrote: One thing I wonder is whether we miss any barrier between the wc writes to the ringbuffer and the tail update. If that's the case I wonder where all the bug reports are ... Ditto. I've often wondered how we get away without a wmb() there... Last one: Which machines blow up when you drop that mb()? pnv, though that's the only non-LLC I've been testing with the incomplete patch so I can't say it is limited to that machine. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Run hangcheck if we timeout whilst waiting for a seqno
As a precaution against the driver fouling up and missing a hang leaving the caller in an indefinite wait, manually inspect for a GPU hang if we timeout whilst waiting for a seqno. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d11c489..74b59f9 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1154,6 +1154,9 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, end = wait_event_timeout(ring-irq_queue, EXIT_COND, timeout_jiffies); + /* Be paranoid and check that we haven't missed a GPU hang */ + if (end == 0) + i915_hangcheck_elapsed((unsigned long)dev_priv-dev); ret = i915_gem_check_wedge(dev_priv, interruptible); if (ret) end = ret; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] scaling on ivy bridge
I have a display that is, e.g., 2560x1600 on an ivybridge. I'd like to set it up so that the software sees a display with half that resolution, i.e. the software that draws into the frame buffer (this is a bios) can draw into what it sees as a 1280x800 display. I've tried a few things but I'm realizing I don't understand the ivybridge panel fitter at all. This display is on PIPEA. It appears on sandybridge one can write PFA_VSCALE, PFA_HSCALE, PFA_WIN_SZ, and PFA_WIN_POS and _PFA_CTL_1 and it all works. On Ivybridge it seems far more complex. Or am I missing something? I don't see any fitting code in the drm driver itself, so I'm not quite sure how it's done. Thanks in advance, as always. ron ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Missed ret initialization on gem_context_size
Although default case is a bug, being conservative on the initialization helps to shut up coverity scan. Signed-off-by: Rodrigo Vivi rodrigo.v...@gmail.com --- drivers/gpu/drm/i915/i915_gem_context.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 4aa7ecf..bf78882 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -102,7 +102,7 @@ static int do_switch(struct i915_hw_context *to); static int get_context_size(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; - int ret; + int ret = 0; u32 reg; switch (INTEL_INFO(dev)-gen) { -- 1.7.11.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Missed ret initialization on gem_context_size
On Tue, 9 Oct 2012 14:49:06 -0300, Rodrigo Vivi rodrigo.v...@gmail.com wrote: Although default case is a bug, being conservative on the initialization helps to shut up coverity scan. So many more issues would be resolved if you taught coverity that BUG() was noreturn. :-p -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/4] drm/i915: Only insert the mb() before updating the fence parameter
With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. v2: Add a couple more comments to explain the new barriers Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c | 40 +-- 1 file changed, 30 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 05ff790..3c4577b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2771,9 +2771,22 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, POSTING_READ(FENCE_REG_830_0 + reg * 4); } +inline static bool i915_gem_object_needs_mb(struct drm_i915_gem_object *obj) +{ + return obj obj-base.read_domains I915_GEM_DOMAIN_GTT; +} + static void i915_gem_write_fence(struct drm_device *dev, int reg, struct drm_i915_gem_object *obj) { + struct drm_i915_private *dev_priv = dev-dev_private; + + /* Ensure that all CPU reads are completed before installing a fence +* and all writes before removing the fence. +*/ + if (i915_gem_object_needs_mb(dev_priv-fence_regs[reg].obj)) + mb(); + switch (INTEL_INFO(dev)-gen) { case 7: case 6: sandybridge_write_fence_reg(dev, reg, obj); break; @@ -2783,6 +2796,12 @@ static void i915_gem_write_fence(struct drm_device *dev, int reg, case 2: i830_write_fence_reg(dev, reg, obj); break; default: break; } + + /* And similarly be paranoid that no direct access to this region +* is reordered to before the fence is installed. +*/ + if (i915_gem_object_needs_mb(obj)) + mb(); } static inline int fence_number(struct drm_i915_private *dev_priv, @@ -2812,7 +2831,7 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj, } static int -i915_gem_object_flush_fence(struct drm_i915_gem_object *obj) +i915_gem_object_wait_fence(struct drm_i915_gem_object *obj) { if (obj-last_fenced_seqno) { int ret = i915_wait_seqno(obj-ring, obj-last_fenced_seqno); @@ -2822,12 +2841,6 @@ i915_gem_object_flush_fence(struct drm_i915_gem_object *obj) obj-last_fenced_seqno = 0; } - /* Ensure that all CPU reads are completed before installing a fence -* and all writes before removing the fence. -*/ - if (obj-base.read_domains I915_GEM_DOMAIN_GTT) - mb(); - obj-fenced_gpu_access = false; return 0; } @@ -2838,7 +2851,7 @@ i915_gem_object_put_fence(struct drm_i915_gem_object *obj) struct drm_i915_private *dev_priv = obj-base.dev-dev_private; int ret; - ret = i915_gem_object_flush_fence(obj); + ret = i915_gem_object_wait_fence(obj); if (ret) return ret; @@ -2912,7 +2925,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj) * will need to serialise the write to the associated fence register? */ if (obj-fence_dirty) { - ret = i915_gem_object_flush_fence(obj); + ret = i915_gem_object_wait_fence(obj); if (ret) return ret; } @@ -2933,7 +2946,7 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj) if (reg-obj) { struct drm_i915_gem_object *old = reg-obj; - ret = i915_gem_object_flush_fence(old); + ret = i915_gem_object_wait_fence(old); if (ret) return ret; @@ -3244,6 +3257,13 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write) i915_gem_object_flush_cpu_write_domain(obj); + /* Serialise direct access to this object with the barriers for +* coherent writes from the GPU, by effectively invalidating the +* GTT domain upon first access. +*/ + if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) + mb(); + old_write_domain = obj-base.write_domain; old_read_domains = obj-base.read_domains; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/4] drm/i915: Only apply the mb() when flushing the GTT domain during a finish
Now that we seem to have brought order to the GTT barriers, the last one to review is the terminal barrier before we unbind the buffer from the GTT. This needs to only be performed if the buffer still resides in the GTT domain, and so we can skip some needless barriers otherwise. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/i915_gem.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3c4577b..ed8d21a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2526,15 +2526,15 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) { u32 old_write_domain, old_read_domains; - /* Act a barrier for all accesses through the GTT */ - mb(); - /* Force a pagefault for domain tracking on next user access */ i915_gem_release_mmap(obj); if ((obj-base.read_domains I915_GEM_DOMAIN_GTT) == 0) return; + /* Wait for any direct GTT access to complete */ + mb(); + old_read_domains = obj-base.read_domains; old_write_domain = obj-base.write_domain; -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/4] drm/i915: Insert a full mb() before reading the seqno from the status page
Hopefully this will reduce a few of the missed IRQ warnings. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/intel_ringbuffer.c |8 +++- drivers/gpu/drm/i915/intel_ringbuffer.h |2 -- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index e069e69..133beb6 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -704,14 +704,18 @@ gen6_ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency) /* Workaround to force correct ordering between irq and seqno writes on * ivb (and maybe also on snb) by reading from a CS register (like * ACTHD) before reading the status page. */ - if (!lazy_coherency) + if (!lazy_coherency) { intel_ring_get_active_head(ring); + mb(); + } return intel_read_status_page(ring, I915_GEM_HWS_INDEX); } static u32 ring_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency) { + if (!lazy_coherency) + mb(); return intel_read_status_page(ring, I915_GEM_HWS_INDEX); } @@ -719,6 +723,8 @@ static u32 pc_render_get_seqno(struct intel_ring_buffer *ring, bool lazy_coherency) { struct pipe_control *pc = ring-private; + if (!lazy_coherency) + mb(); return pc-cpu_page[0]; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 2ea7a31..40b252e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -160,8 +160,6 @@ static inline u32 intel_read_status_page(struct intel_ring_buffer *ring, int reg) { - /* Ensure that the compiler doesn't optimize away the load. */ - barrier(); return ring-status_page.page_addr[reg]; } -- 1.7.10.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] scaling on ivy bridge
On Tue, Oct 09, 2012 at 10:40:26AM -0700, ron minnich wrote: I have a display that is, e.g., 2560x1600 on an ivybridge. I'd like to set it up so that the software sees a display with half that resolution, i.e. the software that draws into the frame buffer (this is a bios) can draw into what it sees as a 1280x800 display. I've tried a few things but I'm realizing I don't understand the ivybridge panel fitter at all. This display is on PIPEA. It appears on sandybridge one can write PFA_VSCALE, PFA_HSCALE, PFA_WIN_SZ, and PFA_WIN_POS and _PFA_CTL_1 and it all works. On Ivybridge it seems far more complex. Or am I missing something? I don't see any fitting code in the drm driver itself, so I'm not quite sure how it's done. Thanks in advance, as always. The panel fitter should work the same on ivb as on snb I think. We only expose it for integrated panels though, and if the mode you want is missing you first need to add it. But it should all work ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: fixup i915_gem_object_get_page inline helper
The obj-pages to obj-pages-sgl rework introduced this helper, but it doesn't actually work for n % SG_MAX_SINGLE_ALLOC == 0. This is exercised by the improved hangman tests and the gem_exec_big test in i-g-t. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_drv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4f2831a..d3dbd0f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1341,7 +1341,7 @@ int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { struct scatterlist *sg = obj-pages-sgl; - while (n = SG_MAX_SINGLE_ALLOC) { + while (n = SG_MAX_SINGLE_ALLOC - 1) { sg = sg_chain_ptr(sg + SG_MAX_SINGLE_ALLOC - 1); n -= SG_MAX_SINGLE_ALLOC - 1; } -- 1.7.11.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] scaling on ivy bridge
2012/10/9 ron minnich rminn...@gmail.com: I have a display that is, e.g., 2560x1600 on an ivybridge. I'd like to set it up so that the software sees a display with half that resolution, i.e. the software that draws into the frame buffer (this is a bios) can draw into what it sees as a 1280x800 display. I've tried a few things but I'm realizing I don't understand the ivybridge panel fitter at all. This display is on PIPEA. It appears on sandybridge one can write PFA_VSCALE, PFA_HSCALE, PFA_WIN_SZ, and PFA_WIN_POS and _PFA_CTL_1 and it all works. On Ivybridge it seems far more complex. Or am I missing something? I don't see any fitting code in the drm driver itself, so I'm not quite sure how it's done. Take a look at this tool: http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/tree/tools/intel_panel_fitter.c It's not guaranteed to work, but it may serve as a starting point for your implementation. Thanks in advance, as always. ron ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] hsw rps values regress RPS on Macbook Air
On my new MBA with danvet's drm-intel-next-queued, I'm not getting working RPS. vblank_mode=0 glxgears never ups the frequency, and vblank_mode=0 openarena only makes it up to 500mhz. Reverting 1ee9ae3244c4789f3184c5123f3b2d7e405b3f4c gets the machine to responsive RPS: fully on while the GPU is busy, fully lowered when it's not. Since we're always just looking for all-on or all-off and never see workloads that actually want to be somewhere in between, could we please just move to race to idle for RPS? dmi info: Handle 0x001B, DMI type 1, 27 bytes System Information Manufacturer: Apple Inc. Product Name: MacBookAir5,2 Version: 1.0 Serial Number: C02JH14TDRVG UUID: E58AA5BB-95BE-115D-AE6F-E12B41830529 Wake-up Type: Power Switch SKU Number: System SKU# Family: MacBook Air pgpI6sq0BDbOf.pgp Description: PGP signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fixup i915_gem_object_get_page inline helper
On Tue, 9 Oct 2012 19:59:02 +0200, Daniel Vetter daniel.vet...@ffwll.ch wrote: The obj-pages to obj-pages-sgl rework introduced this helper, but it doesn't actually work for n % SG_MAX_SINGLE_ALLOC == 0. This is exercised by the improved hangman tests and the gem_exec_big test in i-g-t. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch I had to think about that harder than I should, to be sure we handled the case where the last element was a page and not a chainptr correctly. On hindsight, the code is clearly bogus for the n==SG_MAX_SINGLE_ALLOC because n is an index. My original thought was to use 'nth' instead and I believe that it would have helped prevent my confusion. Ah hindisght. Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] scaling on ivy bridge
On Tue, Oct 9, 2012 at 12:35 PM, Paulo Zanoni przan...@gmail.com wrote: Take a look at this tool: http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/tree/tools/intel_panel_fitter.c very interesting! i note that tool sets WIN_POS and WIN_SZ, but not the HSCALE and VSCALE ... is there something I'm not understanding (yes)? Are HSCALE and VSCALE intended to be read? I'm going to try it of course but I'm confused :-) thanks! ron ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: fixup i915_gem_object_get_page inline helper
The obj-pages to obj-pages-sgl rework introduced this helper, but it doesn't actually work for n = SG_MAX_SINGLE_ALLOC. For simplicity (and since right now I seem to be too stupid to see the bug), let's just grab the right page with a for_each_sg loop. This is exercised by the improved hangman tests and the gem_exec_big test in i-g-t. v2: Compared to v1, don't try to be clever since I seemingly only manage to prove that I'm not clever. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_drv.h | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4f2831a..32a2e47 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1340,12 +1340,14 @@ void i915_gem_lastclose(struct drm_device *dev); int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { - struct scatterlist *sg = obj-pages-sgl; - while (n = SG_MAX_SINGLE_ALLOC) { - sg = sg_chain_ptr(sg + SG_MAX_SINGLE_ALLOC - 1); - n -= SG_MAX_SINGLE_ALLOC - 1; + struct scatterlist *sg; + int i; + + for_each_sg(obj-pages-sgl, sg, obj-pages-nents, i) { + if (i == n) + return sg_page(sg); } - return sg_page(sg+n); + return NULL; } static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { -- 1.7.11.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fixup i915_gem_object_get_page inline helper
On Tue, 9 Oct 2012 22:50:48 +0200, Daniel Vetter daniel.vet...@ffwll.ch wrote: The obj-pages to obj-pages-sgl rework introduced this helper, but it doesn't actually work for n = SG_MAX_SINGLE_ALLOC. For simplicity (and since right now I seem to be too stupid to see the bug), let's just grab the right page with a for_each_sg loop. This is exercised by the improved hangman tests and the gem_exec_big test in i-g-t. v2: Compared to v1, don't try to be clever since I seemingly only manage to prove that I'm not clever. Only I expect that loop to show up on profiles even higher than the sg_next() from pwrite. :| I expect it to have a measureable impact upon relocation throughput, so I should measure it... -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx