Re: [Intel-gfx] [PATCH 0/9] [RFC] fair-lru eviction

2010-05-19 Thread Chris Wilson
On Tue, 18 May 2010 23:11:42 +0200, Daniel Vetter daniel.vet...@ffwll.ch 
wrote:
 Hi all,
 
 This patch series implements the fair-lru eviction Chris Wilson already
 posted with a twist. It's essentially the same idea  algorithm.
 Differnences versus his patch:
 - Doesn't do any allocations while scanning.
 - Implemented in drm_mm.c
 
 In other words, this should also be usable by ttm. The idea is simple:
 Scan through the lru, marking objects as evictable until there is a
 large area of memory free/free-able. Then walk through all the scanned
 objects in reverse, checking which ones fall into this hole. Finally
 evicting them.
 
 Comments, ideas highly welcome.

The next adaptation I did was to clean up evict_something to add objects
from the inactive, active!pinned!write, flushing!pinned,
active!pinnedwrite lists. This reduces the logic in evict_something to
a single scan over the available objects in LRU order.

We still need the move-to-inactive-tail upon access by the CPU, and I
think it is acceptable to maintain our preference of the GPU over the CPU.
Recovering memory from the CPU is comparatively cheap.

Comparing 'while :; do yes  /tmp/yes; done  cairo-perf-trace', there is
no significant delta between the fair LRU and current. I'll rebase my
evict_something() on top of your drm_mm, and rerun the tests.

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/4] introduce intel_ring_buffer structure

2010-05-19 Thread Simon Farnsworth
On Wednesday 19 May 2010, Zou, Nanhai nanhai@intel.com wrote:
   Currently we do not find any regression or slowness. We have been 
 testing
 full HD video test along with regression tests for more than 1 month. The
 only slowness we find now is with playing 2 1080p H.264 video. That was
 caused by too much GPU memory usage which is not related to the interface.
 
And are you utterly confident that if anyone finds a way to use your new ioctl 
to (e.g.) hang the GPU, stall rendering indefinitely waiting for media engine 
events that will never happen, or perhaps bypass system security, you can fix 
it without changing the ioctl interface to userspace? If not, you have a 
problem, and need to redesign the ioctl.

Bear in mind that I don't have to use VAAPI to call your ioctl; I can write 
evil code that calls it directly. On the other hand, you can't remove the 
ioctl later - users will want to use older VAAPI versions with new kernels, so 
that they can upgrade without too much fear of regression.
-- 
Simon Farnsworth
Software Engineer
ONELAN Limited
http://www.onelan.com/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 0/3] implement multiple ring buffer V1

2010-05-19 Thread Zou Nan hai
 The patch series try to abstruct ring buffer
 structure, implement BSD (bit stream decoder) ring
 buffer for H.264/VC1 VLD decoding.

 I mark this as V1 for review.



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 1/3] introduce intel_ring_buffer structure V1

2010-05-19 Thread Zou Nan hai
introduce intel_ring_buffer structure, convert render ring buffer to
use the structure.

Signed-off-by: Zou Nan hai nanhai@intel.com
Signed-off-by: Xiang Hai hao haihao.xi...@intel.com
---
 drivers/gpu/drm/i915/Makefile   |1 +
 drivers/gpu/drm/i915/i915_debugfs.c |8 +-
 drivers/gpu/drm/i915/i915_dma.c |  153 ++--
 drivers/gpu/drm/i915/i915_drv.c |   29 +--
 drivers/gpu/drm/i915/i915_drv.h |   64 ++--
 drivers/gpu/drm/i915/i915_gem.c |  298 +-
 drivers/gpu/drm/i915/i915_irq.c |   21 +-
 drivers/gpu/drm/i915/intel_display.c|1 -
 drivers/gpu/drm/i915/intel_overlay.c|8 -
 drivers/gpu/drm/i915/intel_ringbuffer.c |  670 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  124 ++
 include/drm/i915_drm.h  |4 +-
 12 files changed, 898 insertions(+), 483 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_ringbuffer.c
 create mode 100644 drivers/gpu/drm/i915/intel_ringbuffer.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9563901..da78f2c 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -22,6 +22,7 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o i915_mem.o \
  intel_fb.o \
  intel_tv.o \
  intel_dvo.o \
+ intel_ringbuffer.o \
  intel_overlay.o \
  dvo_ch7xxx.o \
  dvo_ch7017.o \
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 322070c..4fddf09 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -317,14 +317,14 @@ static int i915_ringbuffer_data(struct seq_file *m, void 
*data)
u8 *virt;
uint32_t *ptr, off;
 
-   if (!dev_priv-ring.ring_obj) {
+   if (!dev_priv-render_ring.gem_object) {
seq_printf(m, No ringbuffer setup\n);
return 0;
}
 
-   virt = dev_priv-ring.virtual_start;
+   virt = dev_priv-render_ring.virtual_start;
 
-   for (off = 0; off  dev_priv-ring.Size; off += 4) {
+   for (off = 0; off  dev_priv-render_ring.size; off += 4) {
ptr = (uint32_t *)(virt + off);
seq_printf(m, %08x :  %08x\n, off, *ptr);
}
@@ -344,7 +344,7 @@ static int i915_ringbuffer_info(struct seq_file *m, void 
*data)
 
seq_printf(m, RingHead :  %08x\n, head);
seq_printf(m, RingTail :  %08x\n, tail);
-   seq_printf(m, RingSize :  %08lx\n, dev_priv-ring.Size);
+   seq_printf(m, RingSize :  %08lx\n, dev_priv-render_ring.size);
seq_printf(m, Acthd : %08x\n, I915_READ(IS_I965G(dev) ? 
ACTHD_I965 : ACTHD));
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 851a2f8..1e52693 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -40,84 +40,6 @@
 #include linux/vga_switcheroo.h
 #include linux/slab.h
 
-/* Really want an OS-independent resettable timer.  Would like to have
- * this loop run for (eg) 3 sec, but have the timer reset every time
- * the head pointer changes, so that EBUSY only happens if the ring
- * actually stalls for (eg) 3 seconds.
- */
-int i915_wait_ring(struct drm_device * dev, int n, const char *caller)
-{
-   drm_i915_private_t *dev_priv = dev-dev_private;
-   drm_i915_ring_buffer_t *ring = (dev_priv-ring);
-   u32 acthd_reg = IS_I965G(dev) ? ACTHD_I965 : ACTHD;
-   u32 last_acthd = I915_READ(acthd_reg);
-   u32 acthd;
-   u32 last_head = I915_READ(PRB0_HEAD)  HEAD_ADDR;
-   int i;
-
-   trace_i915_ring_wait_begin (dev);
-
-   for (i = 0; i  10; i++) {
-   ring-head = I915_READ(PRB0_HEAD)  HEAD_ADDR;
-   acthd = I915_READ(acthd_reg);
-   ring-space = ring-head - (ring-tail + 8);
-   if (ring-space  0)
-   ring-space += ring-Size;
-   if (ring-space = n) {
-   trace_i915_ring_wait_end (dev);
-   return 0;
-   }
-
-   if (dev-primary-master) {
-   struct drm_i915_master_private *master_priv = 
dev-primary-master-driver_priv;
-   if (master_priv-sarea_priv)
-   master_priv-sarea_priv-perf_boxes |= 
I915_BOX_WAIT;
-   }
-
-
-   if (ring-head != last_head)
-   i = 0;
-   if (acthd != last_acthd)
-   i = 0;
-
-   last_head = ring-head;
-   last_acthd = acthd;
-   msleep_interruptible(10);
-
-   }
-
-   trace_i915_ring_wait_end (dev);
-   return -EBUSY;
-}
-
-/* As a ringbuffer is only allowed to wrap between instructions, fill
- * the tail with NOOPs.
- */
-int i915_wrap_ring(struct drm_device *dev)
-{
-   drm_i915_private_t *dev_priv = dev-dev_private;
-   

Re: [Intel-gfx] [PATCH 3/9] drm: kill drm_mm_node-private

2010-05-19 Thread Daniel Vetter
On Wed, May 19, 2010 at 11:25:07AM +0200, Jerome Glisse wrote:
 On Tue, May 18, 2010 at 11:11:45PM +0200, Daniel Vetter wrote:
  Only ever assigned, never used.
  
  Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 
 NAK
 
 private was to be use when doing range restricted allocation
 somehow the patch that use it was drop/forgot/lost along the
 was i will try to see i have it and redo it if not.

I don't agree for the following reasons:
1) drm_mm _does_ implement range-restricted allocations. And it does not
   use the private pointer to do so. My new scanning algorithm doesn't
   implement this (i915 doesn't use range restricted allocations), but
   it's damn trivial to add.
2) The private pointer was used as a back-pointer to the object. If
   something like this is needed, making struct drm_mm_node embedable
   looks like the right approach (perhaps with some driver-private
   bitfields to distinguish different case). I'm still in the process of
   shooting down the driver_private gem_object pointer and I don't like
   doing this right away again ...

Can you please point me to the code that needs this private pointer? Then I
can see in which way I'm wrong ... ;)

 Cheers,
 Jerome

Cheers, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/3] implement multiple ring buffer V1

2010-05-19 Thread Daniel Vetter
On Wed, May 19, 2010 at 05:33:28PM +0800, Zou Nan hai wrote:
  The patch series try to abstruct ring buffer
  structure, implement BSD (bit stream decoder) ring
  buffer for H.264/VC1 VLD decoding.
 
  I mark this as V1 for review.

Woot, we're (slowly) getting there. I'll do an in-depth review some when
later this week (running short on free time, atm). On a quick look, the
series has definitely improved.

One nitpick up-front: You haven't sent the individual patches as follow-ups
to the introduction email. Just use git format-patch and git send-email,
they'll take care of all the boring details for you. Quick primer, if
you're totally lazy:

$ git format-patch --cover-letter -o output_directory revision_range

creates a bunch of .patch files including introduction letter with the
commits int the given range. You only have to change the subject  text of
the cover letter. Then

$ git send-email --to 'b...@bla.com' --cc 'f...@bar.com' *.patch [--dry-run]

sends them away. Use --dry-run to check first that git does the right
thing with your patches.

Yours, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/4] introduce intel_ring_buffer structure

2010-05-19 Thread Simon Farnsworth
On Wednesday 19 May 2010, Eric Anholt eric.anh...@intel.com wrote:
 On Wed, 19 May 2010 10:00:10 +0100, Simon Farnsworth 
simon.farnswo...@onelan.com wrote:
  Bear in mind that I don't have to use VAAPI to call your ioctl; I can
  write evil code that calls it directly. On the other hand, you can't
  remove the ioctl later - users will want to use older VAAPI versions
  with new kernels, so that they can upgrade without too much fear of
  regression.
 
 OK, now you're going over the top.  The GPU can't schedule[1], is turing
 complete, and you hand it programs.  You can hang it with or without his
 interface.  If you don't want to be able to do that, then don't let
 non-root use the DRI.
 
Fair enough, and my apologies for going overboard - my intended point is that 
just because VAAPI is the only planned user of this ioctl doesn't mean that 
it's OK to get the userspace/kernel interface wrong.
-- 
Simon Farnsworth
Software Engineer
ONELAN Limited
http://www.onelan.com/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx