Re: R200 ReadPixels optimization

2004-10-18 Thread Dieter Nützel
On Tuesday, 12 October 2004 at 20:24, Ian Romanick wrote: Dieter Nützel wrote: NONE of your three versions gave me direct rendering?! I've tested with and without your TLS-patch (progress?). The symbols are in. DRI-Mesa/Patches nm /usr/X11R6-NO-TLS/lib/modules/dri/r200_dri.so | grep

Re: R200 ReadPixels optimization

2004-10-12 Thread Ian Romanick
Dieter Nützel wrote: NONE of your three versions gave me direct rendering?! I've tested with and without your TLS-patch (progress?). The symbols are in. DRI-Mesa/Patches nm /usr/X11R6-NO-TLS/lib/modules/dri/r200_dri.so | grep r200ReadRGBASpan_ARGB 00175714 t r200ReadRGBASpan_ARGB 00175be4

Re: R200 ReadPixels optimization

2004-10-09 Thread Dieter Nützel
On Saturday, 9 October 2004 at 03:33, Ian Romanick wrote: Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core

Re: R200 ReadPixels optimization

2004-10-08 Thread Ian Romanick
Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core MMX SSE2 routines can be used for other cards as well. For

Re: R200 ReadPixels optimization

2004-10-08 Thread Ian Romanick
Marcello Maggioni wrote: I experience a great slowdown in using this patch. [EMAIL PROTECTED]:~/driconf-0.2.2$ glxgears Mesa: software DXTn compression/decompression available Using MMX version of ReadRGBASpan 27 frames in 5.1 seconds = 5.320 FPS 25 frames in 5.0 seconds = 4.982 FPS [EMAIL

Re: R200 ReadPixels optimization

2004-10-08 Thread Marcello Maggioni
On Fri, 08 Oct 2004 15:12:51 -0700, Ian Romanick [EMAIL PROTECTED] wrote: Marcello Maggioni wrote: I experience a great slowdown in using this patch. [EMAIL PROTECTED]:~/driconf-0.2.2$ glxgears Mesa: software DXTn compression/decompression available Using MMX version of ReadRGBASpan

Re: R200 ReadPixels optimization

2004-10-08 Thread Ian Romanick
Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core MMX SSE2 routines can be used for other cards as well. For

Re: R200 ReadPixels optimization

2004-10-07 Thread Keith Whitwell
Alan Cox wrote: On Wed, 2004-10-06 at 22:02, Ian Romanick wrote: Here's my question. Is there any way to trick it into doing back-to-back reads as a single PCI transfer? So, if I did something like: Not that anyone has found. I'm not sure PCI even really allows it except for prefetchable

Re: R200 ReadPixels optimization

2004-10-07 Thread Alan Cox
Note that there's some code in there already which uses the blitter to copy from framebuffer to agp memory, though it tries to implement the entire readpixels() operation rather than being a useful low-level operation. AGP memory is hostside uncached (CPU limitations on x86 for one) which

Re: R200 ReadPixels optimization

2004-10-07 Thread Ville Syrjälä
On Thu, Oct 07, 2004 at 02:02:38PM +0100, Alan Cox wrote: Note that there's some code in there already which uses the blitter to copy from framebuffer to agp memory, though it tries to implement the entire readpixels() operation rather than being a useful low-level operation. AGP memory

Re: R200 ReadPixels optimization / AGP

2004-10-07 Thread Alan Cox
On Thu, 2004-10-07 at 15:40, Ville Syrjälä wrote: Why can't we make AGP memory cached? Wouldn't it be enough to flush the caches at some critical points? Possibly, although it is not trivial to see how we get that right, especially with the 4Mb kernel maps. The x86 processor cannot handle a page

Re: R200 ReadPixels optimization / AGP

2004-10-07 Thread Keith Whitwell
Alan Cox wrote: On Thu, 2004-10-07 at 15:40, Ville Syrjälä wrote: Why can't we make AGP memory cached? Wouldn't it be enough to flush the caches at some critical points? Possibly, although it is not trivial to see how we get that right, especially with the 4Mb kernel maps. The x86 processor cannot

Re: R200 ReadPixels optimization

2004-10-07 Thread Vladimir Dergachev
On Wed, 6 Oct 2004, Eric Anholt wrote: On Wed, 2004-10-06 at 09:33, Vladimir Dergachev wrote: On Wed, 6 Oct 2004, Dieter Nützel wrote: On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels

Re: R200 ReadPixels optimization

2004-10-07 Thread Mike Mestnik
--- Ville Syrjälä [EMAIL PROTECTED] wrote: On Thu, Oct 07, 2004 at 02:02:38PM +0100, Alan Cox wrote: Note that there's some code in there already which uses the blitter to copy from framebuffer to agp memory, though it tries to implement the entire readpixels() operation rather

Re: R200 ReadPixels optimization

2004-10-06 Thread Dieter Nützel
On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core MMX SSE2 routines can be

Re: R200 ReadPixels optimization

2004-10-06 Thread Vladimir Dergachev
On Wed, 6 Oct 2004, Dieter Nützel wrote: On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The

Re: R200 ReadPixels optimization

2004-10-06 Thread Ian Romanick
Vladimir Dergachev wrote: On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core MMX

Re: R200 ReadPixels optimization

2004-10-06 Thread Eric Anholt
On Wed, 2004-10-06 at 09:33, Vladimir Dergachev wrote: On Wed, 6 Oct 2004, Dieter Nützel wrote: On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured

Re: R200 ReadPixels optimization

2004-10-06 Thread Alan Cox
On Wed, 2004-10-06 at 16:56, Dieter Nützel wrote: What about MMX2, 3DNow, 3DNow2 (pro), SSE (1)? It would be nice if we have this like MPlayer: Soreen wrote a set of routines for this that are in Xorg 6.8.* and optimise the readback of video memory for render operations - naturally enough they

Re: R200 ReadPixels optimization

2004-10-06 Thread Ian Romanick
Alan Cox wrote: On Wed, 2004-10-06 at 16:56, Dieter Nützel wrote: What about MMX2, 3DNow, 3DNow2 (pro), SSE (1)? It would be nice if we have this like MPlayer: Soreen wrote a set of routines for this that are in Xorg 6.8.* and optimise the readback of video memory for render operations - naturally

Re: R200 ReadPixels optimization

2004-10-06 Thread Alan Cox
On Wed, 2004-10-06 at 19:36, Ian Romanick wrote: from video RAM to system RAM. It has to convert the pixel data from its native, on-card format to RGBA. In the case of my patch, it converts from BGRA to RGBA while doing the copy. That's why it needs the SSE2 shift instructions. From
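
The BGRA-to-RGBA conversion described in the message above can be illustrated with a scalar sketch. This is not the code from the patch, which the thread says uses MMX and SSE2; the function names `swap_red_blue` and `convert_span_bgra_to_rgba` are hypothetical, and the sketch assumes a little-endian host (as on x86) and 32-bit pixels. It only shows why shifts are involved: red and blue trade places while green and alpha stay put.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch.
 * On a little-endian host, BGRA bytes in memory load as the word 0xAARRGGBB,
 * and RGBA bytes in memory load as the word 0xAABBGGRR. Converting one to
 * the other therefore means swapping the red and blue bytes. */
static uint32_t swap_red_blue(uint32_t pixel)
{
    return (pixel & 0xFF00FF00u)           /* alpha and green stay in place */
         | ((pixel & 0x00FF0000u) >> 16)   /* red moves down to the low byte */
         | ((pixel & 0x000000FFu) << 16);  /* blue moves up to the third byte */
}

/* Hypothetical span routine: convert n pixels while copying from src to dst,
 * mirroring the "convert while doing the copy" idea from the message. */
static void convert_span_bgra_to_rgba(const uint32_t *src, uint32_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = swap_red_blue(src[i]);
}
```

A SIMD version would apply the same masks and shifts to several pixels per instruction, which is presumably where the shift instructions mentioned in the message come in.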

Re: R200 ReadPixels optimization

2004-10-06 Thread Ian Romanick
Alan Cox wrote: On Wed, 2004-10-06 at 19:36, Ian Romanick wrote: from video RAM to system RAM. It has to convert the pixel data from its native, on-card format to RGBA. In the case of my patch, it converts from BGRA to RGBA while doing the copy. That's why it needs the SSE2 shift

Re: R200 ReadPixels optimization

2004-10-06 Thread Alan Cox
On Wed, 2004-10-06 at 22:02, Ian Romanick wrote: Here's my question. Is there any way to trick it into doing back-to-back reads as a single PCI transfer? So, if I did something like: Not that anyone has found. I'm not sure PCI even really allows it except for prefetchable memory. Except of

Re: R200 ReadPixels optimization

2004-10-06 Thread Ian Romanick
Dieter Nützel wrote: On Wednesday, 6 October 2004 at 03:52, Ian Romanick wrote: Here's a simple patch that gives about a 50% (on my box) speed boost to glReadPixels performance in 24-bit. I measured using the benchmark built into progs/demos/readpix. The interesting thing is that the core MMX