Hello,

Here's a quick overview of the problem I'm experiencing. Since a lot
of what is involved is outside my area of expertise, I would highly
appreciate any feedback.

I'm using Mesa to display video. Due to the lack of PBO support, the
best performance can only be achieved with glXAllocateMemoryMESA +
YCBCR. Yesterday I augmented the driver for my video4linux2 compatible
capture board with USERPTR support, i.e. I can supply an arbitrary
user space pointer and the driver will create kernel pages to map this
area, pin down the pages in memory and put the video there without
much overhead.

This works quite nicely if I do something like (rough pseudo-code):

agpmem = glXAllocateMemoryMESA (...);
data = malloc (frame_size * num_buffers);
queue_buffers (data, num_buffers);
for (;;) {
      buffer = dequeue_buffer ();
      copy (agpmem, buffer.data, frame_size);
      glTexImage2D (..., agpmem);
      draw_quad ();
      queue_buffer (buffer);
}

All the constraints for fast-pathed texture uploading are satisfied
(alignment/format/types, output of R200_DEBUG=tex also agrees).

What I get is a tolerable CPU load when streaming fairly large video
frames (4CIF/PAL - 704x576, YUV at 16 bits per pixel on average).
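To put numbers on this, the sketch below works out the size of one such
frame and the resulting data rate. The 25 frames per second figure is an
assumption (the nominal PAL rate), and the function names are
illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one frame; bits_per_pixel must be a whole number of bytes. */
static size_t frame_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel % 8 == 0);
    return width * height * (bits_per_pixel / 8);
}

/* Bytes per second that have to be moved for a given frame rate. */
static size_t stream_bytes_per_second(size_t frame_size, size_t fps)
{
    return frame_size * fps;
}
```

For 704x576 at 16 bits per pixel this gives 811008 bytes per frame, or
20275200 bytes per second at the assumed 25 fps. The extra copy in the
loop above reads and writes that amount once more.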

However, there is a redundant copy there; it should theoretically be
possible to get the driver to put the video directly into `agpmem'.
And indeed it works and performance is great, but there's a big catch:
the video is distorted. There are horizontal lines producing some kind
of weird feedback effect around moving objects. No amount of tweaking
the order of operations, adding glFlush/glFinish or other hackery
removes the artifacts completely. It also appears that the artifacts
are already there before Mesa has a chance to touch the memory
(observed by writing out the contents of agpmem to a file before
feeding it to glTexImage2D).
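The check mentioned above, writing the captured frame to a file before
the texture upload, can be done with something as simple as the
following sketch. The helper name `dump_frame' and the file path are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `size` bytes starting at `data` to `path`.
 * Returns 0 on success, -1 on any failure. */
static int dump_frame(const char *path, const void *data, size_t size)
{
    assert(path != NULL && data != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);
    return (written == size && close_failed == 0) ? 0 : -1;
}
```

Called as dump_frame ("frame.yuv", agpmem, frame_size) right after
dequeue_buffer (), this captures what the grabber driver wrote, before
any GL call has touched the memory.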

The grabber board is just an ordinary USB device and frames are
captured via isochronous transfers (so there are a lot of small data
movements spread over time instead of one large burst).

The driver does nothing special for USERPTR grabbing; in other words,
there's exactly one code path for any memory, be it AGP or not. The
kernel's `get_user_pages' function is called on the user-supplied
pointer and then this area is populated with memcpy from the
isochronous IRQ handler.

The performance boost this `zero-copy' approach provides is not easy
to give up (especially on this not so fast machine).

$ uname -a
Linux linmac 2.6.20.1-exp #4 Wed Mar 7 03:49:49 MSK 2007 ppc GNU/Linux

$ glxinfo | sed -n -e 1,4p -e /renderer/p
Mesa: Mesa 6.5.2 DEBUG build Feb 28 2007 14:49:17
Mesa warning: couldn't open libtxc_dxtn.so, software DXTn compression/decompression unavailable
name of display: :0.0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: SGI
OpenGL renderer string: Mesa DRI R200 20060602 AGP 1x PowerPC/Altivec TCL

Once again, any help will be highly appreciated.

-- 
vale
