On Saturday 09 September 2006 09:47, Attila Kinali wrote:
> On Sat, 09 Sep 2006 09:18:34 +0200
> Raphael Jacquot <[EMAIL PROTECTED]> wrote:
> > Attila Kinali wrote:
> > > And I wouldn't impose that onto the CPU. The Matrox G200 used
> > > an interleaved YUV format, which meant that the player software
> > > had to first convert the planar data into an interleaved format
> > > and then shovel it over to the card. In the case of MPlayer this
> > > made a 10-20% performance loss for the whole player (comparing
> > > the G200 to the G400), which means that for the transfer alone the
> > > loss must have been somewhere between 50% and 80%.
> >
> > IMHO, the thing should accept both planar and interleaved;
> > professional systems use interleaved, as they work from streams.
>
> Anyway, for performance reasons (in video applications)
> it's important that we can handle planar data in hardware.
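The repacking the quoted messages complain about — converting planar YUV into an interleaved format on the CPU before handing it to the card — looks roughly like this. (A minimal sketch for 4:2:2 data; the YUYV byte order and the function name are my own, not anything the G200 mandates.)

```c
#include <stdint.h>
#include <stddef.h>

/* Pack planar 4:2:2 YUV (separate Y, U and V planes) into interleaved
 * YUYV: Y0 U0 Y1 V0 | Y2 U1 Y3 V1 | ...  Each U/V sample is shared by
 * a pair of horizontally adjacent Y samples, so we walk pixel pairs. */
static void yuv422p_to_yuyv(const uint8_t *y, const uint8_t *u,
                            const uint8_t *v, uint8_t *out,
                            size_t pixel_pairs)
{
    for (size_t i = 0; i < pixel_pairs; i++) {
        out[4 * i + 0] = y[2 * i + 0];
        out[4 * i + 1] = u[i];
        out[4 * i + 2] = y[2 * i + 1];
        out[4 * i + 3] = v[i];
    }
}
```

Every byte of video gets touched (read, shuffled, written) by the CPU once more than necessary, which is where the measured 10-20% whole-player loss comes from.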
Then we will have to do the rearranging in video memory, which means we do get YUV data on the card (but if we're clever, the card doesn't need to know). The YUV->RGB conversion will then need to be somewhere it can work on YUV data in video memory, not tacked onto the DMA engine.

A VGA card is really just a frame buffer with (among other things) hardware conversion for certain kinds of image formats. YUV is also an image format, and we also want it converted in hardware. We already have some special hardware that implements the VGA image format conversions: it converts from the VGA character/attribute format into RGBA. We've called it the VGA nanocontroller, but perhaps it should be called the format conversion processor (FCP). If YUV is just another image format that needs to be converted, then it seems logical that the FCP should do that.

And it's practical as well. Nobody expects to do video overlays in VGA text mode, so there's no problem in reusing the hardware. The address scanning logic is already there. Fetching four bytes instead of two shouldn't be a problem (actually, isn't the memory controller 32-bit anyway?), and in fact, given the geometry of VGA text modes (no odd widths), the FCP could always fetch four bytes at a time and either interpret them as a single YUVA pixel or as two character/attribute pairs. In the YUVA case it could overwrite the original data, but that's just a matter of programming the output address correctly.

There is the question of speed. We don't care much about text mode, but YUV conversion needs to be quick enough to keep up with video. Am I correct in assuming that it's the character/attribute->RGBA conversion that's slowing down the VGA nanocontroller, and that the addressing logic could keep up if we wanted it to?

As for the planar-to-packed conversion: does the DMA engine support 8-bit grayscale? Does it automatically expand that to RGBA upon transfer from host memory?
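For reference, the per-pixel operation the FCP would have to implement is the usual YUV->RGB matrix. A software sketch, using the common BT.601 fixed-point coefficients (the coefficients are a standard choice, not something the hardware spec fixes):

```c
#include <stdint.h>

static uint8_t clamp_u8(int x)
{
    return x < 0 ? 0 : x > 255 ? 255 : (uint8_t)x;
}

/* Convert one YUVA pixel to RGBA using BT.601 fixed-point arithmetic
 * (8 fractional bits); the alpha byte passes through unchanged. */
static void yuva_to_rgba(const uint8_t yuva[4], uint8_t rgba[4])
{
    int c = yuva[0] - 16;    /* luma, offset-binary */
    int d = yuva[1] - 128;   /* Cb */
    int e = yuva[2] - 128;   /* Cr */
    rgba[0] = clamp_u8((298 * c           + 409 * e + 128) >> 8);
    rgba[1] = clamp_u8((298 * c - 100 * d - 208 * e + 128) >> 8);
    rgba[2] = clamp_u8((298 * c + 516 * d           + 128) >> 8);
    rgba[3] = yuva[3];
}
```

In hardware this is three multiply-accumulate chains plus clamping per pixel, which is a good yardstick for the speed question above.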
And can the texture engine mask some of the channels when it writes to a surface? If so, we can upload each of the planes to the card as an 8-bit grayscale surface. So source planes

    Y0 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Ya Yb Yc Yd Ye Yf
    U0 U1 U2 U3
    V0 V1 V2 V3

get uploaded as

    r  g  b  a    r  g  b  a    r  g  b  a    r  g  b  a
    Y0 Y0 Y0 **   Y1 Y1 Y1 **   Y2 Y2 Y2 **   Y3 Y3 Y3 **
    Y4 Y4 Y4 **   Y5 Y5 Y5 **   Y6 Y6 Y6 **   Y7 Y7 Y7 **
    Y8 Y8 Y8 **   Y9 Y9 Y9 **   Ya Ya Ya **   Yb Yb Yb **
    Yc Yc Yc **   Yd Yd Yd **   Ye Ye Ye **   Yf Yf Yf **
    U0 U0 U0 **   U1 U1 U1 **   U2 U2 U2 **   U3 U3 U3 **
    V0 V0 V0 **   V1 V1 V1 **   V2 V2 V2 **   V3 V3 V3 **

(asterisks are alpha values, which I would assume are all 0xff)

We can then use the texture engine (which thinks it's RGBA it's working with, but that's fine) to render the U surface on top of the Y surface, masking out all but the green channel, and do the same for the V surface, masking out all but the blue channel. We'll even get interpolation of U and V from the texture engine if we want it. That yields a packed YUV surface, which can then be converted to RGB by the FCP and composed with everything else by the drawing engine. Again, the question is whether it's fast enough.

Incidentally, being able to mask channels in the texture engine would be nice for another reason. Take a high-resolution bitmap and render it onto the frame buffer three times, each time with a slightly different horizontal shift and with a different channel enabled. Hardware-accelerated subpixel font rendering, anyone?

Lourens
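The grayscale-upload-plus-masked-blit scheme above can be modelled in software to check the bookkeeping. This sketch assumes 4:4:4 planes (real 4:2:0 would rely on the texture engine's scaling/interpolation, as noted above); the mask constants and function names are illustrative, not hardware registers:

```c
#include <stdint.h>
#include <stddef.h>

enum { MASK_R = 1, MASK_G = 2, MASK_B = 4 };

/* Expand an 8-bit grayscale plane into an RGBA surface, as the DMA
 * engine is assumed to do on upload: r = g = b = sample, a = 0xff. */
static void expand_gray(const uint8_t *plane, uint8_t *surf, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        surf[4 * i + 0] = surf[4 * i + 1] = surf[4 * i + 2] = plane[i];
        surf[4 * i + 3] = 0xff;
    }
}

/* Blit src over dst, writing only the channels enabled in mask --
 * the per-channel write mask we'd want from the texture engine. */
static void masked_blit(uint8_t *dst, const uint8_t *src,
                        size_t pixels, unsigned mask)
{
    for (size_t i = 0; i < pixels; i++) {
        if (mask & MASK_R) dst[4 * i + 0] = src[4 * i + 0];
        if (mask & MASK_G) dst[4 * i + 1] = src[4 * i + 1];
        if (mask & MASK_B) dst[4 * i + 2] = src[4 * i + 2];
    }
}
```

Expanding Y into the destination, then blitting the expanded U surface with MASK_G and the expanded V surface with MASK_B leaves a packed surface with Y in the red byte, U in green and V in blue — exactly the YUVA layout the FCP would then convert.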
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
