On Saturday 09 September 2006 09:47, Attila Kinali wrote:
> On Sat, 09 Sep 2006 09:18:34 +0200
>
> Raphael Jacquot <[EMAIL PROTECTED]> wrote:
> > Attila Kinali wrote:
> > > And I wouldn't impose that onto the CPU. The Matrox G200 used
> > > an interleaved YUV format, which meant that the player software
> > > had to first convert the planar data into an interleaved format
> > > and then shovel it over to the card. In the case of MPlayer this
> > > made a 10-20% performance loss for the whole player (comparing
> > > the G200 to the G400), which means that for the transfer the
> > > loss must have been somewhere between 50% and 80%.
> >
> > imho, the thing should accept both planar and interleaved;
> > professional systems use interleaved, as they work from streams
>
> Anyways, for performance reasons (in video applications)
> it's important that we can handle planar data in hardware.

Then we will have to do the rearranging in video memory, which means we 
do get YUV data on the card (but if we're clever the card doesn't need 
to know). And the YUV->RGB conversion will need to be somewhere where 
it can work on YUV data in video memory, not tacked onto the DMA 
engine.

A VGA card is really just a frame buffer with (amongst others) hardware 
conversion for certain kinds of image formats. YUV is also an image 
format, and we also want it converted in hardware.
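For reference, the conversion we'd want the hardware to perform is just a small matrix multiply per pixel. A sketch in C, assuming BT.601 studio-swing coefficients in 8.8 fixed point (the function and coefficients here are my illustration, not anything the card currently does):

```c
#include <stdint.h>

/* Clamp an intermediate result into the 0..255 range. */
static uint8_t clamp8(int v) { return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v; }

/* Hypothetical YUV->RGB conversion using BT.601 studio-swing
 * coefficients scaled by 256 (8.8 fixed point). */
static void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v,
                       uint8_t *r, uint8_t *g, uint8_t *b)
{
    int c = ((int)y - 16) * 298;  /* 1.164 * 256 */
    int d = (int)u - 128;
    int e = (int)v - 128;
    *r = clamp8((c           + 409 * e + 128) >> 8);  /* + 1.596 V        */
    *g = clamp8((c - 100 * d - 208 * e + 128) >> 8);  /* - 0.391 U - 0.813 V */
    *b = clamp8((c + 516 * d           + 128) >> 8);  /* + 2.018 U        */
}
```

Per pixel that's three multiply-accumulate chains plus a clamp, which is exactly the kind of thing that's cheap in hardware and annoying on the CPU.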

We already have some special hardware that implements the VGA image 
format conversions: it converts from the VGA character/attribute format 
into RGBA. We've called it the VGA nanocontroller, but perhaps it 
should be called the format conversion processor (FCP). If YUV is just 
another image format that needs to be converted, then it seems logical 
that the FCP should do that.

And it's practical as well. Nobody expects to do video overlays in VGA 
text mode, so there's no problem in reusing the hardware. The address 
scanning logic is already there. Fetching four bytes instead of two 
shouldn't be a problem (actually, isn't the memory controller 32-bit 
anyway?), and in fact, given the geometry of VGA text modes (no odd 
widths) the FCP could always fetch four bytes at a time, and either 
interpret them as a single YUVA pixel or as two character/attribute 
pairs. In the case of YUVA it could overwrite the original data, but that's
just a matter of programming the output address correctly.
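To spell out the dual interpretation of a 32-bit fetch: the same four bytes are either one packed YUVA pixel or two character/attribute cells, depending on mode. A sketch in C (the type names and byte layout are my assumption, not an established register format):

```c
#include <stdint.h>

/* Assumed layouts for the two interpretations of a 32-bit FCP fetch. */
typedef struct { uint8_t y, u, v, a; } yuva_pixel;   /* video mode     */
typedef struct { uint8_t chr, attr; } vga_cell;      /* text mode cell */

/* One 32-bit fetch, viewed either way. */
typedef union {
    uint32_t   raw;
    yuva_pixel pixel;     /* one packed YUVA pixel            */
    vga_cell   cells[2];  /* two character/attribute pairs    */
} fcp_fetch;
```

The point is that the fetch width and the addressing pattern are identical; only the decode step downstream differs.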

There is the question of speed. We don't care much about the text mode, 
but YUV conversion needs to be quick enough to keep up with video. Am I 
correct in assuming that it's the character/attribute->RGBA conversion 
that's slowing down the VGA nanocontroller, and that the addressing 
logic could keep up if we wanted it to?


As for the planar to packed conversion, does the DMA engine support 
8-bit grayscale? Does it automatically expand that to RGBA upon 
transfer from host memory? And can the texture engine mask some of the 
channels when it writes to a surface?

If so, we can upload each of the planes to the card as an 8-bit 
grayscale surface. So source planes

Y0 Y1 Y2 Y3
Y4 Y5 Y6 Y7
Y8 Y9 Ya Yb
Yc Yd Ye Yf

U0 U1
U2 U3

V0 V1
V2 V3

get uploaded as

r  g  b  a  r  g  b  a  r  g  b  a  r  g  b  a
Y0 Y0 Y0 ** Y1 Y1 Y1 ** Y2 Y2 Y2 ** Y3 Y3 Y3 **
Y4 Y4 Y4 ** Y5 Y5 Y5 ** Y6 Y6 Y6 ** Y7 Y7 Y7 **
Y8 Y8 Y8 ** Y9 Y9 Y9 ** Ya Ya Ya ** Yb Yb Yb **
Yc Yc Yc ** Yd Yd Yd ** Ye Ye Ye ** Yf Yf Yf **

U0 U0 U0 ** U1 U1 U1 **
U2 U2 U2 ** U3 U3 U3 **

V0 V0 V0 ** V1 V1 V1 **
V2 V2 V2 ** V3 V3 V3 **

(asterisks are alpha values, which I would assume are all 0xff)
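The expansion I'm hoping the DMA engine does in hardware amounts to this (a software sketch of the assumed behaviour, with the grayscale value replicated into r, g and b and alpha forced to 0xff):

```c
#include <stdint.h>
#include <stddef.h>

/* Expand an 8-bit grayscale plane of n samples into an RGBA surface:
 * each sample is replicated into r, g and b, alpha is set to 0xff. */
static void expand_gray_to_rgba(const uint8_t *plane, uint8_t *rgba, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        rgba[4 * i + 0] = plane[i];  /* r */
        rgba[4 * i + 1] = plane[i];  /* g */
        rgba[4 * i + 2] = plane[i];  /* b */
        rgba[4 * i + 3] = 0xff;      /* a */
    }
}
```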

We can then use the texture engine (which thinks it's RGBA it's working 
with, but that's fine) to render the U surface on top of the Y surface, 
masking out all but the green channel, and the same for the V surface 
masking out all but the blue channel. We'll even have interpolation of 
U and V from the texture engine if we want it. That yields a packed YUV 
surface, which can then be converted to RGB by the FCP and composed 
with everything else by the drawing engine.
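The net effect of those two masked passes, ignoring interpolation and doing nearest-neighbour chroma replication, would be something like this (a sketch assuming the 4:2:0 layout of the example above, i.e. one U/V sample per 2x2 block of Y):

```c
#include <stdint.h>
#include <stddef.h>

/* Write the U plane into the green channel and the V plane into the
 * blue channel of an already-expanded w*h Y surface (RGBA layout),
 * producing packed YUVA in the RGBA slots. Assumes 4:2:0 chroma. */
static void compose_yuv(uint8_t *surf, const uint8_t *u, const uint8_t *v,
                        size_t w, size_t h)
{
    for (size_t row = 0; row < h; row++)
        for (size_t col = 0; col < w; col++) {
            size_t c = (row / 2) * (w / 2) + (col / 2);  /* chroma index */
            surf[4 * (row * w + col) + 1] = u[c];  /* green channel <- U */
            surf[4 * (row * w + col) + 2] = v[c];  /* blue channel  <- V */
        }
}
```

In the real thing the texture engine's bilinear filtering would replace the nearest-neighbour indexing, which is where the free chroma interpolation comes from.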

Again, the question is whether it's fast enough.


Incidentally, being able to mask channels in the texture engine would be 
nice for another reason. Take a high-res coverage bitmap and render it 
onto the frame buffer three times, each time with a slightly different 
horizontal shift and a different channel enabled. Hardware-accelerated 
subpixel font rendering, anyone?
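In software terms the three masked passes would look roughly like this (a sketch assuming an RGB-striped panel and a one-byte-per-pixel coverage glyph; the function and its parameters are my illustration):

```c
#include <stdint.h>
#include <stddef.h>

/* Blit a gw*gh coverage glyph into an RGBA frame buffer of width fb_w
 * three times: shifts -1, 0, +1 pixels with only the R, G or B channel
 * enabled respectively. Assumes an RGB-striped LCD subpixel layout. */
static void subpixel_blit(uint8_t *fb, size_t fb_w,
                          const uint8_t *glyph, size_t gw, size_t gh,
                          size_t x, size_t y)
{
    for (int chan = 0; chan < 3; chan++) {       /* 0 = R, 1 = G, 2 = B */
        int shift = chan - 1;                    /* -1, 0, +1 pixels    */
        for (size_t r = 0; r < gh; r++)
            for (size_t c = 0; c < gw; c++) {
                long dst = (long)(x + c) + shift;
                if (dst < 0 || (size_t)dst >= fb_w)
                    continue;                    /* clip at the edges   */
                fb[4 * ((y + r) * fb_w + (size_t)dst) + chan] =
                    glyph[r * gw + c];
            }
    }
}
```

On the card, each pass would just be one channel-masked render of the same texture at a slightly different x offset.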

Lourens

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics