On 9/9/06, Attila Kinali <[EMAIL PROTECTED]> wrote:
> I wouldn't do that. It really costs a lot of performance to do this on the CPU.
> IMHO the easiest way to implement the YUV->RGB conversion, including upsampling,
> is to store all 3 planes in video memory, then use counters with programmable
> dividers for the addressing of the image data. This would need three 11- or
> 12-bit counters (plus-one adder + feedback register) for the addressing and two
> small, maybe just 2-bit, counters for the dividers. With that you can easily
> feed a YUV->RGB converter with the data it needs and then either pass the
> result directly into some processing block or put it back into memory. The only
> problem with that approach is that each pixel needs 3 read accesses (and one
> write if it's stored back), so there needs to be some form of cache to allow
> the RAM to work in burst mode.
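For reference, here is a software analogue of the counter-and-divider addressing scheme described above: the luma plane is indexed at full rate while each chroma coordinate goes through a divide-by-2 "programmable divider" (4:2:0 subsampling). This is only a sketch; the planar layout and the fixed-point BT.601 coefficients are my assumptions, not part of the proposal.

```c
#include <stdint.h>

/* Clamp an intermediate value into the 0..255 byte range. */
static inline uint8_t clamp8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Convert planar YUV 4:2:0 to packed RGB. For each output pixel there
 * are three read accesses (Y, U, V), as noted above; the U/V indices
 * are the "divided" versions of the Y index. */
void yuv420_to_rgb(const uint8_t *yp, const uint8_t *up, const uint8_t *vp,
                   uint8_t *rgb, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int Y = yp[y * w + x];
            int U = up[(y >> 1) * (w >> 1) + (x >> 1)] - 128;
            int V = vp[(y >> 1) * (w >> 1) + (x >> 1)] - 128;

            /* Fixed-point BT.601 conversion, coefficients scaled by 256. */
            int c = (Y - 16) * 298;
            uint8_t *px = rgb + 3 * (y * w + x);
            px[0] = clamp8((c + 409 * V + 128) >> 8);           /* R */
            px[1] = clamp8((c - 100 * U - 208 * V + 128) >> 8); /* G */
            px[2] = clamp8((c + 516 * U + 128) >> 8);           /* B */
        }
    }
}
```

In hardware the two loop indices become the free-running counters, and the `>> 1` shifts are the divider outputs feeding the chroma address generators.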
Doing this read/modify/write to RAM is vastly more complicated than you seem to think. Reading memory is not simple: it requires a request FIFO, a response FIFO, and additional logic in the hub for the memory controller. Counting addresses is nothing by comparison.

If it's absolutely necessary to do it this way, I suggest we feed the data through the drawing engine and use the blender unit to handle this case. I was already planning to add this option. It has the advantage of being able to apply alpha blending, ROPs, and planemasks directly to image uploads at zero cost. Essentially, a graphics memory write can be diverted through the drawing engine, where it becomes a fragment and can therefore have any drawing-engine feature applied to it. (There are limits for cases where the fragment has an address but not coordinates.) The drawback is that writes have a huge latency: if you ever want to read the word back, you have to know what you did and flush the engine pipeline before you try to read it.

We can move the YUV/RGB logic into the engine, where we send YUYV for one scanline, then change modes (just drop a configuration write down the pipeline) and provide an offset, having the GPU read the YYYY data from an appropriate offset back from where we're writing. You could alternate YUYV and YYYY, or you could do all the YUYV at once and then interleave the YYYY. In this case, since we're storing as RGB, we'd have to sneak U and V into the alpha-channel bits of the image being uploaded, so alternating pixels would be stored as URGB and then VRGB. If you want to apply an alpha blend to the video data when it's composited onto the screen, you can provide it as a constant in the texture unit.

A point not to be lost here is that we're not wasting any memory bandwidth by storing unnecessary YUV in the framebuffer. It's an on-the-fly conversion, and it's one-way.
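To make the URGB/VRGB idea concrete, here is a hypothetical sketch of the packing: a pair of horizontally adjacent RGB pixels shares one U and one V sample (4:2:2), so the first pixel of each pair carries U in its alpha byte and the second carries V. The 32-bit ARGB word layout and the function names are my assumptions, not the actual engine interface.

```c
#include <stdint.h>

/* Pack one byte into each lane of a 32-bit ARGB word. */
static inline uint32_t pack_argb(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}

/* Pack one scanline of RGB pixels plus 4:2:2 chroma into alternating
 * URGB/VRGB words, smuggling U and V through the alpha bits. w must
 * be even. */
void pack_urgb_vrgb(const uint8_t *rgb, const uint8_t *u, const uint8_t *v,
                    uint32_t *out, int w)
{
    for (int x = 0; x < w; x += 2) {
        const uint8_t *p0 = rgb + 3 * x;
        const uint8_t *p1 = rgb + 3 * (x + 1);
        out[x]     = pack_argb(u[x / 2], p0[0], p0[1], p0[2]); /* URGB */
        out[x + 1] = pack_argb(v[x / 2], p1[0], p1[1], p1[2]); /* VRGB */
    }
}
```

When the scanline is later composited, the blender can ignore (or replace) these alpha bits and take its blend factor from the texture-unit constant instead, as described above.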
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
