On 9/9/06, Attila Kinali <[EMAIL PROTECTED]> wrote:

I wouldn't do that. It really costs a lot of performance
to do this thing on the CPU. IMHO the easiest way to implement
the YUV->RGB conversion including upsampling is to store all
3 planes in video memory, then using a counter with programmable
dividers for the addressing of the image data. This would need
three 11- to 12-bit counters (plus-one adder + feedback register)
for the addressing and two small, maybe just 2-bit, counters for
the dividers. With that you can easily feed a YUV->RGB converter
with the data it needs and then either pass it on directly into
some processing block or put it back into memory.
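The divided-counter scheme above can be modelled in software: the luma counter advances every pixel, while the chroma counters advance through a divide-by-two, which upsamples 4:2:0 chroma by replication. The sketch below is only an illustration of that addressing idea, not the actual hardware design; the plane layout and the fixed-point BT.601 conversion coefficients are assumptions.

```c
#include <stdint.h>

/* Software model of the divided-counter addressing for planar 4:2:0:
 * luma is indexed at full rate, chroma row/column indices are the
 * luma indices divided by two (the "programmable divider").
 * Layout and coefficients are illustrative assumptions. */

typedef struct {
    int width, height;        /* luma dimensions (even, for 4:2:0) */
    const uint8_t *y, *u, *v; /* plane base pointers */
} yuv420_image;

/* Fixed-point BT.601 YUV -> RGB, one common integer formulation. */
static void yuv_to_rgb(int y, int u, int v, uint8_t rgb[3])
{
    int c = y - 16, d = u - 128, e = v - 128;
    int r = (298 * c + 409 * e + 128) >> 8;
    int g = (298 * c - 100 * d - 208 * e + 128) >> 8;
    int b = (298 * c + 516 * d + 128) >> 8;
    rgb[0] = r < 0 ? 0 : r > 255 ? 255 : (uint8_t)r;
    rgb[1] = g < 0 ? 0 : g > 255 ? 255 : (uint8_t)g;
    rgb[2] = b < 0 ? 0 : b > 255 ? 255 : (uint8_t)b;
}

void convert(const yuv420_image *img, uint8_t *out /* packed RGB24 */)
{
    for (int row = 0; row < img->height; row++) {
        int crow = row >> 1;              /* divided row counter   */
        for (int col = 0; col < img->width; col++) {
            int ccol = col >> 1;          /* divided column counter */
            int yv = img->y[row * img->width + col];
            int uv = img->u[crow * (img->width / 2) + ccol];
            int vv = img->v[crow * (img->width / 2) + ccol];
            yuv_to_rgb(yv, uv, vv, &out[(row * img->width + col) * 3]);
        }
    }
}
```

In hardware the three divided indices would simply be the outputs of the three address counters; the model makes explicit that each output pixel implies three reads, which is where the burst/cache concern below comes from.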

The only problem with that approach is that each pixel needs
3 read accesses (and one write if it's stored back), so there needs
to be some form of cache to allow the RAM to work in burst mode.

Doing this read/modify/write to RAM is vastly more complicated than
you seem to think.  Reading memory is not simple.  It requires a
request fifo and a response fifo and additional logic in the hub for
the memory controller.  Counting addresses is nothing by comparison.

If it's absolutely necessary to do it this way, I suggest we feed the
data for this through the drawing engine and use the Blender unit to
handle this case.  I was already planning to add this option.

It has the advantage of being able to, for zero cost, apply alpha
blending, rops, and planemasks directly to image uploads.
Essentially, a graphics memory write can be diverted through the
drawing engine where it becomes a fragment and can therefore have any
drawing engine feature applied to it.  (There are limits for cases
where the fragment has an address but not coordinates.)  The drawback
is that writes have a huge latency: if you ever want to read a
word back, you have to know what you did and flush the engine
pipeline before you try to read it.

We can move the YUV/RGB logic into the engine where we can send YUYV
for one scanline, then change modes (just drop a configuration write
down the pipeline) and provide an offset where we provide YYYY and
have the GPU read memory from an appropriate offset back from where
we're writing.  You could alternate YUYV and YYYY, or you could do all
YUYV at once and then interleave the YYYY in there.

In this case, since we're storing as RGB, we'd have to sneak U and V
into the alpha channel bits of the image being uploaded.  So alternating
pixels would be stored as URGB and then VRGB.  If you want to apply an
alpha blend to the video data when it's being composited onto the
screen, you can provide it as a constant in the texture unit.
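To make the packing concrete, here is one way the alternating URGB/VRGB words could be laid out in 32-bit pixels, with the chroma sample riding in the alpha byte. The exact bit layout is an assumption for illustration, not a statement of the actual OGA format:

```c
#include <stdint.h>

/* Illustrative packing: alternate framebuffer words carry U or V in
 * the alpha byte of a 32-bit xRGB pixel, so an adjacent URGB/VRGB
 * pair holds both chroma samples shared by the two pixels.
 * Bit layout is an assumption, not the real register format. */

static uint32_t pack_chroma_rgb(uint8_t chroma,
                                uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)chroma << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}

/* Recover the shared chroma pair from two adjacent packed pixels. */
static void unpack_chroma_pair(uint32_t urgb, uint32_t vrgb,
                               uint8_t *u, uint8_t *v)
{
    *u = (uint8_t)(urgb >> 24);
    *v = (uint8_t)(vrgb >> 24);
}
```

Since the alpha byte is repurposed for chroma, any per-pixel alpha for compositing would have to come from elsewhere, which is why the constant in the texture unit is suggested above.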

A point not to be lost here is that we're not wasting any memory
bandwidth by storing unnecessary YUV in the framebuffer.  It's an
on-the-fly conversion, and it's one-way.
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)