Good evening,

On Thu, 19 Jul 2007 03:05:56 -0700
James Richard Tyrer <[EMAIL PROTECTED]> wrote:
> > Doesn't make sense. The video is generated at the RAMDAC, so
> > there we already need a scaled version. But as we have scalers
> > anyway for 3D, we can recycle those (just map the 2D data onto
> > a 3D object and scale it).
>
> Duh? You know, video like the output from the MPEG decoder. That needs
> YUV to RGB and then it needs to be scaled to fit whatever size window
> it is being displayed in.

Yes, but we have to integrate the video we output with the output of
everything else on the graphics card. Thus it's easier to use the frame
buffer of OGA for the output. And if we use that, we can also use the
scaler in OGA.

[YUV->RGB conversion]

> > Compared to most 3D operations it's computationally cheap:
> > just 9 multiplications and 6 additions in the general case.
>
> The general case is to multiply the RGB 3-vector by a 3x3 matrix and
> add a 3-vector constant. Yes, that is 9 multiplications, but if we use
> a 2-input adder, it is going to take more additions. The dot product
> takes 6 and you need 3 more for the constant.

Yes, that's the most general case. But we can restrict ourselves to the
one YUV->RGB conversion that is generally used and thus get down to the
7-multiplication, 7-addition formula. BTW, I always calculate with a
2-input adder, because any multi-input adder needs a different internal
structure to be efficient and has to be optimized for the specific
application.

> > And if we limit ourselves to one YUV->RGB formula, then it's 7
> > multiplications and 7 additions. Additionally, this is something
> > that can be easily pipelined, so we should be able to spit out one
> > converted sample per clock cycle per pipeline.
>
> The only question is the bandwidth of the video data. That will
> determine how parallel the operations need to be, and how much
> hardware is needed.

For the colorspace conversion, that is not a problem. We can stick it in
somewhere where we have to pass the data around anyway.
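As an illustration only (not the actual OGA datapath), the kind of restricted conversion meant here can be sketched in integer arithmetic. The coefficients below assume BT.601 limited-range input scaled by 2^8, and the exact multiply/add count depends on which products the hardware shares, so the numbers won't match the 7+7 count exactly:

```python
def clamp(x):
    """Clamp to the 8-bit output range."""
    return max(0, min(255, x))

def yuv_to_rgb(y, u, v):
    """Convert one YCbCr sample (BT.601 limited range) to RGB using
    only shifts, multiplies and adds, as a hardware pipeline would.
    Coefficients are scaled by 256, e.g. 298/256 ~= 1.164."""
    c = y - 16
    d = u - 128
    e = v - 128
    r = clamp((298 * c + 409 * e + 128) >> 8)
    g = clamp((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = clamp((298 * c + 516 * d + 128) >> 8)
    return r, g, b

# Black and white stay achromatic:
print(yuv_to_rgb(16, 128, 128))   # -> (0, 0, 0)
print(yuv_to_rgb(235, 128, 128))  # -> (255, 255, 255)
```

Since each output channel depends only on the current sample, each stage here maps directly onto one pipeline stage, which is why one converted sample per clock per pipeline is realistic.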
I.e. either when it is written to RAM, or when it is read from RAM and
transferred to the frame buffer. Thus no additional memory bandwidth
consumption is generated. What I'm more worried about is motion
compensation (lots of reads from (half) random locations) and the H.264
iDCT (which uses inter-frame prediction).

> >>> Upsampling from 4:2:2 to 4:4:4 is nothing difficult either (a
> >>> simple FIR filter operating on scanlines), but the upsampling from
> >>> 4:2:0 to 4:2:2 is (upsampling in the vertical direction; guess why
> >>> they don't do it).
> >>
> >> Perhaps it is because it isn't needed. Which decoders output 4:2:0?
> >
> > Uhmm.. MPEG-1, MPEG-2, MPEG-4, H.264,..... 4:2:0 is the de facto
> > standard for video subsampling. The only case I know that regularly
> > uses 4:2:2 is video cameras, because it is a lot easier to implement.
>
> I think that I am missing something here. Why does it matter if the
> 4:2:0 or the 4:2:2 chroma sampling method is used, since the decoder
> is going to output YUV data for each 2 pixels on each line, isn't it?
> If so, then the decoder converts it from 4:2:0 to 4:2:2.

What decoders are you talking about? Any example of a video decoder that
accepts 4:2:0 encoded video and outputs 4:2:2 decoded video? I only know
software video decoders, and those don't do a 4:2:0 to 4:2:2 up
conversion unless explicitly asked to do so.

> >>> Deinterlacing is something that cannot really be done without some
> >>> information from the video source (or some assumptions about the
> >>> video output device), and if only a simple implementation is used
> >>> (which I assume), then the quality will suck.
> >>
> >> Are we talking about motion compensating deinterlacing (for sources
> >> originally shot with an interlaced video camera and recorded on
> >> tape)?
> >
> > No, we are talking about plain deinterlacing without any motion
> > compensation. The one needed if you display interlaced content
> > produced for TV consumption on a progressive display.
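To make the FIR upsampling remark above concrete, here is a minimal sketch (hypothetical helper, 2-tap averaging filter; a real design would use a longer symmetric FIR, but the sliding-window structure is the same). Note that 4:2:0 to 4:2:2 is this same filter applied vertically, which in hardware needs line buffers rather than a few sample delays:

```python
def upsample_422_to_444(chroma):
    """Upsample one scanline of horizontally subsampled chroma (4:2:2)
    to full resolution (4:4:4). Each stored sample is kept, and the
    missing in-between sample is interpolated from its two neighbours
    (the last sample is simply repeated at the edge)."""
    out = []
    for i, c in enumerate(chroma):
        out.append(c)                          # co-sited sample
        nxt = chroma[min(i + 1, len(chroma) - 1)]
        out.append((c + nxt) // 2)             # interpolated sample
    return out

print(upsample_422_to_444([10, 20, 30]))  # -> [10, 15, 20, 25, 30, 30]
```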
>
> Plain deinterlacing is just writes to and reads from memory. I don't
> know how that could "suck".

The algorithm you apply between the read and the write.

> Or, perhaps you are referring to line doubling by vertical
> interpolation.

NO! That's the most awful way to deinterlace.

> >> Or just cine deinterlacing, where all that is needed is to
> >> rearrange the fields from the 3:2 pull-down so that you always
> >> display an odd and an even field from the same film frame together?
> >
> > That's not deinterlacing but (inverse) telecine. It has nothing to
> > do with interlacing besides the fact that it has the same combing
> > effect and results from the same assumption of an interlaced display
> > working at a specific frame rate.
>
> If you have interlaced video data, it needs to be deinterlaced (not
> upsampled to double the lines per field).

Of course. But interlacing has nothing to do with telecining.

Hmm.. after reading again what you wrote in your previous mail, I have
the impression that you have a major mix-up between telecine and
interlacing. Please read:

http://en.wikipedia.org/wiki/Telecine
http://en.wikipedia.org/wiki/Interlaced
http://www.mplayerhq.hu/DOCS/HTML/en/menc-feat-dvd-mpeg4.html#menc-feat-dvd-mpeg4-interlacing
http://www.mplayerhq.hu/DOCS/HTML/en/menc-feat-telecine.html

Short summary: Telecine is meant to change the frame rate from ~24fps to
~30fps, which is only needed to show movies on NTSC based TVs. It works
by making every 5th frame an additional one, generated from two
succeeding frames. Interlacing is needed for all TV based systems
because they operate in a (surprise!) interlaced mode, showing half
frames (fields) at double rate.

> > As such, inverse telecine is quite simple to implement (skip all
> > inserted frames and spread the remaining ones equally over the
> > time scale), while deinterlacing requires more work.
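The distinction can be sketched as follows (hypothetical helpers, frames and fields as lists of scanlines): inverse telecine only *selects* frames, while deinterlacing has to *filter* pixel data between the read and the write.

```python
def inverse_telecine(frames):
    """Undo an idealized 3:2 pulldown: every 5th frame of the ~30fps
    stream is the synthesized one, so dropping it restores the original
    ~24fps sequence. Real streams first need cadence detection to find
    where in the 5-frame group the inserted frame sits."""
    return [f for i, f in enumerate(frames) if i % 5 != 4]

def deinterlace_blend(top, bottom):
    """Naive deinterlace of one field pair: keep the top-field lines and
    synthesize the others by blending both fields. This is only a
    stand-in for the per-pixel filtering step; a serious deinterlacer
    would use motion-adaptive filtering instead. Each field is a list
    of scanlines, each scanline a list of pixel values."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)                                     # stored line
        frame.append([(x + y) // 2 for x, y in zip(t, b)])  # filtered line
    return frame

print(inverse_telecine(["A", "B", "C", "D", "BC"]))  # -> ['A', 'B', 'C', 'D']
```

Note that `inverse_telecine` never touches pixel values, which is why it is cheap, while `deinterlace_blend` reads and computes on every pixel.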
>
> It is the same thing: write the fields to memory and then read them
> out progressively, except that inverse telecine does it in a different
> order.

No, it's not. Deinterlacing doesn't work by reading the fields out in
progressive order; then you get this awful combing effect. You have to
apply some filtering to the video data to properly deinterlace it. With
inverse telecine, you just drop the right frames and you are done.

> Or, are you referring to line doubling by vertical interpolation?

Again, no.

> > As a side note, applying deinterlacing to telecined content
> > or inverse telecine to deinterlaced content has quite bad
> > effects on the image quality.
>
> If you are referring to line doubling by vertical interpolation, then
> yes, you want to avoid doing that to telecine. But there is no point,
> since you only need to weave the proper fields.
>
> With a computer monitor, if you can run at 72 f/s, this gets much
> easier.

Computer monitors don't run at 72Hz. They run at arbitrary frequencies.
And this frequency cannot be changed at will by the graphics card just
because the video it's showing suggests a different refresh rate.

> > We don't have to reinvent the wheel, it's already out there.
> > We just have to implement it again.
>
> It is just an English expression. Reimplementing something is
> 'reinventing the wheel'. The question is whether it is cheaper to make
> your own wheel or buy one that is ready made.

My English is good enough to know that proverb. But what I said is true:
we do not reinvent the wheel, at least not in colorspace conversion or
deinterlacing/inverse telecine, but rather implement one of the known
good algorithms in Verilog. Yes, it's more work than just using a
precooked chip/IP core, but our goal is not to make just a working card,
but a 100% fully documented card that is as open as possible while still
being commercially viable.

			Attila Kinali

-- 
Linux is...
when you can solve even simple things with a cryptic post-fix language
		-- Daniel Hottinger

_______________________________________________
Open-hardware mailing list
[EMAIL PROTECTED]
http://lists.duskglow.com/mailman/listinfo/open-hardware
