On Wed, 23 Feb 2005 19:34:42 -0500, Daniel Phillips <[EMAIL PROTECTED]> wrote:
> Hi Timothy,
>
> Trilinear blending is not in the spec, but I would like to add it. This
> is a fairly important feature. Without it, banding at mipmap breaks
> will be annoyingly visible. The quality difference between bilinear
> and trilinear filtered mipmapping is much greater than, for example,
> between trilinear and anisotropic. As a sweetener, trilinear blending
> could be used to generate 2D cross fade effects.

I'm quite certain it's in there. I also know for a fact that it's in
the model. It hasn't, however, been tested.

>
> I think trilinear blending can be added at a cost of only 4 8x8
> multipliers and some control logic. Please bear with me as I attempt
> to express the concept as conceived from the point of view of a
> software guy with a bare inkling of how hardware works.
>
> First, some review. A bilinear blend combines four color values using
> three linear interpolation operations. Given
>
>    lerp(t, a, b) = (1-t)*a + t*b
>
> then the bilinear blend of four color values c1..c4 is
>
>    lerp(frac(y), lerp(frac(x), c1, c2), lerp(frac(x), c3, c4))
>
> To save multiplies, we rewrite lerp as
>
>    lerp(t, a, b) = t*(b - a) + a

Now, this I definitely like. It's an optimization that I hadn't
thought of, but it will definitely make the logic simpler.

> This has to be performed on four color values for each of two pixels per
> clock, totalling 24 8x8 multiplies. The pipeline delivers one pair of
> bilinear blended pixels per clock in two stages. The first stage has
> 16 multiplies and the second, 8.

Well, I wasn't going to do it in exactly that order, but okay.

> A trilinear blend is two bilinear blends lerped together. This totals
> 56 multiplies. We probably can't afford that much real estate for this
> feature. I propose to modify the bilinear hardware to do this at half
> clock speed, with 4 additional multipliers. Given

It doesn't require that many multiplies, actually. Since only two
pixels come in per clock, only one multiply has to be done per clock to
blend them. I think the whole thing could be done with two or three
multipliers without any performance hit (relatively speaking).

>    triblend(t1, t2, t3, c[2][4]) =
>       biblend(t3, biblend(t1, t2, c[0]), biblend(t1, t2, c[1]))
>
> Then the idea is to repurpose the bilinear blend pipeline by alternating
> between the left and right members of the pixel pair, computing the two
> bilinear blend components in parallel.
> Four multipliers appended to
> the end of the pipeline complete one trilinear blend per clock,
> alternating between left and right pixels.
>
> How am I doing so far?

Not bad. There are two sets of texture registers, but there is only
one texture unit that can read only two pixels per clock from RAM.

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
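
For reference, the arithmetic quoted in this thread can be sketched in a few lines of Python. This is only an illustration of the math, not the hardware design or the model mentioned above. The names lerp_two_mul, lerp_one_mul and the constants are made up for the example, and the outer combination in triblend is written as a plain lerp on t3, which is an assumption about what the quoted formula intends. The multiply counts assume four colour channels per pixel and two pixels per clock, which is one reading of the quoted text that reproduces its figures of 24 and 56.

```python
def lerp_two_mul(t, a, b):
    """Textbook form: (1 - t)*a + t*b, two multiplies."""
    return (1 - t) * a + t * b


def lerp_one_mul(t, a, b):
    """Rewritten form: t*(b - a) + a, one multiply."""
    return t * (b - a) + a


def biblend(tx, ty, c):
    """Bilinear blend of four values c = (c1, c2, c3, c4): three lerps."""
    c1, c2, c3, c4 = c
    top = lerp_one_mul(tx, c1, c2)
    bottom = lerp_one_mul(tx, c3, c4)
    return lerp_one_mul(ty, top, bottom)


def triblend(t1, t2, t3, c):
    """Trilinear blend: two bilinear blends (c[0], c[1]) lerped by t3.

    Assumption: the outer step is a lerp, giving seven lerps in total.
    """
    return lerp_one_mul(t3, biblend(t1, t2, c[0]), biblend(t1, t2, c[1]))


# Multiply counts with the one-multiply lerp (assumed: 4 channels, 2 pixels).
CHANNELS = 4
PIXELS_PER_CLOCK = 2
LERPS_BILINEAR = 3
LERPS_TRILINEAR = 2 * LERPS_BILINEAR + 1

bilinear_multiplies = LERPS_BILINEAR * CHANNELS * PIXELS_PER_CLOCK
trilinear_multiplies = LERPS_TRILINEAR * CHANNELS * PIXELS_PER_CLOCK

if __name__ == "__main__":
    print(bilinear_multiplies, trilinear_multiplies)
    print(triblend(0.5, 0.5, 0.5, ((0, 4, 8, 12), (10, 14, 18, 22))))
```

Both lerp forms return the same value for the same inputs; the second simply trades one multiply for a subtraction, which is where the saving per blend comes from.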
