On Tue, 8 Feb 2005 16:06:05 -0500, Daniel Phillips <[EMAIL PROTECTED]> wrote:
> On Tuesday 08 February 2005 15:06, Lourens Veen wrote:
> > Yep. I'd already found out empirically that when doing linear
> > interpolation between 16-bit values I would get 15 bits precision.
> > OTOH, I guess it should be possible to make it 16...I'll have a look.
> > Or maybe using 18-bit words in the LUT could improve that to 16 bits.
> > The problem is that you need to read two values per pixel, and the
> > RAM is dual-ported, so we can't do two pixels at the same time.
> > Unless we use two RAM blocks, one table for each pixel. How scarce
> > are these things?
>
> There are 40, however I think I saw Timothy say that the 40 RAM blocks
> and the 40 dedicated multipliers come out of the same budget, which
> would make things tight. Fortunately, only two reciprocal units are
> needed, so using 10% of the RAM/Multiplier budget (if that's how it
> works) doesn't seem too horrible.

Not exactly. 36-bit RAM blocks take away from the multiplier budget.
Any other width of RAM block doesn't cost us anything.

> It's the interpolants that are really going to eat multipliers:
>
> Horizontal rasterization:
>
> - two multiplies per interpolant for perspective correction
>
> Vertical rasterization:
>
> - one multiply per interpolant to correct for pixel alignment

Well, here's what I think may have to happen (and it's going to kinda
suck): Since there's an alignment correction for each interpolant, plus
we have to do the vertical interpolation, I suggest we use 2 or maybe up
to 4 multipliers and have the vertical logic iterate over the
interpolants. For 17 interpolants and 3 multipliers, that's 6 cycles to
compute all interpolants so that the horizontal units can work on them.
But that's only 3 fp adders and 3 fp multipliers (gotta design one of
those!).

> With 17 interpolants, most of which need perspective correction (in my
> opinion; some may think this justifiable only for textures) we've
> already exceeded our multiplier budget and haven't even begun to think
> about filtering, blending, mipmapping, fog and probably other things.

That's 17 multiplies, but we can do some looping if we have to. For
instance, we don't have to do both sets of texture coordinates at the
same time. That saves 2, at least.

> So pretty soon it's time to make some hard choices about what is
> expendable, where to compromise on quality and throughput, and how
> throughput is going to degrade gracefully as features are turned on.
> All of which I'm sure Timothy has been thinking about, but now it's
> about time to take inventory and see just how bad things are.

Yeah, this is going to be a challenge.

> It's also possible to create more multipliers in random logic, as
> Timothy mentioned several times, but this is only going to work out in
> places where precision is really limited.

It can work for any precision, but it requires a lot of pipelining.
Then again, we're already going to have gobs of pipelining. Then it's
only a matter of transistor budget.

> I hope I'm wrong about multipliers and block RAM coming out of the same
> budget.

If we use only 18-bit or 9-bit RAMs (you can also do 4), then we're
okay. We'll likely face some choices here and there.
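
For anyone who wants to play with the iteration trade-off mentioned above (17 interpolants shared across 2 to 4 multipliers), here is a rough Python sketch. It only counts cycles and assumes one multiply per interpolant per pass and one result per multiplier per cycle; pipeline latency is ignored. The function names are made up for illustration and don't correspond to anything in the actual design.

```python
def cycles_needed(num_interpolants: int, num_multipliers: int) -> int:
    """Cycles to push every interpolant once through a shared multiplier pool.

    Assumption (not from the design): each multiplier accepts one
    interpolant per cycle, and pipeline fill/drain latency is ignored.
    """
    if num_interpolants < 0 or num_multipliers < 1:
        raise ValueError("need >= 0 interpolants and >= 1 multiplier")
    # Integer ceiling division, avoids floating point.
    return -(-num_interpolants // num_multipliers)


def schedule(num_interpolants: int, num_multipliers: int) -> list[list[int]]:
    """List, per cycle, the interpolant indices handled in that cycle."""
    return [
        list(range(start, min(start + num_multipliers, num_interpolants)))
        for start in range(0, num_interpolants, num_multipliers)
    ]


if __name__ == "__main__":
    for multipliers in (2, 3, 4):
        print(multipliers, "multipliers:",
              cycles_needed(17, multipliers), "cycles")
    # Show which interpolants share a cycle in the 3-multiplier case.
    for cycle, group in enumerate(schedule(17, 3)):
        print("cycle", cycle, "->", group)
```

With 3 multipliers the last cycle only has two interpolants in it, so one multiplier sits idle there; going to 4 multipliers saves just one cycle under these assumptions.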
