On Wed, 9 Feb 2005 23:23:47 +0100, Lourens Veen <[EMAIL PROTECTED]> wrote:
> On Wednesday 09 February 2005 21:19, Daniel Phillips wrote:
> > On Wednesday 09 February 2005 07:55, Lourens Veen wrote:
> > > Ah. I think I'm starting to see your point about 13 bits not being
> > > enough.
> >
> > I failed to mention an especially easy way to amplify the error:
> > repeating textures. Say you're viewing a long hall with repeating
> > textures on the wall. Small errors at the far end of the hall will
> > amplify into multi-texel errors at the near end. Masking doesn't work
> > because the math is nonlinear. The easiest fix is to have some
> > precision in reserve and otherwise ignore the problem.
>
> Urgh! I've been trying to get my head around this, but I just don't
> grok it. We draw trapezoids with the top and bottom horizontal. So we
> have a linear interpolation between the top-left and the bottom-left
> values, and
Correct.

> a linear interpolation along the right edge.

No, a linear interpolation horizontally, along the scanline. Perhaps you
meant "between the left edge and the right edge"?

> That gives us a start and end value for each horizontal line, across
> which we do another linear interpolation to get the value for each
> pixel. Then we divide these values by M to perspective correct them,
> and then we use them to look up texture data or be a colour?

Texture coordinates and colors are computed separately, but basically,
yes.

> That doesn't make sense to me at all. What am I missing?

I don't know. Although you got a few of the details not quite right, you
did a very good job of explaining it.

> > The pragmatic approach is to just ignore this for the first release.
> >
> > > I am fairly sure I can make that full interpolation (which presently
> > > does 15 bits) into full 16 bit precision by storing 17- or 18-bit
> > > values instead of 16-bit ones.
> >
> > I vote for 18 bits, which would guarantee 17 bit precision. The 17th
> > bit would only live for a short while, being fed immediately into the
> > perspective correction multipliers. To me, every bit of extra
> > precision is gold, so if it's easy to get, get it.
>
> Okay, I'm using 18-bit values now, and I've almost got 17 bit precision.
> Almost.
>
> Each span is 64 units (input is 16 bit, LUT is 10 bit) and I have two
> spans where the very first and last entries have a difference of 0,
> everything else has a difference of 1, except for the middle entry
> which has a difference of 2. I guess we just get really really unlucky
> there with the rounding.
>
> Anyway, these can be special cased, and then we have 17-bit precision
> (using two RAM blocks). Maybe I can even get rid of the problem by
> storing values in one table and differences in the other. Differences
> can then be in 8.10 fixed format, which may give enough precision to
> prevent the problem.
> Or I could do the same thing I'm doing for the 14/4 one-read version
> and have 19 bits value and 17 bits difference over the two tables.
> That will definitely give 17 bits of precision, and it saves a
> subtract operation.
>
> > OK, here goes my limited understanding of how this RAM works: it's
> > dual-ported, so each pixel can pick up the start sample on one clock
> > and the following sample on the next clock, which is perfectly ok
> > because latency isn't a problem for the texture divide. Alternatively,
> > twice as much RAM can be used, encoding an 18 bit sample and an 18 bit
> > difference in each 36 bit word, and look them up together.
>
> Yes, it can, but the point of pipelining is that you do stuff at the
> same time. So, the pick up of the next sample on the next clock will
> interfere with the pick up of the first sample on the first clock of
> the next pixel. Then each pixel pipeline will still require two reads
> per clock...
>
> > Can a 36 bit lookup be dual ported without losing a multiplier? The
> > earlier discussion left me confused on this point, and wandering
> > through the chip spec hasn't helped.
>
> Not sure... Timothy, what configurations are possible for these RAM
> blocks?

36x512, 18x1024, 9x2048, 4x4096. And only the 36x512 case loses us the
multiplier.

> > I think the simplest arrangement is just to compute the difference on
> > the fly, taking two clocks for the interpolation. Interpolating the
> > last sample in the table needs a bit of extra logic to handle the table
> > wrap. Interpolating the zeroth sample needs something special to
> > handle the missing most significant bit for the sample of exactly 1.
>
> Yeah, they're special cased now. If we use two tables and require 17
> bits of precision then we can probably even store that extra MSB in the
> table without problems. We'd get a 1.19/8.8 value/difference split.
> I'll see if I can code that.
>
> > The nice thing about all of this is, the divides seem to be under
> > control.
> > That's what I worried about most when I first heard of this
> > project.
>
> I'm glad about that too, but I'm still worried about the gate
> budget...it looks like we're tight on multipliers too. It's not going
> to be easy, but then, if it were easy it would be boring :-).
>
> Lourens
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
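
For anyone trying to picture the two-table idea discussed above (values in
one table, differences in the other), here is a minimal software sketch.
It is not the project's HDL. It assumes the table holds the reciprocal of
a normalized input in [1, 2), which the thread does not state outright,
and the names build_tables, reciprocal and SCALE_BITS are made up for the
example. The 16-bit input, 10-bit index and 64-unit span are taken from
the thread.

```python
INDEX_BITS = 10   # LUT is addressed by the top 10 bits of the input
FRAC_BITS = 6     # 16-bit input minus 10 index bits: 64 units per span
SCALE_BITS = 18   # assumption: 1.0 is stored as 2**18


def build_tables():
    """Return (values, diffs) for 1/(1 + i/1024), i = 0..1023.

    values[i] is the rounded sample at the start of span i.
    diffs[i] is the drop from sample i to sample i + 1, so no
    subtract is needed at lookup time. The final span uses the
    reciprocal of 2.0 as its end point (the "table wrap" case).
    """
    n = 1 << INDEX_BITS
    samples = [((1 << SCALE_BITS) * n + (n + i) // 2) // (n + i)
               for i in range(n + 1)]
    values = samples[:n]
    diffs = [samples[i] - samples[i + 1] for i in range(n)]
    return values, diffs


def reciprocal(x, values, diffs):
    """Approximate 2**18 / (1 + x/65536) for a 16-bit x."""
    index = x >> FRAC_BITS
    frac = x & ((1 << FRAC_BITS) - 1)
    # Scale the span difference by the fractional position, rounding
    # to nearest before dropping the 6 fraction bits.
    correction = (diffs[index] * frac + (1 << (FRAC_BITS - 1))) >> FRAC_BITS
    return values[index] - correction
```

Note that values[0] comes out as exactly 2**18, which does not fit in 18
bits. That is the "sample of exactly 1" case that needs either special
handling or one extra stored bit.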

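
The trapezoid walk-through earlier in the thread (linear interpolation
along a scanline, then a divide by M) can be sketched the same way. This
assumes M plays the role of an interpolated 1/w term and that the other
interpolated quantity is u/w; the function names are invented for the
example.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def scanline_texcoords(left, right, width):
    """Perspective-corrected u for each pixel of one scanline.

    left and right are (u_over_w, one_over_w) pairs at the two ends
    of the span. Both terms are interpolated linearly in screen
    space; the divide per pixel recovers the true u.
    """
    coords = []
    for px in range(width):
        t = px / (width - 1) if width > 1 else 0.0
        s = lerp(left[0], right[0], t)   # interpolated u/w
        m = lerp(left[1], right[1], t)   # interpolated 1/w (the "M")
        coords.append(s / m)
    return coords
```

With the left end at u=0, w=1 and the right end at u=1, w=4, the middle
pixel of a three-pixel span lands near u=0.2 rather than 0.5. That is
the nonlinearity that lets small errors at the far end of the hall grow
into multi-texel errors at the near end.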