On Wednesday 16 February 2005 03:02, Daniel Phillips wrote:
> On Tuesday 15 February 2005 18:44, Lourens Veen wrote:
> > On Tuesday 15 February 2005 23:47, I wrote:
> > > At this point I'm fretting more about DDA precision than the
> > > perspective divide. I seriously doubt we'll get stable results
> > > stepping across the whole screen with 16 bit precision. I'm
> > > mulling over a couple of suggestions, short of adding more bits.
> >
> > ...At any rate, could we assume a maximum of 2048x1536 for the
> > resolution? That means 11 bits for integer screen coordinates, so
> > we'd have an 11.5 split if we did fixed point (and floating point
> > just doesn't make sense to me here really, comments?).
>
> Floating point makes a whole lot of sense here because we don't have
> much control over the input parameters, which can vary over wide
> ranges. The W divide introduces a further, large degree of variation
> for near and far objects. A typical nasty case is a viewpoint near a
> huge vertical plane running far into the distance, i.e., looking along
> the side of a building.

Colours are still relatively limited I'd say. But I see your point, I
hadn't thought of texture coordinates outside of the texture, but of
course that happens when the texture wraps.

> > and what if we drew the span from
> > both sides to cut the error in half (ie, we do two pixels at a time,
> > but not next to one another, but one starting at the left side of the
> > span and the other on the right side)?
>
> This problem needs stronger medicine than just cutting the damage in
> half. Also, meeting in the middle makes any cumulative error easy to
> see. The tears down the middle of triangles will crawl around the
> screen in a distracting way. And it's not going to be friendly to the
> DRAM interface.

It was more intended as an additional measure, not a complete one. But I
see your point, it's not a good idea.

> > Just some thoughts.
>
> Three options that seem viable to me are:
>
>   1) Correct each interpolant on the fly using a single multiplier in a
>      round robin.
>
>   2) Chop up big geometry in the driver vertically and horizontally into
>      bite size chunks.
>
>   3) More bits, just for the interpolants.

Okay, you're with 2), I'll take the other two :-).

Let's take a look at the vertical interpolation of, for example, the X1
coordinate. It's a linear interpolation, so the formula is

X1 = X1_0 + dX1dY * (Y - Y_0)

which we calculate as

X1 = X1_0 + dX1dY + dX1dY + dX1dY + ... + dX1dY   // (Y - Y_0 additions)

and if I understand correctly, the problem is that we do not have enough
bits in dX1dY and X1, so that the error accumulates as Y grows.

So, because of rounding, we do not store dX1dY, but (dX1dY + delta). If
the fractional part of dX1dY is n bits, then |delta| <= 2**-(n+1).

So, worst case, what we actually calculate by cumulatively adding is

X1 = X1_0 + (dX1dY + 2**-(n+1)) * (Y - Y_0)
   = X1_0 + dX1dY * (Y - Y_0) + (2**-(n+1) * (Y - Y_0))

or

X1 = X1_0 + dX1dY * y + err * y

with y = (Y - Y_0) and err = (2**-(n+1))

A real multiplier would not calculate the product by doing y additions.
Instead, it does

X1 = X1_0 + dX1dY * y:0 + (dX1dY << 1) * y:1 + ... + (dX1dY << m) * y:m

where y:k is bit k of y.

We want to do this incrementally. Take the bit representation of y,
which runs from 0 to height H. We can write y as the sum of a previous
value of y, and a power of two:

   0 = 000
   1 = 001 = 000 + 001
   2 = 010 = 000 + 010
   3 = 011 = 010 + 001
   4 = 100 = 000 + 100
   5 = 101 = 100 + 001
   6 = 110 = 100 + 010
   7 = 111 = 110 + 001
   ...
1024 = 0 + (1 << 10)

Multiply by dX1dY and you get

X1[0]    = 0
X1[1]    = 1 * dX1dY    = X1[0] + dX1dY
X1[2]    = 2 * dX1dY    = X1[0] + (dX1dY << 1)
X1[3]    = 3 * dX1dY    = X1[2] + dX1dY
X1[4]    = 4 * dX1dY    = X1[0] + (dX1dY << 2)
X1[5]    = 5 * dX1dY    = X1[4] + dX1dY
X1[6]    = 6 * dX1dY    = X1[4] + (dX1dY << 1)
X1[7]    = 7 * dX1dY    = X1[6] + dX1dY
...
X1[1024] = 1024 * dX1dY = X1[0] + (dX1dY << 10)

(or X1[y] = X1[y & (y-1)] + (dX1dY << number-of-rightmost-zeros-in-y),
and note that we can latch y-1 and use a priority encoder on the
inverted carry outputs of the incrementor to get the shift factor)

That's still one addition per increment, and if we store enough bits of
dX1dY (ie, 10 bits fraction in this example, so that dX1dY becomes
dX1dY' >> 10) and start with X1[0] = 0.5 + X1_0 (to round properly) then
the rounding error grows with the base-2 logarithm of y, which is the
same rate you lose precision at in a limited precision floating point
number. Unless I'm missing something this means perfect results without
a multiplier at all.

We do need a bunch of registers to store the intermediate results
however, 15 per interpolant for a 16-bit mantissa, and I'm not sure how
expensive that gets.
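
To make the comparison above concrete, here is a rough Python sketch, not a hardware design. It models only the slope term (X1_0 and the 0.5 rounding offset are left out). The names naive_dda, bitwise_dda, worst_error, ACC_FRAC and EXTRA are made up for this illustration, and the split into 5 accumulator fraction bits plus 10 extra slope bits is an assumption taken from the "11.5" and "10 bits" figures in this thread. A list stands in for the intermediate-result registers.

```python
ACC_FRAC = 5    # assumed: fraction bits kept in the accumulator ("11.5" split)
EXTRA = 10      # assumed: extra slope fraction bits, covers spans up to 1024


def to_fixed(value, frac_bits):
    """Round a real value to a fixed point integer with frac_bits of fraction."""
    return round(value * (1 << frac_bits))


def naive_dda(slope, height):
    """Classic DDA: add the rounded slope once per scanline."""
    step = to_fixed(slope, ACC_FRAC)
    out = [0]
    for _ in range(height):
        out.append(out[-1] + step)
    return out


def bitwise_dda(slope, height):
    """X1[y] = X1[y & (y-1)] + (slope << trailing zeros of y)."""
    step_hi = to_fixed(slope, ACC_FRAC + EXTRA)  # slope with the extra bits
    out = [0]  # stands in for the bank of intermediate-result registers
    for y in range(1, height + 1):
        base = y & (y - 1)                 # y with its lowest set bit cleared
        shift = (y & -y).bit_length() - 1  # number of trailing zeros of y
        # Shift the wide slope, then round it to accumulator precision.
        addend = ((step_hi << shift) + (1 << (EXTRA - 1))) >> EXTRA
        out.append(out[base] + addend)
    return out


def worst_error(samples, slope):
    """Largest deviation from the exact product, in accumulator LSBs."""
    scale = 1 << ACC_FRAC
    return max(abs(s - slope * y * scale) for y, s in enumerate(samples))


if __name__ == "__main__":
    slope, height = 0.3371, 1024  # arbitrary example values
    print("naive  :", worst_error(naive_dda(slope, height), slope))
    print("bitwise:", worst_error(bitwise_dda(slope, height), slope))
```

In this model each value of y is reached through at most one addition per set bit of y, and each addition contributes at most half an accumulator LSB of rounding, which is where the logarithmic growth comes from. The naive version picks up the same slope rounding error on every scanline instead.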

The RAM blocks don't have enough ports to use them effectively (it would
require 1 RAM block per interpolant for the horizontal rasterisation,
which is rather wasteful).

> The second of these seems the most pragmatic since it can be offloaded
> to the host, saving gates and Timothy cycles. The extra work would
> level out nicely because larger triangles incur most of the penalty,
> and there should be fewer of them. For best visual stability, the
> clipping planes would form a rectangular mesh in screen space. Note
> that this means small triangles do not necessarily escape intact,
> however they are more likely to.

I guess we really need to know how big this error can get, and how far
we can go without creating artifacts. So we need a theoretical worst
case. The problem seems to be that you can rotate a triangle arbitrarily
close to edge-on, so you can always make it worse... How would
mipmapping influence this? Needs thought...

> > Incidentally, these differentials are calculated
> > on the host, not the card, right?
>
> Most probably off-card because we ran out of multipliers some time ago.
> The problem with that is, it really bulks up the DMA stream, and PCI
> bandwidth is already tight. This is probably just a case of grin and
> bear it.

Yeah, and I don't see us doing all those reciprocals in parallel in
hardware either. Perhaps we can still do something with the colour
values, since they have a limited range.

Lourens
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
