On Wed, 9 Feb 2005 23:23:47 +0100, Lourens Veen
<[EMAIL PROTECTED]> wrote:
> On Wednesday 09 February 2005 21:19, Daniel Phillips wrote:
> > On Wednesday 09 February 2005 07:55, Lourens Veen wrote:
> > > Ah. I think I'm starting to see your point about 13 bits not being
> > > enough.
> >
> > I failed to mention an especially easy way to amplify the error:
> > repeating textures.  Say you're viewing a long hall with repeating
> > textures on the wall.  Small errors at the far end of the hall will
> > amplify into multi-texel errors at the near end.  Masking doesn't work
> > because the math is nonlinear.  The easiest fix is to have some
> > precision in reserve and otherwise ignore the problem.
> 
> Urgh! I've been trying to get my head around this, but I just don't grok it.
> We draw trapezoids with the top and bottom horizontal. So we have a linear
> interpolation between the top-left and the bottom-left values, and 

Correct.

> a linear
> interpolation along the right edge. 

No, a linear interpolation horizontally, along the scanline.  Perhaps
you meant "between the left edge and the right edge"?

> That gives us a start and end value for
> each horizontal line, across which we do another linear interpolation to get
> the value for each pixel. Then we divide these values by M to perspective
> correct them, and then we use them to look up texture data or as a colour?

Texture coordinates and colors are computed separately, but basically, yes.
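
For what it's worth, here is a minimal Python sketch of the per-scanline
step as described above. It assumes M plays the role of the linearly
interpolated 1/w term, which this thread doesn't spell out. The names
lerp and scanline_texcoords and the endpoint values are made up for
illustration, and it uses floating point rather than the fixed-point
arithmetic the hardware would use.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def scanline_texcoords(left, right, n_pixels):
    """Perspective-correct texture coordinate for each pixel on one scanline.

    left and right are (u_over_w, m) pairs at the two ends of the scanline,
    where m stands in for the interpolated 1/w term (an assumption here).
    """
    coords = []
    for i in range(n_pixels):
        t = i / (n_pixels - 1) if n_pixels > 1 else 0.0
        u_over_w = lerp(left[0], right[0], t)   # linear in screen space
        m = lerp(left[1], right[1], t)          # linear in screen space
        coords.append(u_over_w / m)             # the per-pixel divide
    return coords


# Hypothetical scanline: u runs 0..1 while depth w runs 1..4.
near = (0.0 / 1.0, 1.0 / 1.0)
far = (1.0 / 4.0, 1.0 / 4.0)
coords = scanline_texcoords(near, far, 5)
```

In this made-up case the middle pixel lands at u = 0.2 rather than 0.5,
which is the effect of the divide.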

> That doesn't make sense to me at all. What am I missing?

I don't know.  Although you got a few of the details not quite right,
you did a very good job of explaining it.

> > The pragmatic approach is to just ignore this for the first release.
> >
> > > I am fairly sure I can make that full interpolation (which presently
> > > does 15 bits) into full 16 bit precision by storing 17- or 18-bit
> > > values instead of 16-bit ones.
> >
> > I vote for 18 bits, which would guarantee 17 bit precision.  The 17th
> > bit would only live for a short while, being fed immediately into the
> > perspective correction multipliers.  To me, every bit of extra
> > precision is gold, so if it's easy to get, get it.
> 
> Okay, I'm using 18-bit values now, and I've almost got 17 bit precision.
> Almost.
> 
> Each span is 64 units (input is 16 bit, LUT is 10 bit) and I have two spans
> where the very first and last entries have a difference of 0, everything else
> has a difference of 1, except for the middle entry which has a difference of
> 2. I guess we just get really really unlucky there with the rounding.
> 
> Anyway, these can be special cased, and then we have 17-bit precision (using
> two RAM blocks). Maybe I can even get rid of the problem by storing values in
> one table and differences in the other. Differences can then be in 8.10 fixed
> format, which may give enough precision to prevent the problem. Or I could do
> the same thing I'm doing for the 14/4 one-read version and have 19 bits value
> and 17 bits difference over the two tables. That will definitely give 17 bits
> of precision, and it saves a subtract operation.
> 
> > OK, here goes my limited understanding of how this RAM works: it's
> > dual-ported, so each pixel can pick up the start sample on one clock
> > and the following sample on the next clock, which is perfectly ok
> > because latency isn't a problem for the texture divide.  Alternatively,
> > twice as much RAM can be used, encoding an 18 bit sample and an 18 bit
> > difference in each 36 bit word, and look them up together.
> 
> Yes, it can, but the point of pipelining is that you do stuff at the same
> time. So, the pick up of the next sample on the next clock will interfere
> with the pick up of the first sample on the first clock of the next pixel.
> Then each pixel pipeline will still require two reads per clock...
> 
> > Can a 36 bit lookup be dual ported without losing a multiplier?  The
> > earlier discussion left me confused on this point, and wandering
> > through the chip spec hasn't helped.
> 
> Not sure....Timothy, what configurations are possible for these RAM blocks?

The configurations (width in bits x depth in words) are 36x512,
18x1024, 9x2048, and 4x4096.  Only the 36x512 case loses us the
multiplier.
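
To make the two-table idea concrete, here is a rough Python sketch of a
value table plus a difference table, each 1024 entries deep, as two
18x1024 blocks would allow. It is only a model under assumptions that
are not from this thread: the table holds the reciprocal of
x = 1 + i/65536, the output scale is 2**18, and the names (value_table,
diff_table, reciprocal) are invented. Python integers are unbounded, so
the script reports the widths the entries would need rather than
enforcing them.

```python
INPUT_BITS = 16          # fraction bits of the input, per the thread
INDEX_BITS = 10          # LUT address width, per the thread
SPAN_BITS = INPUT_BITS - INDEX_BITS   # 6 bits, so 64 input units per span
SCALE_BITS = 18          # assumed output scale: 1.0 is represented as 2**18

ENTRIES = 1 << INDEX_BITS


def exact_scaled(i):
    """Exact scaled reciprocal of x = 1 + i / 2**16."""
    return (1 << SCALE_BITS) / (1 + i / (1 << INPUT_BITS))


# Sample the reciprocal at every span boundary, including the end point x = 2.
samples = [round(exact_scaled(k << SPAN_BITS)) for k in range(ENTRIES + 1)]

value_table = samples[:ENTRIES]                       # first RAM block
diff_table = [samples[k] - samples[k + 1]             # second RAM block
              for k in range(ENTRIES)]


def reciprocal(i):
    """Approximate scaled reciprocal for a 16-bit fraction i."""
    index = i >> SPAN_BITS
    frac = i & ((1 << SPAN_BITS) - 1)
    # 1/x is decreasing, so the stored difference is subtracted.
    return value_table[index] - ((diff_table[index] * frac) >> SPAN_BITS)


max_error = max(abs(reciprocal(i) - exact_scaled(i))
                for i in range(1 << INPUT_BITS))
print("largest error in output LSBs:", max_error)
print("bits needed for entry 0:", value_table[0].bit_length())
print("bits needed for largest difference:", max(diff_table).bit_length())
```

Two things this model illustrates: with the assumed scale, entry 0 (the
sample of exactly 1) needs one more bit than the rest, and because the
last difference is precomputed from the end point, the lookup itself
needs no wrap logic for the last entry.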

> > I think the simplest arrangement is just to compute the difference on
> > the fly, taking two clocks for the interpolation.  Interpolating the
> > last sample in the table needs a bit of extra logic to handle the table
> > wrap.  Interpolating the zeroth sample needs something special to
> > handle the missing most significant bit for the sample of exactly 1.
> 
> Yeah, they're special cased now. If we use two tables and require 17 bits of
> precision then we can probably even store that extra MSB in the table without
> problems. We'd get a 1.19/8.8 value/difference split. I'll see if I can code
> that.
> 
> > The nice thing about all of this is, the divides seem to be under
> > control.  That's what I worried about most when I first heard of this
> > project.
> 
> I'm glad about that too, but I'm still worried about the gate budget...it
> looks like we're tight on multipliers too. It's not going to be easy, but
> then, if it were easy it would be boring :-).
> 
> Lourens
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
>
