That sounds like a really neat idea.  Mind you, if we can avoid using
multipliers, that would be even better.  As it is, I'm not sure we'll
have enough.


On Wed, 2 Feb 2005 09:52:34 +0100, Lourens Veen
<[EMAIL PROTECTED]> wrote:
> On Tuesday 01 February 2005 19:33, Daniel Phillips wrote:
> > > > > Therein lies a problem.  Since the reciprocal isn't precise (we
> > > > > use only 10 mantissa bits when computing it)
> > > >
> > > > Hmm, I thought we had 18 bits of precision readily available.  Is
> > > > this a consequence of using linear interpolation for the divide?
> > >
> > > I was also under the impression that 10 mantissa bits were used for
> > > the LUT, and the other bits were used for linear interpolation
> > > > between two 18-bit values from the LUT. This should actually yield a
> > > > pretty good result. I think Nicolas Caspens was the one who
> > > > contributed most of this in the original discussion (obviously, I may
> > > > be mistaken, so please don't kill me if I got the attribution wrong).
> > > In any case, with linear interpolation the precision should be *much*
> > > better than 10 bits.
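A rough sketch of the interpolated lookup described above, to make the precision claim easy to check. Everything here is an assumption rather than the agreed design: the 16-bit input x is taken to mean m = 1 + x/65536, each entry stores (1/m - 0.5) in units of 2**-19 so that it fits in 18 bits, and the names build_lut and reciprocal_lerp are made up for illustration.

```python
INDEX_BITS = 10    # top input bits select the table entry
FRAC_BITS = 6      # remaining input bits interpolate between two entries
VALUE_BITS = 18    # width of each stored value
# Assumption: an entry holds (1/m - 0.5) in units of 2**-19.
SCALE = 1 << (VALUE_BITS + 1)


def build_lut():
    """One entry per segment boundary, including the endpoint m = 2."""
    lut = []
    for i in range((1 << INDEX_BITS) + 1):
        y = 1.0 / (1.0 + i / (1 << INDEX_BITS))
        # The entry for m = 1 would need 19 bits, so clamp it to 18.
        lut.append(min(round((y - 0.5) * SCALE), (1 << VALUE_BITS) - 1))
    return lut


def reciprocal_lerp(lut, x):
    """Approximate 1/m for m = 1 + x/65536, x a 16-bit integer."""
    index = x >> FRAC_BITS
    frac = x & ((1 << FRAC_BITS) - 1)
    a, b = lut[index], lut[index + 1]
    # a >= b because 1/m decreases; one multiply by the 6-bit fraction.
    fixed = a - (((a - b) * frac) >> FRAC_BITS)
    return 0.5 + fixed / SCALE


def max_error(lut):
    """Largest absolute error over all 65536 inputs."""
    return max(abs(reciprocal_lerp(lut, x) - 1.0 / (1.0 + x / 65536.0))
               for x in range(1 << 16))


if __name__ == "__main__":
    print(max_error(build_lut()))
```

The extra entry at m = 2 is zero in this encoding, so hardware would not need to store it. Note that this version reads two table entries per lookup.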
> >
> > Anyway, if interpolation doesn't work out for some reason there's always
> > Newton-Raphson, which is tried and true.  I seem to recall that
> > Newton-Raphson needs two multipliers for the single iteration step
> > required, so if linear interpolation can do the job with one then I
> > guess it's better.
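For comparison, a minimal sketch of a single Newton-Raphson step for the reciprocal, y1 = y0 * (2 - m * y0), which shows where the two multiplies come from. The 10-bit seed, the mapping m = 1 + x/65536, and the function names are assumptions for illustration, and floating point stands in for whatever fixed-point format the hardware would use.

```python
SEED_BITS = 10  # assumption: seed table indexed by the top 10 input bits


def seed(x):
    """Low-precision reciprocal of the truncated input (stands in for a LUT)."""
    index = x >> (16 - SEED_BITS)
    return 1.0 / (1.0 + index / (1 << SEED_BITS))


def reciprocal_nr(x):
    """One Newton-Raphson step for 1/m, with m = 1 + x/65536."""
    m = 1.0 + x / 65536.0
    y0 = seed(x)
    t = m * y0             # first multiply
    return y0 * (2.0 - t)  # second multiply


def max_error():
    """Largest absolute error over all 65536 inputs."""
    return max(abs(reciprocal_nr(x) - 1.0 / (1.0 + x / 65536.0))
               for x in range(1 << 16))


if __name__ == "__main__":
    print(max_error())
```

If the seed has relative error e, the result has relative error e squared, so a seed good to about 10 bits gives roughly 20 bits after one step.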
> 
> I've been thinking about this for a bit. How about the following? Instead of
> just storing 18 bits of the reciprocal, how about storing both the reciprocal
> and its derivative in those 18 bits? Then we would essentially have a
> quantised approximation to a piecewise linear approximation to 1/x, rather
> than a quantised approximation to 1/x. The numbers would have to be adjusted
> slightly because we truncate rather than round to get the table index, but
> that's doable. The question is how we divide those 18 bits over the two
> numbers.
> 
> Calculating the final number would then be something like:
>   1. Read one 18-bit word, using bits 15:6 of the input as the address.
>   2. Take bits 5:0 of the result, multiply by bits 5:0 of the input, and
>      add to bits 17:6 of the result.
> 
> That would fit the RAM gate constraints for a two-pixel pipeline, and require
> only a single multiplier. The question is how accurate it is and whether it's
> worth it.
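A possible test-program sketch for the packed table, using the 12/6 split from the steps above. The input mapping and output format are still open questions in the thread, so everything here is an assumption: x means m = 1 + x/65536, the 12-bit field counts units of 2**-13 above 0.5, and the 6-bit field is a slope magnitude. Because 1/m decreases, this sketch subtracts the product rather than adding it. All names are hypothetical.

```python
INDEX_BITS = 10   # input bits 15:6 address the table
FRAC_BITS = 6     # input bits 5:0 scale the slope
BASE_BITS = 12    # word bits 17:6 hold the reciprocal
SLOPE_BITS = 6    # word bits 5:0 hold the slope magnitude
OUT_SCALE = 13    # assumption: base counts units of 2**-13 above 0.5
SLOPE_SHIFT = 8   # assumption: extra fractional bits kept in the result


def build_table():
    """Pack a 12-bit base and a 6-bit slope into each 18-bit word."""
    table = []
    for index in range(1 << INDEX_BITS):
        m0 = 1.0 + index / (1 << INDEX_BITS)        # segment start
        m1 = 1.0 + (index + 1) / (1 << INDEX_BITS)  # segment end
        y0, y1 = 1.0 / m0, 1.0 / m1
        base = round((y0 - 0.5) * (1 << OUT_SCALE))
        base = min(base, (1 << BASE_BITS) - 1)      # 1/1.0 would need 13 bits
        slope = round((y0 - y1) * (1 << (OUT_SCALE + SLOPE_SHIFT - FRAC_BITS)))
        slope = min(slope, (1 << SLOPE_BITS) - 1)
        table.append((base << SLOPE_BITS) | slope)
    return table


def reciprocal(table, x):
    """Approximate 1/m for m = 1 + x/65536, x a 16-bit integer."""
    word = table[x >> FRAC_BITS]                    # single table read
    base = word >> SLOPE_BITS
    slope = word & ((1 << SLOPE_BITS) - 1)
    frac = x & ((1 << FRAC_BITS) - 1)
    fixed = (base << SLOPE_SHIFT) - slope * frac    # the one multiply
    return 0.5 + fixed / (1 << (OUT_SCALE + SLOPE_SHIFT))


def max_error(table):
    """Largest absolute error over all 65536 inputs."""
    return max(abs(reciprocal(table, x) - 1.0 / (1.0 + x / 65536.0))
               for x in range(1 << 16))


if __name__ == "__main__":
    print(max_error(build_table()))
```

Changing BASE_BITS and SLOPE_BITS (with matching scale factors) would be one way to compare different splits of the 18 bits. The table here stores exact segment endpoints and does not include the adjustment for truncation mentioned above.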
> 
> What is the input range for this? 16 bits, but what does it map to? And how
> should the output be represented? If I can find the time I might just write a
> test program and see if I can figure out what the best split is and how good
> it is.
> 
> Lourens
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
>