On Wednesday 09 February 2005 23:23, Lourens Veen wrote:
> On Wednesday 09 February 2005 21:19, Daniel Phillips wrote:
> > I vote for 18 bits, which would guarantee 17 bit precision.  The 17th
> > bit would only live for a short while, being fed immediately into the
> > perspective correction multipliers.  To me, every bit of extra
> > precision is gold, so if it's easy to get, get it.
>
<snip>
>
> Anyway, these can be special cased, and then we have 17-bit precision
> (using two RAM blocks). Maybe I can even get rid of the problem by storing
> values in one table and differences in the other. Differences can then be
> in 8.10 fixed format, which may give enough precision to prevent the
> problem. Or I could do the same thing I'm doing for the 14/4 one-read
> version and have 19 bits value and 17 bits difference over the two tables.
> That will definitely give 17 bits of precision, and it saves a subtract
> operation.

Replying to self due to sudden brainwave: 36 = 18 + 9 + 9. Let's assume we 
take two RAM blocks. An extra RAM block will improve performance a lot, and 
there are 40 of them, so I'd say we can afford it.

Now, my 14/4 encoding almost gets 13 bits of precision. I haven't tried, but I
think I can squeeze an actual 13 bits out of 14/5. If I can get 13 bits of
precision out of 14/5, then I can probably also get 17 bits of precision out of
18/9 (since our earlier theoretical work shows that the limitations of linear
interpolation only come into play from 23 bits onwards). That leaves us with 9
bits left over if we read 36 bits per pixel. What I'd like to suggest is an
18/9/9 value/(difference-bias)/(difference*3)>>2 encoding. That looks rather
esoteric, so let me explain.

We want 17 bits of precision, so we need 18-bit values for the start of the
span. The difference to the next value (the beginning of the next span) will be
between 128 and 512 for 18-bit values, so we can cram it into 9 bits using a
small bias. 512*3 = 1536, so we need 11 bits to store difference*3, which we
don't have, so we drop the two least significant bits. The LSB of difference*3
is the same as that of difference (so we don't need to store it again), and the
second least significant bit can be calculated fairly easily from the two least
significant bits of difference (it is their XOR).
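
To make the 18/9/9 packing concrete, here is a minimal Python sketch of it.
It is only an illustration and not the hardware design: the names pack and
unpack, the bit order inside the 36-bit word, and the bias value of 128 are
all assumptions (the bias is simply the lower bound of the difference range
mentioned above).

```python
BIAS = 128  # assumed bias; the text only says "a small bias"


def pack(v, d):
    """Pack an 18-bit value and a difference into one 36-bit word (18/9/9)."""
    assert 0 <= v < (1 << 18)
    assert 128 <= d <= 512          # range of differences given in the text
    d3_hi = (3 * d) >> 2            # difference*3 with its two LSBs dropped
    assert d3_hi < (1 << 9)
    return (v << 18) | ((d - BIAS) << 9) | d3_hi


def unpack(word):
    """Recover v, d and the full d*3 from a packed 36-bit word."""
    v = word >> 18
    d = ((word >> 9) & 0x1FF) + BIAS
    d3_hi = word & 0x1FF
    bit0 = d & 1                    # LSB of d*3 equals the LSB of d
    bit1 = ((d >> 1) ^ d) & 1       # second bit of d*3 is d1 XOR d0
    d3 = (d3_hi << 2) | (bit1 << 1) | bit0
    return v, d, d3
```

With a bias of 128 the two low bits of d are the same as those of the stored
(d - bias) field, so under that assumption the two missing bits of d*3 can be
derived straight from the table output, before the bias is added back.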

So, our reciprocal routine gets a value v, a difference d, and d*3. Now what 
it needs to calculate is v + (d * f) where f is the fraction of the input 
mantissa, which is 16 - 10 = 6 bits. Writing d * f as an addition, it is

d*f0 + (d*2)*f1 + (d*4)*f2 + (d*8)*f3 + (d*16)*f4 + (d*32)*f5

with fn the n'th bit of f.

Now, we have d, and d*3, and we can easily obtain d*2 as d << 1. Now rewrite 
the whole thing as

d*(2*f1 + f0) + 4*d*(2*f3 + f2) + 16*d*(2*f5 + f4)

and note how 2*fn + f(n-1) is in the range [0,3]. Multiplying this by d is 
just a simple MUX selection of 0, d, d << 1 or d*3. Multiplying by 4 or 16 is 
a fixed shift, which is free.
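
As a quick sanity check of the rewrite above, here is a small Python sketch
that models the three MUX selections and two additions. The function name
mul_by_fraction is made up for illustration, and the software list lookup
merely stands in for a 4-way hardware MUX.

```python
def mul_by_fraction(d, d3, f):
    """Compute d*f for a 6-bit f from d and d*3, two bits of f at a time."""
    assert 0 <= f < 64
    choices = (0, d, d << 1, d3)        # the four MUX inputs: 0, d, d*2, d*3
    p0 = choices[f & 3]                 # d * (2*f1 + f0)
    p1 = choices[(f >> 2) & 3]          # d * (2*f3 + f2)
    p2 = choices[(f >> 4) & 3]          # d * (2*f5 + f4)
    # The shifts by 2 and 4 are fixed wiring; only the two additions cost logic.
    return (p0 + (p1 << 2)) + (p2 << 4)
```

Looping over every f from 0 to 63 and comparing against d * f for a handful
of d values confirms that the selection scheme gives the same result as a
real multiply.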

That means that we would be able to do the multiplication with two adders and 
three MUXes. Then we need to add the result to the value which is another 
adder, and we need a simplified adder (for the bias, with a fixed second 
parameter) and a few extra gates to get the values out of the table. A 17-bit 
reciprocal using two RAM blocks, four adders, three MUXes and a few loose 
gates doesn't sound too bad. And it won't block any multipliers either.

I'm going to try it out tomorrow. I'm fairly sure I can get 16 bits (and maybe 
even drop the bias and thus an adder in the process), but not so sure I can 
get 17 bits. On the other hand, if we only put 16 bits in it, won't the extra 
bit in the output be meaningless anyway?

Lourens
