Hi Lourens,
On Friday 11 February 2005 12:08, Lourens Veen wrote:
> I'm a bit worried about the double precision floating point divide in your
> inner loop...
Since 'precision' is a power of two, it is just a shift and even the shift
will disappear when converted to logic. This is merely the int to float
conversion.
> Also, why is frac_bits 2? If you have a 16-bit input, and a 10-bit table
> index, then surely you have a 6-bit fraction for interpolation?
I print out the error for every possible input, so if I allow too many
inputs it takes too long to run. The number of fact_bits mainly affects
the number of distinct inputs and affects the error very little. If you
set it to 6 you will get essentially the same results.
> I'm not sure I quite understand the code.
Sorry! I thought it was too short to need any explaination. (Just joking.)
My less precise interpolation is done with the expression:
sample - ((j*delta) >> frac_bits)
I subtract the ratio rather than add because I want to use an unsigned
multiply. I know the function decreases monotonically, so I obtain delta
by reverse subtraction to give a positive input to the multiply. This
simple form gives an error range of [-3.809740e-06, 2.898030e-06] which is
plus or minus a little less than 1/2**18. That's 17 bits of accuracy, but
I'm greedy so I wanted to see if I could get 18.
I got part of the way there with yesterday's code. Today's code (attached)
does the rest of the job. Instead of throwing away part of the product, I
shift the sample point before adding, thus gaining frac_bits of extra
precision, just enough to hold the exact product. This practically
eliminates the interpolation error, leaving only the sample error, which we
can't do much about. The error is now [-3.809740e-06, 6.415877e-08] which
is mostly negative due to sample quantization, with a tiny positive error
due to interpolation. So I round by adding precision/2 (expressed as an
integer in terms of frac_bits) giving:
(sample << frac_bits) - j*delta + (1 << (frac_bits - 1));
This shifts the improved error range to [-1.902391e-06, 1.971507e-06] or
just a tiny bit more than 1/2**19. So I got my full 18 bits of precision,
less a little sliver that isn't worth worrying about.
Just to be sure, I wrote:
if ((int)(error*precision))
goto quit;
which aborts if any result shows error in the 18th bit. My less precise
interpolation passes this test too, but only by cheating: both positive and
negative error are truncated towards zero, so it will survive an error
range of plus or minus 2**-18, which is only 17 bit precision.
With a little more tweaking, it is no doubt possible to eliminate most of
the logic needed for the extra frac_bits of precision in the sum. Maybe
the logic synthesizer will take care of that automatically. I'm quite
satisfied with the results as they are. As I mentioned yesterday, if
there's actually a use for 19 bits of precision, we could get it by not
storing the most significant bit of the sample, which is always 1.
Regards,
Daniel
#include <stdio.h>
#include <math.h>
#define table_size 1024
#define sample_bits 18
#define frac_bits 2
int main(void)
{
double neg_error = 0, pos_error = 0;
int table[table_size + 1];
int precision = (1 << sample_bits);
int i, j;
for (i = 0; i <= table_size; i++)
table[i] = (1 << sample_bits)/(1 + ((float)i / table_size));
for (i = 0; i < table_size; i++) {
int sample = table[i], delta = sample - table[i + 1];
printf("%i: [0x%x] error: ", i, sample);
for (j = 0; j < 1 << frac_bits; j++) {
#if 1
// more precise: keep extra interpolation bits around
int interp = (sample << frac_bits) - ((j * (long long)delta));
double approx = (double)interp / (precision << frac_bits);
#else
// less precise: truncate interp to same length as sample
int interp = sample - ((j * (long long)delta) >> frac_bits);
double approx = (double)interp / precision;
#endif
double exact = 1/((double)((i << frac_bits) + j) /
(double)(table_size << frac_bits) + 1);
double error = approx - exact;
printf("%.2e ", error);
if (error > pos_error)
pos_error = error;
if (error < neg_error)
neg_error = error;
if ((int)(error*precision))
goto quit;
}
printf("\n");
}
quit:
printf("max error = [%e, %e]\n", neg_error, pos_error);
return 0;
}
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)