As mentioned on IRC, enabling optimizations with MODE=release and choosing
clang++ (6.0 here) or g++ (7.4 here) makes a major difference when benchmarking
exp2f and log2f from glibc against our approximations. On a modern AMD64
processor, glibc is often faster. Internally it also uses polynomials of
roughly order 4, but picks the coefficients from a table depending on the
input argument. That way it achieves errors < 1 ULP, and it is often faster
because it can also use hand-crafted SSE2 implementations.
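To illustrate the table-driven idea, here is a minimal sketch of one common
scheme for exp2. It is not glibc's actual code: the names table_exp2,
Exp2Table and TABLE_SIZE are hypothetical, and the table size of 32 and the
3rd order Taylor polynomial are assumptions chosen for brevity. The table
narrows the argument so far that a very short polynomial suffices.

```cpp
#include <cassert>
#include <cmath>

// Assumption: 32 table entries; a real implementation may use another size.
static const int TABLE_SIZE = 32;

// Table of 2^(j/TABLE_SIZE), filled once at startup to keep the sketch short.
struct Exp2Table {
  double entry[TABLE_SIZE];
  Exp2Table()
  {
    for (int j = 0; j < TABLE_SIZE; j++)
      entry[j] = std::exp2 (double (j) / TABLE_SIZE);
  }
};
static const Exp2Table exp2_table;

static inline float
table_exp2 (float x)
{
  // sketch only: no handling of NaN, infinities or overflow
  assert (std::isfinite (x) && std::fabs (x) < 1000);
  // split x = k + j/TABLE_SIZE + r, integers k and j, |r| <= 1/(2*TABLE_SIZE)
  const double xd = x;
  const long n = std::lround (xd * TABLE_SIZE);
  long k = n / TABLE_SIZE, j = n % TABLE_SIZE;
  if (j < 0)
    {
      j += TABLE_SIZE;
      k -= 1;
    }
  const double r = xd - double (n) / TABLE_SIZE;
  // 2^r = e^z with z = r * ln(2); z is tiny, so a 3rd order Taylor series is enough
  const double z = r * 0.69314718055994530942;
  const double p = 1 + z * (1 + z * (0.5 + z * (1.0 / 6)));
  // recombine: 2^x = 2^k * 2^(j/TABLE_SIZE) * 2^r
  return float (std::ldexp (exp2_table.entry[j] * p, int (k)));
}
```

With |r| <= 1/64 the truncation error of the polynomial stays far below float
precision, which is why a table lookup plus a low order polynomial can be
both accurate and fast.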
I haven't had a chance to benchmark the approximations on musl, but so far,
based on your submission, I'm inclined to integrate the following:
1) Rename bse_approx6_exp2 to fast_exp2() and get rid of the other
approximation variants.
2) Add fast_log2() based on your 6th order version, but with error correction
for integer logarithms.
3) When building for AMD64, use exp2f to implement fast_exp2 and use log2f to
implement fast_log2.
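Item 3) could be sketched roughly as follows. The fallback
poly_exp2_standin() is a hypothetical stand-in (a plain Taylor series), not
bse_approx6_exp2 from the submission, and the preprocessor test is an
assumption about how AMD64 builds would be detected.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the polynomial approximation; NOT the
// bse_approx6_exp2 code, just a 6th order Taylor series for illustration.
static inline float
poly_exp2_standin (float x)
{
  assert (std::isfinite (x));
  const float k = std::floor (x + 0.5f);          // nearest integer part
  const float t = (x - k) * 0.693147180559945f;   // fraction * ln(2), |t| <= ~0.347
  float u = 1.0f / 720;
  u = u * t + 1.0f / 120;
  u = u * t + 1.0f / 24;
  u = u * t + 1.0f / 6;
  u = u * t + 0.5f;
  u = u * t + 1.0f;
  u = u * t + 1.0f;                               // u ~= e^t = 2^(x - k)
  return std::ldexp (u, int (k));                 // scale by 2^k
}

static inline float
fast_exp2 (float x)
{
#if defined (__x86_64__) || defined (_M_X64)
  // AMD64: defer to libm; the float overload of std::exp2 is exp2f
  return std::exp2 (x);
#else
  // other targets: keep the polynomial approximation
  return poly_exp2_standin (x);
#endif
}
```

Callers only ever see fast_exp2(), so the choice between libm and the
polynomial stays a build-time detail; fast_log2() could follow the same
pattern with log2f.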
Here's the error correction I'm talking about. Note that exchanging "long
double" for "float" makes the code significantly slower, because it forces the
compiler to add code to reduce precision. On my machine, this version is
roughly as fast as log2f when compiling with optimizations, with both compilers:
static inline long double G_GNUC_CONST
fast_log2f (float value)
{
  union {
    float f;
    int i;
  } float_u;
  float_u.f = value;
  // compute log_2 using float exponent; 128 is the float bias (127) plus 1,
  // because the polynomial below approximates log2(x) + 1
  const int log_2 = ((float_u.i >> 23) & 255) - 128;
  // replace float exponent, so that x lies within 1..2
  float_u.i &= ~(255 << 23);
  float_u.i += BSE_FLOAT_BIAS << 23;
  long double u, x = float_u.f;
  // lolremez --long-double -d 6 -r 1:2 "log(x)/log(2)+1-0.00000184568668708"
  u = -2.5691088815846393966e-2l;
  u = u * x + 2.7514877034856806734e-1l;
  u = u * x + -1.2669182593669424748l;
  u = u * x + 3.2865287704176774059l;
  u = u * x + -5.3419892025067624343l;
  u = u * x + 6.1129631283200211528l;
  x = u * x + -2.040042118396715321l;
  return x + log_2;
}
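The exponent handling in the function above can be looked at in isolation.
The following is a minimal sketch with hypothetical names (FloatParts,
split_float) that splits a positive normal float into an unbiased exponent
and a mantissa within 1..2. It uses memcpy instead of the union and assumes
the IEEE 754 single precision bias of 127. The posted function subtracts 128
instead, because its polynomial is fitted to log2(x) + 1 according to the
lolremez comment.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

struct FloatParts {
  int   exponent;   // unbiased power of two
  float mantissa;   // within [1, 2)
};

static inline FloatParts
split_float (float value)
{
  // sketch only: zero, denormals, negatives, NaN and infinities are not handled
  assert (std::isnormal (value) && value > 0);
  std::uint32_t bits;
  std::memcpy (&bits, &value, sizeof (bits));
  FloatParts parts;
  // the 8 exponent bits sit above the 23 mantissa bits
  parts.exponent = int ((bits >> 23) & 255) - 127;
  // overwrite the exponent field with the bias, i.e. with 2^0
  bits &= ~(std::uint32_t (255) << 23);
  bits |= std::uint32_t (127) << 23;
  std::memcpy (&parts.mantissa, &bits, sizeof (bits));
  return parts;
}
```

For example, 10.0 splits into exponent 3 and mantissa 1.25, so that
log2 (10) = 3 + log2 (1.25) and only the second term needs the polynomial.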
Error samples (input value, error) compared to log2l(3):
+0.0, -0.00000231613294631
+0.5, +0.00000000000000000
+1.0, +0.00000000000000000
+1.1, -0.00000181973000285
+1.5, -0.00000130387210186
+1.8, -0.00000312228549678
+2.0, +0.00000000000000000
+2.2, -0.00000181973000285
+2.5, -0.00000140048214306
+3.0, -0.00000130387210186
+4.0, +0.00000000000000000
+5.0, -0.00000140048214306
+6.0, -0.00000130387210186
+7.0, -0.00000312228549678
+8.0, +0.00000000000000000
+9.0, -0.00000084878575295
+10.0, -0.00000140048214306
+11.0, -0.00000368176020430
+16.0, +0.00000000000000000
+32.0, +0.00000000000000000
+40.0, -0.00000140048214306
+48.0, -0.00000130387210186
+54.0, -0.00000149844406951
+64.0, +0.00000000000000000
+127.0, -0.00000162654178981
+128.0, +0.00000000000000000
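Samples like the ones above could be produced with a small helper along these
lines. This is a sketch under assumptions: log2_sketch() is a hypothetical
standalone copy of the function above that uses memcpy instead of the union,
drops G_GNUC_CONST and hardcodes 127 for BSE_FLOAT_BIAS. The error is taken
as approximation minus reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Standalone copy of the approximation; assumes BSE_FLOAT_BIAS == 127.
static inline long double
log2_sketch (float value)
{
  assert (std::isnormal (value) && value > 0);
  std::uint32_t bits;
  std::memcpy (&bits, &value, sizeof (bits));
  // float exponent minus (bias + 1), the polynomial yields log2(x) + 1
  const int log_2 = int ((bits >> 23) & 255) - 128;
  // replace float exponent, so that the mantissa lies within 1..2
  bits &= ~(std::uint32_t (255) << 23);
  bits += std::uint32_t (127) << 23;
  float mantissa;
  std::memcpy (&mantissa, &bits, sizeof (bits));
  long double u, x = mantissa;
  u = -2.5691088815846393966e-2l;
  u = u * x + 2.7514877034856806734e-1l;
  u = u * x + -1.2669182593669424748l;
  u = u * x + 3.2865287704176774059l;
  u = u * x + -5.3419892025067624343l;
  u = u * x + 6.1129631283200211528l;
  x = u * x + -2.040042118396715321l;
  return x + log_2;
}

// Signed error of the approximation against the long double log2.
static inline long double
log2_error (float value)
{
  return log2_sketch (value) - std::log2 ((long double) value);
}
```

Printing log2_error (v) for a list of inputs with a "%+.17Lf" format gives
one line per sample in the layout used above.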
--
View it on GitHub:
https://github.com/tim-janik/beast/pull/124#issuecomment-530135495