On Friday, 24 May 2019 at 08:33:34 UTC, Ola Fosheim Grøstad wrote:
On Thursday, 23 May 2019 at 21:47:45 UTC, Alex wrote:
Either way, sin is still twice as fast. Also, in the code the sinTab version is missing the writeln, so it would have been faster... so it is not being optimized out.

Well, when I run this modified version:

https://gist.github.com/run-dlang/9f29a83b7b6754da98993063029ef93c

on https://run.dlang.io/

then I get:

LUT:    709
sin(x): 2761

So the LUT is 3-4 times faster even with your quarter-LUT overhead.

FWIW, as far as I can tell I managed to get the lookup version down to 104 by using bit manipulation tricks like these:

auto fastQuarterLookup(double x){
    const ulong mantissa = cast(ulong)( (x - floor(x)) * (cast(double)(1UL<<63)*2.0) );
    const double sign = cast(double)(-cast(uint)((mantissa>>63)&1));
    … etc

So it seems like a quarter-wave LUT is 27 times faster than sin…

You just have to make sure that the generated instructions fill the entire CPU pipeline.
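
For anyone who wants to see the basic idea without the bit tricks, here is a minimal sketch of a plain quarter-wave lookup. It is not the code from the gist and not the 104 version. The names makeQuarterTable and quarterLookup, the table size of 1024, and the convention that x is a phase in turns (1.0 is one full period) are all assumptions made for illustration. It uses ordinary integer division and truncating indexing, so it will be slower and coarser than a tuned version:

```d
import std.math : PI, fabs, floor, sin;
import std.stdio : writefln;

// Build sin over [0, pi/2] with n intervals (n + 1 entries, so that
// the mirrored read in quadrants 1 and 3 can index tab[n]).
double[] makeQuarterTable(size_t n)
{
    auto tab = new double[n + 1];
    const double step = (PI / 2) / n;
    foreach (i; 0 .. n + 1)
        tab[i] = sin(i * step);
    return tab;
}

// x is the phase in turns (assumption): 1.0 is one full period,
// so the result approximates sin(2 * PI * x).
double quarterLookup(const double[] tab, double x)
{
    const size_t n = tab.length - 1;
    const double phase = x - floor(x);               // wrap into one period
    const size_t idx = cast(size_t)(phase * 4 * n);  // position within the full wave
    const size_t quadrant = idx / n;                 // which quarter of the wave
    const size_t offset = idx % n;                   // position inside that quarter
    // Odd quadrants read the table backwards, the upper two negate the result.
    const double v = (quadrant & 1) != 0 ? tab[n - offset] : tab[offset];
    return (quadrant & 2) != 0 ? -v : v;
}

void main()
{
    auto tab = makeQuarterTable(1024);
    foreach (x; [0.0, 0.1, 0.25, 0.4, 0.6, 0.9, -0.3])
        writefln("x=%s  lut=%s  sin=%s", x, quarterLookup(tab, x), sin(2 * PI * x));
}
```

With 1024 entries per quarter and no interpolation, the error against sin should stay below roughly 2*PI/4096, so this only shows the mirroring and sign logic, not a precision or speed target.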

