Date: Wed, 7 Mar 2018 00:33:04 -0800
From: Eitan Adler <li...@eitanadler.com>
| I'd like to commit the patch below. Does anyone have concerns with it?
The change looks fine technically, but it would be good to see some benchmark
results before committing it - particularly for the more common case where
x != 1.0 (but including where x == 2.0 or 0.5) - the change switches from using
arithmetic and a single branch to multiple branches (or would with simplistic
compilation) and branches are slow wrt prefetch and parallel execution.
Do make sure the benchmark measures atan2() itself though, not the loop which
surrounds it: make a loop with a lot of atan2() calls in it, one after another
(even calling over and over again with the same arg.)
If the new version slows things down (which it might have once, but now the
compilers are smarter, and there might be no difference) then perhaps we
can find some other way of writing the expression that avoids overflows, and
is still fast.