On 03.02.19 at 06:26, J. Gareth Moreton wrote:
Hi everyone,
So I'm looking to improve some of the mathematical routines. However,
not all of them are internal functions; some are stored in the Math
unit. Some of them are written in assembly language but use the old
floating-point stack, or use a slow hack when there's a good alternative
available in SSE 4.1, for example, and I would like to see about
rewriting some of these functions for x86_64. However, while I can
safely assume the presence of SSE2 on this architecture, what's the best
way to detect if "-CpCOREAVX" etc. are specified? Also, if "-CpCOREAVX"
is given, does it automatically set "-CfAVX" as well? I'd rather make
sure I'm not
making incorrect assumptions before I start writing assembly language
routines.
As an example of a function that can benefit from a speed-up under
x86_64... the floor() and floor64() functions:
  function floor64(x: float): Int64;
  begin
    Result := Trunc(x) - ord(Frac(x) < 0);
  end;
For time-critical code, this is not ideal because, besides being a
function call itself, it calls Trunc and Frac, performs a subtraction,
and has another implicit subtraction and assignment due to the
condition. Under SSE4.1,
this could be optimised to something like the following:
Better make it inline, detect the node pattern and then generate the
right instructions depending on the fpu switches. While this is still a
"micro" optimization, this way it has its maximum benefit and does not
clutter the RTL units with assembler, and user code using similar
sequences benefits from it as well.
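
To illustrate only the "make it inline" half of this suggestion, here is a
minimal sketch. The name Floor64Sketch is hypothetical and is not the RTL
routine; the body is the floor64 expression quoted above, with Double
assumed in place of the Math unit's float type. The node pattern detection
in the compiler is not shown.

```pascal
program FloorInlineSketch;

{$mode objfpc}
{$inline on}

// Hypothetical helper, not the Math unit's floor64: the same expression,
// marked inline so the compiler may expand it at the call site.
function Floor64Sketch(x: Double): Int64; inline;
begin
  // Trunc rounds towards zero, so subtract 1 when the fractional
  // part is negative (Ord(True) = 1, Ord(False) = 0).
  Result := Trunc(x) - Ord(Frac(x) < 0);
end;

begin
  WriteLn(Floor64Sketch(2.5));   // 2
  WriteLn(Floor64Sketch(-2.5));  // -3
  WriteLn(Floor64Sketch(-2.0));  // -2
end.
```

Since the expression stays in plain Pascal, any instruction selection
(x87, SSE2, SSE4.1) would be left to the code generator and the fpu
switches rather than hard-coded in the unit.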
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel