Am 12.06.2018 um 14:42 schrieb J. Gareth Moreton:
Hi everyone,
Sorry to flood the mailing list again with more ideas and experiments.
I would like to propose introducing a new pair of in-built functions for the
compiler.
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns the
base-2 logarithm of X, rounded up.
To give examples where these functions could be useful, FLog2(X) + 1 indicates the minimum number of bits required to
store X as an unsigned integer, while CLog2(X) is equal to the maximum number of iterations required for a binary search
of an X-sized list.
Why should they be in-built though? With the binary search example, the size of the list is sometimes known at compile
time, hence is a constant - therefore, its logarithm is also a constant. By pre-calculating the logarithm using the
in-built function, it can be used to aid optimization such as loop unrolling. It also makes them easier to inline,
where FLog2(X) on x86_64-win64 translates to a single line of assembly language: BSR EAX, ECX (unless X is zero, in
which case ZF is set and EAX is undefined).
If there had to be a slight nuance though, it's what happens if X = 0, since log(0) = -oo, which cannot be stored as an
integer and may cause problems with the compiler. I would propose it return 0 as a special case, as this will fix most
problems when it comes to loops and storage, and also ensure that FLog2 and CLog2 are "entire functions". To give an
optimised example of FLog(2) in x86-64 assembly language:
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.
Some kind of deep optimization could be used if the input is known to be non-zero and remove the instructions either
side of BSR.
Here again: I would first try find a good pascal implementation of the two functions, then check, what FPC produces,
actually, FPC has already a BsrDWord function which is very similar to FLog2, though it returns 255 for 0. And then I
would propose to improve the optimizer to generate code close to hand coded assembler.
_______________________________________________
fpc-devel maillist - [email protected]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel