Am 12.06.2018 um 14:42 schrieb J. Gareth Moreton:
Hi everyone,

Sorry to flood the mailing list again with more ideas and experiments.

I would like to propose introducing a new pair of in-built functions for the 
compiler.

function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;

FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns the 
base-2 logarithm of X, rounded up.

To give examples where these functions could be useful, FLog2(X) + 1 indicates the minimum number of bits required to store X as an unsigned integer, while CLog2(X) is equal to the maximum number of iterations required for a binary search of an X-sized list.

Why should they be in-built though? With the binary search example, the size of the list is sometimes known at compile time, hence is a constant - therefore, its logarithm is also a constant.  By pre-calculating the logarithm using the in-built function, it can be used to aid optimization such as loop unrolling.  It also makes them easier to inline, where FLog2(X) on x86_64-win64 translates to a single line of assembly language: BSR EAX, ECX (unless X is zero, in which case ZF is set and EAX is undefined).

If there had to be a slight nuance though, it's what happens if X = 0, since log(0) = -oo, which cannot be stored as an integer and may cause problems with the compiler.  I would propose it return 0 as a special case, as this will fix most problems when it comes to loops and storage, and also ensure that FLog2 and CLog2 are "entire functions".  To give an optimised example of FLog(2) in x86-64 assembly language:

XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.

Some kind of deep optimization could be used if the input is known to be non-zero and remove the instructions either side of BSR.


Here again: I would first try find a good pascal implementation of the two functions, then check, what FPC produces, actually, FPC has already a BsrDWord function which is very similar to FLog2, though it returns 255 for 0. And then I would propose to improve the optimizer to generate code close to hand coded assembler.

_______________________________________________
fpc-devel maillist  -  [email protected]
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel

Reply via email to