[fpc-devel] Undefined symbol during linking - any suggestions?

2021-09-03 Thread Gennady Agranov via fpc-devel
Hi, I have a rather large program that fails to link with "Undefined symbol: .Lj3016". The only difference from previous builds, where linking succeeded, is that I added the "indylaz" package; I am not sure whether there is a connection. What can I do or try to diagnose or resolve this issue? (9022) Compiling

Re: [fpc-devel] The "magic div" algorithm

2021-09-03 Thread J. Gareth Moreton via fpc-devel
Fixed the problem.  I was deallocating a register too soon, so it got overwritten in some cases while still in use.  Running tests again. Gareth aka. Kit

Re: [fpc-devel] The "magic div" algorithm

2021-09-03 Thread J. Gareth Moreton via fpc-devel
Div by 3 has a very elegant implementation.  In 32-bit, it's:

    movl    $0xAAAAAAAB,%eax
    mull    Input           ; answer is in %edx:%eax
    shrl    $1,%edx
    movl    %edx, Result

As Marģers discovered, you can reduce the cycle count and minimise pipeline stalls by extending to 64-bit and applying the shift to
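For readers who want to check the arithmetic, here is a minimal sketch of the same multiply-and-shift trick in Python. It assumes unsigned 32-bit inputs and the usual reciprocal constant 0xAAAAAAAB, which is (2^33 + 1) / 3. The function names are made up for illustration and are not taken from the compiler sources. The second function is only one possible reading of "extending to 64-bit": it applies a single shift of 33 to the whole 64-bit product instead of shifting the high half by 1.

```python
MASK32 = 0xFFFFFFFF
MAGIC_DIV3 = 0xAAAAAAAB  # (2**33 + 1) // 3


def div3_u32(x: int) -> int:
    """Unsigned 32-bit x // 3, mirroring the mull / shrl sequence."""
    if not 0 <= x <= MASK32:
        raise ValueError("x must fit in an unsigned 32-bit value")
    product = x * MAGIC_DIV3   # 64-bit product, as mull leaves in %edx:%eax
    high = product >> 32       # the %edx half
    return high >> 1           # shrl $1,%edx


def div3_u32_wide(x: int) -> int:
    """Same result with one shift applied to the full 64-bit product."""
    if not 0 <= x <= MASK32:
        raise ValueError("x must fit in an unsigned 32-bit value")
    return (x * MAGIC_DIV3) >> 33


if __name__ == "__main__":
    for sample in (0, 1, 2, 3, 100, 0x7FFFFFFF, 0xFFFFFFFF):
        assert div3_u32(sample) == sample // 3 == div3_u32_wide(sample)
    print("all samples match")
```

The 16-bit analogue of the constant is 0xAAAB with a total shift of 17, so the width of the constant has to match the width of the operand.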

Re: [fpc-devel] The "magic div" algorithm

2021-09-03 Thread Ched via fpc-devel
Very interesting, Gareth! Is the div-by-7 related to 2 to the 3rd? If yes, is it possible to design a div-by-3 with similar magic? Cheers, Ched' On 03.09.21 at 15:35, J. Gareth Moreton via fpc-devel wrote: Hey Marģers, So I've been experimenting with your suggestion, and it looks
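As background to the question above, the following sketch shows one textbook way (the Granlund-Montgomery condition) to find a multiplier and shift for any unsigned divisor. It is not necessarily what the compiler does, and the helper name magic_for is hypothetical. It searches for the smallest extra shift s such that m = ceil(2^(bits+s) / d) satisfies m*d - 2^(bits+s) <= 2^s. For d = 3 this gives s = 1 and a multiplier that fits in 32 bits. For d = 7 it gives s = 3, and the multiplier needs 33 bits, which is why 7 is a harder case than 3 on a 32-bit multiply.

```python
def magic_for(d: int, bits: int = 32) -> tuple[int, int]:
    """Return (m, s) such that x // d == (x * m) >> (bits + s)
    for every unsigned x below 2**bits."""
    if not 1 <= d < (1 << bits):
        raise ValueError("divisor out of range")
    for s in range(bits + 1):
        power = 1 << (bits + s)
        m = -(-power // d)               # ceil(power / d)
        if m * d - power <= (1 << s):    # rounding error small enough
            return m, s
    raise AssertionError("no shift found; should not happen for valid d")


if __name__ == "__main__":
    for d in (3, 7):
        m, s = magic_for(d)
        print(f"d={d}: m={m:#x} ({m.bit_length()} bits), extra shift={s}")
        for x in (0, 1, d - 1, d, 12345, 0xFFFFFFFF):
            assert (x * m) >> (32 + s) == x // d
```

The extra shift never exceeds ceil(log2(d)), so the 3 in the div-by-7 case does line up with 2 to the 3rd being the next power of two above 7.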

Re: [fpc-devel] The "magic div" algorithm

2021-09-03 Thread J. Gareth Moreton via fpc-devel
Hey Marģers, So I've been experimenting with your suggestion, and it looks like a resounding success!  I added some new tests to the "bdiv" bench test to see how it performs.  16-bit multiplications don't get improved as well as they could be on x86_64 because the intermediate values are all