Re: [fpc-devel] Division optimisations

2021-09-12 Thread J. Gareth Moreton via fpc-devel
Only two changes appear in the compiled RTL, both in SysUtils, but one is pretty significant.  Here's the "IsLeapYear" function under i386-win32:

Trunk:
.section .text.n_sysutils_$$_isleapyear$word$$boolean,"ax"
    .balign 16,0x90
.globl    SYSUTILS_$$_ISLEAPYEAR$WORD$$BOOLEAN
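For context, a leap-year check consists almost entirely of "(x mod c) = 0" tests against constants, which is why it is a natural candidate for this kind of optimisation. The following is a minimal C sketch of the Gregorian rule, not the RTL source; the name `is_leap_year` is hypothetical, and the 16-bit parameter is only assumed from the `$word` in the mangled symbol above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: Gregorian leap-year rule written with explicit
   remainder tests.  Not taken from the Free Pascal RTL. */
bool is_leap_year(uint16_t year)
{
    /* Divisible by 4, and either not a century year or divisible by 400. */
    return (year % 4 == 0) && ((year % 100 != 0) || (year % 400 == 0));
}
```

Each of the three remainder tests compares against zero with a constant divisor, so none of them needs the actual quotient or remainder value.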

Re: [fpc-devel] Division optimisations

2021-09-11 Thread J. Gareth Moreton via fpc-devel
So I've got some pretty good headway so far!

Trunk:
  Unsigned 32-bit (n mod 3) = 0 - Pass - average iteration duration: 0.757 ns
  Signed 32-bit (n mod 3) = 0 - Pass - average iteration duration: 6.403 ns
  Unsigned 32-bit (n mod 10) = 0 - Pass - average
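One well-known way to evaluate "(n mod c) = 0" for an unsigned 32-bit value without dividing is to multiply by the modular inverse of the odd part of c and compare against a threshold. The C sketch below shows that technique for 3 and 10. It is an assumption that the work discussed in this thread uses something similar; the preview does not say. The function names are hypothetical, and only the unsigned case is covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* (n mod 3) = 0 for unsigned 32-bit n.
   0xAAAAAAAB is the multiplicative inverse of 3 modulo 2^32 and
   0x55555555 is floor((2^32 - 1) / 3).  Multiples of 3 map onto
   0..0x55555555; every other value maps above that range. */
bool is_multiple_of_3(uint32_t n)
{
    return (uint32_t)(n * UINT32_C(0xAAAAAAAB)) <= UINT32_C(0x55555555);
}

/* Rotate a 32-bit value right by one bit. */
static uint32_t ror1(uint32_t x)
{
    return (x >> 1) | (x << 31);
}

/* (n mod 10) = 0 for unsigned 32-bit n.
   10 = 2 * 5: multiply by the inverse of 5 modulo 2^32 (0xCCCCCCCD),
   then rotate right by one so that an odd product (n not divisible by 2)
   ends up with its top bit set and fails the comparison.
   0x19999999 is floor((2^32 - 1) / 10). */
bool is_multiple_of_10(uint32_t n)
{
    return ror1((uint32_t)(n * UINT32_C(0xCCCCCCCD))) <= UINT32_C(0x19999999);
}
```

Both checks compile to a multiply, an optional rotate and a compare, with no division instruction involved.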

Re: [fpc-devel] Division optimisations

2021-09-10 Thread J. Gareth Moreton via fpc-devel
I suppose in truth, I can, and that in itself is probably fairly cross-platform (although I'll stick with x86 for the moment and get that working).  Sometimes the simple solution eludes me!  Is there anything I need to take into account when it comes to range checking (that is, if a third

Re: [fpc-devel] Division optimisations

2021-09-10 Thread Florian Klämpfl via fpc-devel
On 10.09.21 at 21:17, J. Gareth Moreton via fpc-devel wrote: Hi everyone, I'm looking at ways to optimise div and mod, starting with x86 and then probably AArch64.  The obvious one is attempting to merge "Q := N div D; R := N mod D;", where D is a variable (but invariant between the two

[fpc-devel] Division optimisations

2021-09-10 Thread J. Gareth Moreton via fpc-devel
Hi everyone, I'm looking at ways to optimise div and mod, starting with x86 and then probably AArch64.  The obvious one is attempting to merge "Q := N div D; R := N mod D;", where D is a variable (but invariant between the two instructions), since DIV returns the quotient in R/EAX and the
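The merge is possible because the x86 DIV instruction produces the quotient and the remainder in a single operation (for 32-bit operands, in EAX and EDX respectively). As a rough illustration in C rather than Pascal, the sketch below computes both results from one written division; the type `divmod_u32` and the function `divmod_u32_once` are hypothetical names, and this is not the compiler patch itself.

```c
#include <stdint.h>

/* Hypothetical helper type: both results of one unsigned division. */
typedef struct {
    uint32_t quot;  /* n div d */
    uint32_t rem;   /* n mod d */
} divmod_u32;

/* Returns n div d and n mod d together.  d must be non-zero.
   Only one division is written; the remainder is derived from the
   quotient, which mirrors reusing the results of a single DIV. */
divmod_u32 divmod_u32_once(uint32_t n, uint32_t d)
{
    divmod_u32 r;
    r.quot = n / d;
    r.rem  = n - r.quot * d;
    return r;
}
```

The precondition matters for the optimisation too: the two expressions can only share one division if D is unchanged between them, as the message notes.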