On 20.07.2020 at 02:37, J. Gareth Moreton wrote:
On 19/07/2020 22:37, Stefan Glienke wrote:
clang and gcc emit this; I would guess they detect quite a few common
patterns like this.
...
cmp eax, edx
mov edx, -1
setg al
movzx eax, al
cmovl eax, edx
ret
I think I can make improvements to that already! (Note the sequence
above and below are in Intel notation)
CMP EAX, EDX
MOV EAX, 0 ; Note: don't use XOR EAX, EAX because this scrambles the
           ; FLAGS register
MOV EDX, -1
SETG AL
CMOVL EAX, EDX
RET
I believe that executes one cycle faster (20% faster for the entire
sequence) on modern processors because it shortens the dependency
chain that exists between "SETG AL; MOVZX EAX, AL; CMOVL EAX, EDX". It
might require some testing though to be sure.
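
To make the register-level reasoning above easier to follow, here is a minimal C sketch that models the second sequence one instruction at a time. The function name three_way_model and the local variables standing in for registers and flags are hypothetical; this is only a model of the semantics and says nothing about timing.

```c
#include <stdint.h>

/* Hypothetical model of the CMP / MOV / MOV / SETG / CMOVL sequence.
   Locals stand in for registers and flag conditions. */
int32_t three_way_model(int32_t a, int32_t b)
{
    /* CMP EAX, EDX: the flags now encode the signed comparison. */
    int greater = a > b;
    int less = a < b;

    /* MOV EAX, 0: clears EAX and leaves the flags alone.
       (XOR EAX, EAX at this point would overwrite them.) */
    int32_t eax = 0;

    /* MOV EDX, -1 */
    int32_t edx = -1;

    /* SETG AL: writes only the low byte; the upper 24 bits stay zero
       because EAX was cleared beforehand, so no MOVZX is needed. */
    eax = (eax & ~0xFF) | greater;

    /* CMOVL EAX, EDX: replace the result with -1 when a < b. */
    if (less)
        eax = edx;

    return eax;
}
```

The model shows why the zeroing has to happen before SETG: SETG touches only AL, so the rest of EAX must already be clear.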
That is what clang does (the first snippet I posted), using ecx for
the 0, and it does so with the shorter xor placed before the cmp, which
results in 16 bytes of code; gcc's is 17, yours 19.
Anyhow, none of them differ in execution speed, but all are 2.5 times
faster than the double cmp and conditional jump galore ;)
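
For reference, here is a sketch in C of the two source-level shapes being compared: a three-way compare written with two comparisons and early returns, and a branchless equivalent. The function names are hypothetical, and whether a given compiler turns either form into a SETcc/CMOVcc sequence depends on the compiler, version and optimization flags.

```c
/* Branchy form: two comparisons, which a naive translation turns into
   two CMPs and two conditional jumps. */
int compare_branchy(int a, int b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/* Branchless form: in C each comparison evaluates to 0 or 1,
   so the difference is -1, 0 or 1. */
int compare_branchless(int a, int b)
{
    return (a > b) - (a < b);
}
```

Both functions return the same value for every pair of inputs; only the generated code can differ.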
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel