On 20.07.2020 at 02:37, J. Gareth Moreton wrote:

On 19/07/2020 22:37, Stefan Glienke wrote:
clang and gcc emit this; I would guess they detect quite a few common patterns like this.

 ...
  cmp     eax, edx
  mov     edx, -1
  setg    al
  movzx   eax, al
  cmovl   eax, edx
  ret
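For readers who do not read x86 fluently, the sequence above can be mirrored in C one step at a time. This is only a sketch: the original source that produced the assembly is not shown in this message, so the function name `compare_like_asm` and the assumption that both inputs are signed 32-bit integers are mine.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): mirrors the quoted
   sequence step by step, assuming signed 32-bit operands. */
int32_t compare_like_asm(int32_t a, int32_t b)
{
    int32_t minus_one = -1;     /* mov   edx, -1  (mov leaves the flags alone) */
    int32_t result = (a > b);   /* setg  al ; movzx eax, al  -> 0 or 1 */
    if (a < b)                  /* cmovl eax, edx -> overwrite when a < b */
        result = minus_one;
    return result;              /* -1, 0 or 1 */
}
```

The `if` here only models the conditional move; in the assembly there is no jump at all.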

I think I can make improvements to that already! (Note: the sequences above and below are in Intel notation.)

CMP   EAX, EDX
MOV   EAX, 0 ; Note: don't use XOR EAX, EAX because this scrambles the FLAGS register
MOV   EDX, -1
SETG   AL
CMOVL EAX, EDX
RET

I believe that executes one cycle faster (20% faster for the entire sequence) on modern processors, because it shortens the dependency chain that exists between "SETG AL; MOVZX EAX, AL; CMOVL EAX, EDX". It might require some testing to be sure, though.

That is what clang does (the first snippet I posted): it uses ecx for the 0 and sets it with the shorter xor before the cmp, which results in 16 bytes of code. gcc's version is 17 bytes, yours 19.

Anyhow, none of them differ in execution speed, but they are all 2.5 times faster than the double cmp and conditional jump galore ;)
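To make the comparison concrete, here is a sketch in C of the two shapes being discussed. The function names are hypothetical and signed 32-bit operands are assumed, since the code that was actually benchmarked is not part of this message. Whether a given compiler turns the first form into jumps or into setcc/cmov depends on the compiler and its optimization settings.

```c
#include <stdint.h>

/* The "double cmp and conditional jump" shape: two comparisons,
   each followed by a branch when compiled literally. */
int32_t compare_branchy(int32_t a, int32_t b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}

/* A common branchless idiom: each comparison yields 0 or 1,
   so the difference is -1, 0 or 1 with no jump needed. */
int32_t compare_branchless(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}
```

Both functions return the same value for every pair of inputs; only the generated code differs.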


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
