> On 16.04.2022 at 06:49, J. Gareth Moreton via fpc-devel
> <[email protected]> wrote:
>
> Hi everyone,
>
> In the x86_64 assembly dumps, I frequently come across combinations such as
> the following:
>
> cmpl %ebx,%edx
> cmovll %ebx,%eax
> cmovnll %edx,%eax
>
> This is essentially the ternary C operator "x = cond ? trueval : falseval",
> or in Pascal "if (cond) then x := trueval else x := falseval;". However,
> because the CMOV instructions have exact opposite conditions, is it better to
> optimise it into this?
>
> movl %ebx,%eax
> cmpl %ebx,%edx
> cmovnll %edx,%eax
>
> It's smaller, but is it actually faster (or the same speed)? At the very
> least, the two CMOV instructions depend on the CMP instruction being
> completed, but I'm not sure if the second CMOV depends on the first one being
> evaluated (because of %eax). With the second block of code, the MOV and CMP
> instructions can execute simultaneously.
>
> My educated guess tells me that MOV/CMP/CMOV(~c) is faster than
> CMP/CMOVc/CMOV(~c), but I haven't been able to find an authoritative source on
> this yet.
cmov is normally slow, so the latter should be slower; a brief test also
shows this.
$ cat tbench1.pp
procedure p;
  var
    a,b,c : array[0..100] of longint;
    i,j,e,f,g : longint;
  begin
    for j:=low(a) to high(a) do
      begin
        a[j]:=random(10);
        b[j]:=random(10);
      end;
    for i:=1 to 10000000 do
      for j:=low(a) to high(a) do
        begin
          e:=a[j];
          f:=b[j];
          g:=e;
          if e<f then
            g:=f;
          c[j]:=g;
        end;
  end;

begin
  p;
end.
$ time ./tbench1
real 0m0.752s
user 0m0.748s
sys 0m0.004s
$ cat tbench2.pp
procedure p;
  var
    a,b,c : array[0..100] of longint;
    i,j,e,f,g : longint;
  begin
    for j:=low(a) to high(a) do
      begin
        a[j]:=random(10);
        b[j]:=random(10);
      end;
    for i:=1 to 10000000 do
      for j:=low(a) to high(a) do
        begin
          e:=a[j];
          f:=b[j];
          if e<f then
            g:=f
          else
            g:=e;
          c[j]:=g;
        end;
  end;

begin
  p;
end.
$ time ./tbench2
real 0m0.997s
user 0m0.997s
sys 0m0.000s
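
The two programs above differ only in the loop body, so the comparison can
also be run from a single executable. The following is a rough sketch rather
than the benchmark used for the numbers above: the program name tbench_both
and the procedure names SelectMovThenCond and SelectIfElse are made up for
illustration, and it assumes GetTickCount64 from SysUtils is precise enough
for runs of this length. Note that the arrays are global here (they are
locals in the programs above), which may change the generated code, so the
timings are not directly comparable with those shown above.

```pascal
program tbench_both;
{$mode objfpc}

uses
  SysUtils;

const
  Iterations = 10000000;

type
  TData = array[0..100] of longint;

var
  a, b, c1, c2: TData;

{ Variant 1 (as in tbench1): unconditional assignment, then a
  conditional overwrite. }
procedure SelectMovThenCond;
var
  i, j, e, f, g: longint;
begin
  for i := 1 to Iterations do
    for j := low(a) to high(a) do
    begin
      e := a[j];
      f := b[j];
      g := e;
      if e < f then
        g := f;
      c1[j] := g;
    end;
end;

{ Variant 2 (as in tbench2): both branches assign g. }
procedure SelectIfElse;
var
  i, j, e, f, g: longint;
begin
  for i := 1 to Iterations do
    for j := low(a) to high(a) do
    begin
      e := a[j];
      f := b[j];
      if e < f then
        g := f
      else
        g := e;
      c2[j] := g;
    end;
end;

var
  j: longint;
  t0, t1, t2: QWord;
begin
  { Same input data for both variants. }
  for j := low(a) to high(a) do
  begin
    a[j] := random(10);
    b[j] := random(10);
  end;

  t0 := GetTickCount64;
  SelectMovThenCond;
  t1 := GetTickCount64;
  SelectIfElse;
  t2 := GetTickCount64;

  { Both variants compute max(a[j], b[j]), so the results must agree. }
  for j := low(a) to high(a) do
    if c1[j] <> c2[j] then
    begin
      writeln('mismatch at index ', j);
      halt(1);
    end;

  writeln('assign + conditional overwrite: ', t1 - t0, ' ms');
  writeln('if/else:                        ', t2 - t1, ' ms');
end.
```

Checking the assembler output (for example with the -al compiler switch) is
still needed to confirm which instruction sequence each variant actually
compiles to at the chosen optimisation level.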