-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello!
I just tested the prerelease of gcc 4.0 (to see whether my programs
still compile and work), and I must say: Congratulations, no real
problems so far.
But I noticed some smaller optimization issues on x86, and one of them is
a regression from gcc 3.3, so I'm reporting it here. Accept my apologies
if this is already known, but I think it's worth noting.
So, for the real stuff, take this C program:
=============================== example1.c ===================
#include <stdio.h>
void test()
{
    int i;
    for (i=10; i>=0; i--) {
        printf("%d\n", i);
    }
}
int main()
{
    test();
    return 0;
}
===============================
Everything was compiled using:
gcc -S -O3 -fomit-frame-pointer -o output input
gcc 3.3 compiles the test() function to the following x86 assembly:
===============================
test:
        pushl   %ebx
        subl    $8, %esp
        movl    $10, %ebx
        .p2align 4,,15
.L30:
        movl    %ebx, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        decl    %ebx
        jns     .L30
        addl    $8, %esp
        popl    %ebx
        ret
===============================
I don't think that can be improved.
But gcc 4.0 apparently thinks it can! It compiles the very same code to:
===============================
test:
        pushl   %esi
        movl    $-1, %esi               [1]
        pushl   %ebx
        movl    $10, %ebx
        subl    $20, %esp               [2]
        .p2align 4,,15
.L2:
        movl    %ebx, 4(%esp)
        decl    %ebx
        movl    $.LC0, (%esp)
        call    printf
        cmpl    %ebx, %esi              [3]
        jne     .L2
        addl    $20, %esp
        popl    %ebx
        popl    %esi
        ret
===============================
[1] Why keep the -1 constant in %esi? The cmpl with an immediate operand
is only 1 byte longer, which doesn't justify tying up a register (and
the extra pushl/popl to save it).
[2] It's allocating 5 words on the stack while 2 would be enough. I know
that gcc isn't very smart at optimizing stack slots, but this is a
regression.
[3] Why use the cmpl at all? gcc 3.3 did this right with a decl followed
directly by jns. I don't think the cmpl is faster than a decl (and even
then, the decl/cmpl pair could be replaced by a "subl $1, %ebx" placed
after the call).
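The difference between the two loop exits can also be written out in C.
This is only a minimal sketch, not something gcc produces: the helper
names count_sign_test and count_compare_test and the "out" buffer are
made up for illustration, and both helpers assume start >= 0.

```c
#include <stdio.h>

/* Loop shape of the gcc 3.3 output: decrement, then branch on the sign
   of the result (decl + jns). Hypothetical helper, assumes start >= 0. */
static int count_sign_test(int start, int *out)
{
    int n = 0;
    int i = start;
    do {
        out[n++] = i;           /* stands in for the printf call */
    } while (--i >= 0);
    return n;                   /* number of iterations */
}

/* Loop shape of the gcc 4.0 output: decrement, then compare against
   the constant -1 (decl + cmpl + jne). Hypothetical helper, assumes
   start >= 0. */
static int count_compare_test(int start, int *out)
{
    int n = 0;
    int i = start;
    do {
        out[n++] = i;           /* stands in for the printf call */
        i--;
    } while (i != -1);
    return n;                   /* number of iterations */
}
```

For a non-negative start value both shapes visit the same values in the
same order, so the explicit comparison against -1 buys nothing over the
sign test.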
NB: When gcc inlines this function, it will be compiled to
===============================
main:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movl    $10, %ebx
        subl    $20, %esp
        andl    $-16, %esp
        subl    $16, %esp
        .p2align 4,,15
.L9:
        movl    %ebx, 4(%esp)
        decl    %ebx
        movl    $.LC0, (%esp)
        call    printf
        cmpl    $-1, %ebx               <-----
        jne     .L9
        movl    -4(%ebp), %ebx
        xorl    %eax, %eax
        leave
        ret
==============================
(test() is inlined in main())
As you can see, gcc now doesn't keep the -1 constant in a register.
Quite odd, I think.
**********************************************************************
Now for a second example:
============================== example2.c ===================
#include <stdio.h>
int i;
void test()
{
    for (i=10; i>=0; i--) {
        printf("%d\n", i);
    }
}
int main()
{
    test();
    return 0;
}
==============================
This is roughly the same as example 1, but "i" is now a global variable.
We can look directly at how gcc-4.0 compiles this, because gcc-3.3
produces almost the same code:
==============================
test:
        movl    $10, %eax               [2]
        subl    $12, %esp               [1]
        movl    %eax, i
        movl    $10, %eax               [2]
        .p2align 4,,15
.L2:
        movl    %eax, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        movl    i, %eax
        decl    %eax                    [3]
        testl   %eax, %eax              [3]
        movl    %eax, i
        jns     .L2
        addl    $12, %esp
        ret
==============================
[1] Again, the wasted stack. gcc-3.3 doesn't get this right either.
[2] Even a peephole optimizer could remove the redundant second
"movl $10, %eax" :)
[3] The testl is unneeded; the sign flag is already set by the decl, and
the movl that follows doesn't touch the flags.
Is this a hard optimization to accomplish? It's quite obvious to a
human, but I don't know how this looks from a compiler's perspective...
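For comparison, here is a minimal sketch of what example 2 looks like
when the loop runs on a local copy and the global is written only once
at the end. The function name test_local_copy and its return value are
made up for illustration. As far as I understand, the compiler cannot
do this transformation by itself: printf is an opaque call, so from
gcc's point of view it might read or modify "i", which is why "i" is
stored and reloaded around every call. The sketch is therefore only
equivalent under the assumption that nothing called from the loop
touches "i".

```c
#include <stdio.h>

int i; /* global loop counter, as in example2.c */

/* Hypothetical variant of test(): count in a local variable and store
   to the global once, after the loop. Returns the iteration count.
   NOT equivalent to the original if anything called inside the loop
   reads or writes i. */
int test_local_copy(void)
{
    int local;
    int iterations = 0;

    for (local = 10; local >= 0; local--) {
        printf("%d\n", local);
        iterations++;
    }
    i = local; /* -1, the value the original loop leaves in i */
    return iterations;
}
```

With the counter in a local, the loop body has the same shape as in
example 1, so the per-iteration load and store of "i" disappear.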
Sebastian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iQEVAwUBQme5/f81M8QtvOSJAQLRGggAnpufAt1xuImGpsw0aTk/gCD+FmGUq2LR
3mPPX+E0zCbJCVfyuJl45j0fbyjhrEpqKdQ+rkpUhvBpC/BN2kO3clDZktHczMuq
WjjPQxbcBGX1jSvGQVS5bfgXIaeYRF5V9quzm3N4c0hXSsPHlwHCa4jbAQxCqdly
8XH9wzCUyjpfxDKG4zSzAS5DUg/hdAbBCekLBAjTSZhCqr1XmZJ5SmNIu9ZH0anU
rMDYaZPFJ4Cq291xON4R1g5enSnwkdlxh6zGmtvsXwY+KbJW1Tpq5q80lSjx7RUF
P5IZsvoqOzdV6PvUBhqft/w1xCRWn/11bgyuAfJ3Wna8j3IXeJHoiA==
=5WkM
-----END PGP SIGNATURE-----