IMHO, the current strategy for saving and restoring registers is not optimal. Consider:
# cat test.c
#include <stdio.h>
void print(char *mess, char *format, int text)
{
printf(mess);
printf(format,text);
}
void main()
{
print("X=","%d\n",1);
}
# gcc --version
gcc (GCC) 4.5.0 20090601 (experimental)
# gcc -o test test.c -O2
# objdump -d test
00000000004004d0 <print>:
4004d0: 48 89 5c 24 f0 mov %rbx,-0x10(%rsp) <----
4004d5: 48 89 6c 24 f8 mov %rbp,-0x8(%rsp) <----
4004da: 48 89 f3 mov %rsi,%rbx
4004dd: 48 83 ec 18 sub $0x18,%rsp <----
4004e1: 89 d5 mov %edx,%ebp
4004e3: 31 c0 xor %eax,%eax
4004e5: e8 ce fe ff ff callq 4003b8 <pri...@plt>
4004ea: 89 ee mov %ebp,%esi
4004ec: 48 89 df mov %rbx,%rdi
4004ef: 48 8b 6c 24 10 mov 0x10(%rsp),%rbp <----
4004f4: 48 8b 5c 24 08 mov 0x8(%rsp),%rbx <----
4004f9: 31 c0 xor %eax,%eax
4004fb: 48 83 c4 18 add $0x18,%rsp <----
4004ff: e9 b4 fe ff ff jmpq 4003b8 <pri...@plt>
=========
Let's replace the current save/restore sequence:
48 89 5c 24 f0 mov %rbx,-0x10(%rsp)
48 89 6c 24 f8 mov %rbp,-0x8(%rsp)
48 83 ec 18 sub $0x18,%rsp
...
48 8b 6c 24 10 mov 0x10(%rsp),%rbp
48 8b 5c 24 08 mov 0x8(%rsp),%rbx
48 83 c4 18 add $0x18,%rsp
with a faster and shorter save/restore sequence:
55 push %rbp
53 push %rbx
53 push %rbx ; dummy push
...
5b pop %rbx ; dummy pop
5b pop %rbx
5d pop %rbp
IMPORTANT note: for faster execution, the "dummy push" has to use the same register as
the previous push!
Measurement results on Core2: the new save/restore sequence is 5 ticks faster than the current one.
Regards,
Vladimir Volynsky
--
Summary: Nonoptimal save/restore registers
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: vvv at ru dot ru
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40363