Hi,
I wrote some test code like this:
void foo(int *a)
{
    a[0] = 0xfafafafb;
    a[1] = 0xfafafafc;
    a[2] = 0xfafafafe;
    a[3] = 0xfafafaff;
    a[4] = 0xfafafaf0;
    a[5] = 0xfafafaf1;
    a[6] = 0xfafafaf2;
    a[7] = 0xfafafaf3;
    a[8] = 0xfafafaf4;
    a[9] = 0xfafafaf5;
    a[10] = 0xfafafaf6;
    a[11] = 0xfafafaf7;
    a[12] = 0xfafafaf8;
    a[13] = 0xfafafaf9;
    a[14] = 0xfafafafa;
    a[15] = 0xfafaf0fa;
}
This is what gcc generated:
movl $-84215045, (%rdi)
movl $-84215044, 4(%rdi)
movl $-84215042, 8(%rdi)
movl $-84215041, 12(%rdi)
movl $-84215056, 16(%rdi)
...
This is what LLVM/clang generated:
movabsq $-361700855600448773, %rax # imm = 0xFAFAFAFCFAFAFAFB
movq %rax, (%rdi)
movabsq $-361700842715546882, %rax # imm = 0xFAFAFAFFFAFAFAFE
movq %rax, 8(%rdi)
movabsq $-361700902845089040, %rax # imm = 0xFAFAFAF1FAFAFAF0
movq %rax, 16(%rdi)
movabsq $-361700894255154446, %rax # imm = 0xFAFAFAF3FAFAFAF2
...
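To spell out what the clang output is doing: each movabsq immediate is two adjacent 32-bit constants packed into one 64-bit value. x86-64 is little-endian, so the lower-addressed element ends up in the low half. Below is a small sketch of the same idea in C. The names combine_pair and store_pair are made up for illustration, and store_pair assumes a little-endian target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack two adjacent 32-bit elements into one 64-bit
   value. 'lo' is the element at the lower address, 'hi' the next one. */
static uint64_t combine_pair(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: write a[0] and a[1] with a single 8-byte store.
   Assumes a little-endian target such as x86-64. memcpy is used instead of
   a pointer cast to avoid strict-aliasing problems; compilers typically
   turn it into one 64-bit move. */
static void store_pair(int *a, uint32_t lo, uint32_t hi)
{
    uint64_t v = combine_pair(lo, hi);
    memcpy(a, &v, sizeof v);
}
```

For the first two elements, combine_pair(0xfafafafb, 0xfafafafc) gives 0xFAFAFAFCFAFAFAFB, which is the immediate in the first movabsq above.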
I ran the code 10000000000 times on my i7 machine. Here are the results:
gcc:
real 0m50.613s
user 0m50.559s
sys 0m0.000s
LLVM/clang:
real 0m32.036s
user 0m32.001s
sys 0m0.000s
That means the movabsq version did a better job: it was roughly 1.6 times faster (32.0s vs. 50.6s).
Should the gcc peephole pass add such a combine?
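The timing loop itself is not shown above. A minimal sketch of one way to reproduce such a measurement follows. It is an assumption about how the test could be driven, not the program that produced the numbers above. The bench helper is a made-up name, and the noinline attribute and the empty asm statement are GCC/clang extensions that keep the compiler from optimizing the calls away.

```c
#include <time.h>

/* Same stores as the test function. noinline (GCC/clang extension) keeps
   the body from being folded into the benchmark loop. The explicit casts
   give the same values as the plain hex constants in the original. */
__attribute__((noinline)) void foo(int *a)
{
    a[0] = (int)0xfafafafbu;
    a[1] = (int)0xfafafafcu;
    a[2] = (int)0xfafafafeu;
    a[3] = (int)0xfafafaffu;
    a[4] = (int)0xfafafaf0u;
    a[5] = (int)0xfafafaf1u;
    a[6] = (int)0xfafafaf2u;
    a[7] = (int)0xfafafaf3u;
    a[8] = (int)0xfafafaf4u;
    a[9] = (int)0xfafafaf5u;
    a[10] = (int)0xfafafaf6u;
    a[11] = (int)0xfafafaf7u;
    a[12] = (int)0xfafafaf8u;
    a[13] = (int)0xfafafaf9u;
    a[14] = (int)0xfafafafau;
    a[15] = (int)0xfafaf0fau;
}

/* Hypothetical helper: call foo() 'iters' times on 'buf' (at least 16 ints)
   and return the elapsed CPU time in seconds. */
double bench(unsigned long long iters, int *buf)
{
    clock_t start = clock();
    for (unsigned long long i = 0; i < iters; i++) {
        foo(buf);
        /* Empty asm with a memory clobber, so the stores are not
           treated as dead and removed. */
        __asm__ volatile("" : : "r"(buf) : "memory");
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling bench(10000000000ULL, buf) from main with a 16-element int buffer, once per compiler at the same optimization level, should give numbers comparable in spirit to the ones above. The optimization flags used for the original runs are not stated, so absolute times will differ.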
--
Regards
lin zuojian