Jean Christophe Beyler writes:
As we can see, all three are using the symbol_ref data before adding
their offset. But after cse, we get this:
(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 8
Ah ok, so I can see why it would not be able to perform that
optimization around the loop but I changed the code to simply have
this:
uint64_t foo (void)
{
return data[0] + data[1] + data[2];
}
And this generates:
la r9,data
la r7,data+8
ldd r6,0(r7)
ldd r8,0(r9)
ldd
Jean Christophe Beyler jean.christophe.bey...@gmail.com writes:
uint64_t foo (void)
{
return data[0] + data[1] + data[2];
}
And this generates:
la r9,data
la r7,data+8
ldd r6,0(r7)
ldd r8,0(r9)
ldd r7,16(r9)
I'm trying to see if there is a problem with my
The subreg pass has this:
(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 8 [0x8])))) 71 {movdi_internal} (nil))
(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
(symbol_ref:DI ("data") <var_decl
Dear all,
As some might know, I've been concentrating on optimizing the handling
of loads for my port of GCC. I'm now considering this code:
uint64_t data[107];
uint64_t foo (void)
{
uint64_t x0, x1, x2, x3, x4, x5,x6,x7;
uint64_t i;
for(i=0;i<107;i++) {
data[i] = i;
}
As you can see, the compiler loads the address of data into r9 and uses
it for data[0], but it also loads data+8 into r7 instead of addressing
that element as an offset from r9. If I remove the loop, it does not do
this.
This optimization is done by CSE only, currently. That's why it cannot
look through loops.
Paolo
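
For readers following the thread, here is a minimal sketch of the addressing
pattern being asked for, written at the source level. It only illustrates the
idea of one base address plus constant offsets. foo_base, fill_data and
check_equivalent are hypothetical names that do not appear in the original
messages, and whether a given port actually keeps a single base register for
either form still depends on the compiler passes discussed above.

```c
#include <assert.h>
#include <stdint.h>

uint64_t data[107];

/* Same initialization as the loop in the original message. */
void fill_data(void)
{
  uint64_t i;
  for (i = 0; i < 107; i++)
    data[i] = i;
}

/* Form from the thread: three separate references to data. */
uint64_t foo(void)
{
  return data[0] + data[1] + data[2];
}

/* Hypothetical variant: take the base address once and use
   constant offsets, which is what the desired code would do
   with a single register (0, 8 and 16 bytes from the base). */
uint64_t foo_base(void)
{
  const uint64_t *base = data;
  return base[0] + base[1] + base[2];
}

/* Both forms must compute the same value. */
void check_equivalent(void)
{
  fill_data();
  assert(foo() == foo_base());
}
```

After fill_data() both functions return 0 + 1 + 2 = 3. The two forms are
semantically identical, so any difference in the generated loads comes from
how the compiler expands and combines the addresses, not from the source.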