Dear all,

As I work through handling load multiples and store multiples for my target architecture, I came across this scenario:
int foo(int data[10240])
{
    int w0, w1;
    int part = 0, i;

    for (i = 0; i < 10000; i += 2) {
        w0 = data[i];
        w1 = data[i+1];
        part += w0 + w1;
    }
    return part;
}

int bar(void)
{
    int w0, w1;
    int part = 0, i;
    int data[10240];

    fillit (data);
    for (i = 0; i < 10000; i += 2) {
        w0 = data[i];
        w1 = data[i+1];
        part += w0 + w1;
    }
    return part;
}

When data is passed as a parameter (as in foo), I get two separate loads with no offsets: GCC handles both loads separately, each with its own address register. This means more register pressure and more instructions in the loop. In essence, I get:

ld r1, (0)r5
ld r2, (0)r6

instead of:

ld r1, (0)r5
ld r2, (8)r6

which I do get if the array is declared inside the function (as in bar).

I have written the address_cost function to handle this case and tell the compiler that it is better to use the offsets, but in this scenario the compiler does not call the address cost function until after it has already generated those two different registers. So I don't know why it is already expanding those loads before getting to that call.

Any ideas?

Thanks,
Jc