Dear all,

As I work through handling load multiples and store multiples for my
target architecture, I ran into this scenario:

int foo(int data[10240])
{
    int w0, w1;
    int part = 0,i ;

    for (i=0; i<10000;i+=2){
        w0 = data[i];
        w1 = data[i+1];
        part += w0 + w1 ;
    }

    return part;
}


void fillit(int *data);   /* assumed prototype; defined elsewhere */

int bar(void)
{
    int w0, w1;
    int part = 0,i ;
    int data[10240];

    fillit (data);

    for (i=0; i<10000;i+=2){
        w0 = data[i];
        w1 = data[i+1];
        part += w0 + w1 ;
    }

    return part;
}


When data is passed as a parameter (foo), I get two separate loads
with no offsets: GCC computes each address in its own register. This
means more register pressure and more instructions in the loop. In
essence I get:

ld r1, (0)r5
ld r2, (0)r6

instead of:

ld r1, (0)r5
ld r2, (8)r6

which I do get if the array is declared in the function.
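
To make the intent concrete, here is a minimal source-level sketch of
the access pattern I am after: one base address per iteration, with the
second element reached through a constant offset. The name foo_offsets
is hypothetical, and I am not claiming that writing the loop this way
changes what GCC generates; it only spells out "one base register plus
an offset".

```c
/* Hypothetical variant of foo(): one base pointer per iteration, with the
   second element addressed as a constant offset from that pointer. */
int foo_offsets(const int *data)
{
    int part = 0;
    int i;

    for (i = 0; i < 10000; i += 2) {
        const int *p = data + i;   /* single base address */
        part += p[0] + p[1];       /* offsets 0 and sizeof(int) from p */
    }

    return part;
}
```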

I have written the address_cost function to handle this case and tell
the compiler that it is better to use the offsets, but in this
scenario the compiler has already picked those two different registers
before it ever calls the address cost function. So I don't know why
it's expanding those loads before getting to that call.
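
For reference, the idea behind my cost function is roughly the
following. This is a simplified, standalone model and not the actual
hook: the real one receives an rtx, whereas the struct, the enum, and
the 0..255 offset range below are assumptions made up for the
illustration.

```c
/* Toy stand-in for an address-cost hook. The types and names here are
   hypothetical; GCC's real hook receives an rtx, not this struct. */
enum addr_kind { ADDR_REG, ADDR_REG_PLUS_CONST, ADDR_OTHER };

struct toy_addr {
    enum addr_kind kind;
    long offset;              /* used only for ADDR_REG_PLUS_CONST */
};

/* Assumed offset range, chosen only for this illustration. */
#define TOY_MAX_OFFSET 255

int toy_address_cost(const struct toy_addr *addr)
{
    switch (addr->kind) {
    case ADDR_REG:
        return 1;
    case ADDR_REG_PLUS_CONST:
        /* An in-range offset costs the same as a plain register, so
           nothing is gained by using a second base register. */
        if (addr->offset >= 0 && addr->offset <= TOY_MAX_OFFSET)
            return 1;
        return 3;
    default:
        return 4;
    }
}
```

With a model like this, (8)r5 is no more expensive than (0)r6, which is
the preference I am trying to communicate to the compiler.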

Any ideas? Thanks,
Jc
