Compile the following code with options -march=armv7-a -mthumb -Os:

typedef struct {
  int buf[7];
} A;

A foo();
void hahaha(A* p)
{
  *p = foo();
}

GCC generates:

hahaha:
        push    {r4, r5, lr}
        sub     sp, sp, #36
        mov     r5, sp
        mov     r4, r0
        mov     r0, sp
        bl      foo
        ldmia   r5!, {r0, r1, r2, r3}
        stmia   r4!, {r0, r1, r2, r3}
        ldmia   r5, {r0, r1, r2}
        stmia   r4, {r0, r1, r2}
        add     sp, sp, #36
        pop     {r4, r5, pc}

GCC first allocates a temporary on the stack and passes its address to
function foo, which returns the new struct in that memory area. After the
call returns, GCC copies the contents of the temporary into the area pointed
to by pointer p. Instead, we could simply pass the pointer p straight through
to function foo, which gives:

hahaha:
        /* pointer p is already in register r0 */
        /* we can also apply tail function call optimization here. */
        b      foo

Every combination of ARM/Thumb and -Os/-O2 generates similar results. The
inefficient code is generated in the expand pass. This may also affect other
targets whose ABI similarly needs temporary memory and an extra parameter to
return a large object.
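
To make the difference concrete, here is a source-level sketch of the two code
shapes, with the hidden result pointer written out as an ordinary parameter.
The names foo_sret, hahaha_with_temp and hahaha_direct are hypothetical and
only illustrate the idea; the body of foo_sret is made up. The direct form is
only equivalent if foo does not otherwise read or write the object p points
to, which is assumed here.

```c
#include <string.h>

typedef struct {
  int buf[7];
} A;

/* Hypothetical stand-in for foo(): the caller supplies the address where
   the result is built, as the ABI does with its extra hidden parameter. */
static void foo_sret(A *result)
{
  for (int i = 0; i < 7; i++)
    result->buf[i] = i + 1;
}

/* Shape of the code generated today: the result is built in a stack
   temporary and then copied to *p. */
static void hahaha_with_temp(A *p)
{
  A tmp;
  foo_sret(&tmp);
  memcpy(p, &tmp, sizeof tmp);
}

/* Shape requested in this report: p itself is forwarded as the result
   address, so no temporary and no copy are needed. Assumes foo_sret does
   not access *p through any other path. */
static void hahaha_direct(A *p)
{
  foo_sret(p);
}
```

Both functions leave the same contents in *p; the second one also ends in a
call that can become a tail call.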

The following is another similar case:

typedef struct {
  int buf[7];
} A;

A foo();
void bar(A*);
void hahaha(A* p)
{
  A t;
  t = foo();
  bar(&t);
}
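
For this second case, the desired shape would presumably build the result
directly in t instead of in a separate stack temporary. The sketch below
writes the hidden result pointer as an explicit parameter; foo_sret, bar_stub
and seen_sum are hypothetical names with made-up bodies, used only so the
example is self-contained.

```c
typedef struct {
  int buf[7];
} A;

/* Hypothetical: records what bar_stub observed, for checking only. */
static int seen_sum;

/* Hypothetical stand-in for foo() with the result address made explicit. */
static void foo_sret(A *result)
{
  for (int i = 0; i < 7; i++)
    result->buf[i] = 2 * i;
}

/* Hypothetical stand-in for bar(): sums the buffer it is given. */
static void bar_stub(A *p)
{
  seen_sum = 0;
  for (int i = 0; i < 7; i++)
    seen_sum += p->buf[i];
}

/* Desired shape: t's own stack slot is passed as the result address, so
   only one struct-sized stack area is needed and nothing is copied. */
static void hahaha_direct(void)
{
  A t;
  foo_sret(&t);
  bar_stub(&t);
}
```

Since t is a local whose address has not escaped before the call to foo,
passing &t as the result address should not be observable by foo.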


-- 
           Summary: Inefficient code to return a large struct
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44675
