Compile the following code with options -march=armv7-a -mthumb -Os:

    typedef struct {
      int buf[7];
    } A;

    A foo();

    void hahaha(A* p)
    {
      *p = foo();
    }

GCC generates:

    hahaha:
            push    {r4, r5, lr}
            sub     sp, sp, #36
            mov     r5, sp
            mov     r4, r0
            mov     r0, sp
            bl      foo
            ldmia   r5!, {r0, r1, r2, r3}
            stmia   r4!, {r0, r1, r2, r3}
            ldmia   r5, {r0, r1, r2}
            stmia   r4, {r0, r1, r2}
            add     sp, sp, #36
            pop     {r4, r5, pc}

GCC first allocates temporary memory on the stack and passes its address
to function foo, which returns the new struct in this memory area. After
the call returns, GCC copies the contents of the temporary memory into the
area pointed to by pointer p.

Instead, we can simply pass the pointer p to function foo. Then we get:

    hahaha:
            /* pointer p is already in register r0 */
            /* we can also apply tail call optimization here. */
            b       foo

Any combination of arm/thumb and Os/O2 generates similar results.

The inefficient code is generated at the expand pass.

This may also affect other targets with a similar ABI, that is, targets that
need temporary memory and an extra parameter to return a large object.

Following is another similar case.

    typedef struct {
      int buf[7];
    } A;

    A foo();
    void bar(A*);

    void hahaha(A* p)
    {
      A t;
      t = foo();
      bar(&t);
    }


-- 
           Summary: Inefficient code to return a large struct
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44675
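
For illustration, the rewrite requested in this report can be approximated by hand at the source level. The sketch below is not GCC's implementation. The body given to foo, and the names foo_into and hahaha_direct, are hypothetical and do not appear in the report, where foo is only declared. foo_into makes explicit the hidden result pointer that the ABI passes in r0, so hahaha_direct needs no stack temporary and no copy.

```c
typedef struct { int buf[7]; } A;

/* Hypothetical body for foo; the report only declares it. */
A foo(void)
{
  A a;
  for (int i = 0; i < 7; i++)
    a.buf[i] = i + 1;
  return a;
}

/* Hypothetical out-parameter variant of foo: it writes the result
   straight into caller-provided storage, as the hidden result
   pointer does at the ABI level. */
void foo_into(A *out)
{
  for (int i = 0; i < 7; i++)
    out->buf[i] = i + 1;
}

/* Original form from the report: return by value, then assign. */
void hahaha(A *p)
{
  *p = foo();
}

/* Hand-written equivalent of the desired code: forward p directly. */
void hahaha_direct(A *p)
{
  foo_into(p);
}
```

The two forms are only interchangeable when foo cannot observe *p while it builds its result, for example through a global that p points to. A compiler would presumably have to rule that out before forwarding p in place of a temporary.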