GCC 4.1.0 generates noticeably worse code than GCC 3.4.3 for a simple copy
loop on the ARM target.
Consider the following source file:
--- SNIP ---
unsigned int code_in_ram[100];
void testme(void)
{
    unsigned int *p_rom, *p_ram, *p_end, len;
    extern unsigned int _ram_erase_sector_start;
    extern unsigned int _ram_erase_sector_end;
    p_ram = code_in_ram;
    p_rom = &_ram_erase_sector_start;
    len = ((unsigned int)&_ram_erase_sector_end
           - (unsigned int)&_ram_erase_sector_start) / sizeof(unsigned int);
    for (p_rom = &_ram_erase_sector_start, p_end = &_ram_erase_sector_end;
         p_rom < p_end;) {
        *p_ram++ = *p_rom++;
    }
}
--- SNIP ---
Compiled with arm-elf-gcc -mcpu=arm7tdmi -S -Os testme.c, we get the following
code:
--- SNIP ---
.file "testme.c"
.text
.align 2
.global testme
.type testme, %function
testme:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
ldr r1, .L6
ldr r2, .L6+4
@ lr needed for prologue
b .L2
.L3:
ldr r3, [r1], #4
str r3, [r2, #-4]
.L2:
ldr r3, .L6+8
cmp r1, r3
add r2, r2, #4
bcc .L3
bx lr
.L7:
.align 2
.L6:
.word _ram_erase_sector_start
.word code_in_ram
.word _ram_erase_sector_end
.size testme, .-testme
.comm code_in_ram,400,4
.ident "GCC: (GNU) 4.1.0"
--- SNIP ---
Even a cursory examination reveals that it would be a lot better to write:
ldr r1, .L6
ldr r2, .L6+4
ldr r0, .L6+8
b .L2
.L3:
ldr r3, [r1], #4
str r3, [r2], #4
.L2:
cmp r1, r0
bcc .L3
bx lr
This code would be one instruction shorter overall and would have two fewer
instructions in the loop: the loop-invariant load of the end address is
hoisted out of the loop, and the separate add goes away. The way gcc-4.1.0
refuses to use post-indexed addressing for the store is especially bizarre,
since it does use post-indexed addressing for the preceding load. GCC 3.4.3
does not exhibit this behaviour; it compiles the above code to:
ldr r2, .L6
ldr r0, .L6+4
cmp r2,r0
ldr r1, .L6
movcs pc,lr
.L4:
ldr r3,[r2],#4
cmp r2, r0
str r3,[r1],#4
bcc .L4
mov pc,lr
While not perfect either, this also only has 4 instructions in the loop.
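For reference, the difference between the two loop shapes can be modelled in
plain C. This is only a sketch under stated assumptions: the function names
copy_post_indexed and copy_gcc41_shape are made up for illustration, and
ordinary pointer arguments stand in for the linker symbols so that it can be
built on any host. The comments map each statement to the instruction it is
meant to mirror.

```c
#include <stddef.h>

/* Illustrative model of the hand-written loop: both pointers advance as a
 * side effect of the access, as with post-indexed ldr/str. */
void copy_post_indexed(unsigned int *dst, const unsigned int *src,
                       const unsigned int *src_end)
{
    while (src < src_end)          /* cmp r1, r0 ; bcc .L3 */
        *dst++ = *src++;           /* ldr r3, [r1], #4 ; str r3, [r2], #4 */
}

/* Illustrative model of the loop gcc 4.1.0 emitted: the destination is
 * advanced by a separate add and the store then targets the previous slot.
 * An index is used for the destination so that the extra final increment
 * stays well defined in C. */
void copy_gcc41_shape(unsigned int *dst, const unsigned int *src,
                      const unsigned int *src_end)
{
    size_t i = 0;                      /* plays the role of r2 */
    for (;;) {
        int more = src < src_end;      /* ldr r3, .L6+8 ; cmp r1, r3 */
        i++;                           /* add r2, r2, #4 */
        if (!more)                     /* bcc .L3 not taken */
            break;
        dst[i - 1] = *src++;           /* ldr r3, [r1], #4 ; str r3, [r2, #-4] */
    }
}
```

Both functions leave the same data in the destination; only the amount of
work per iteration differs, which is the point of this report.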
--
Summary: ARM optimizer produces severely suboptimal code
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: Eric dot Doenges at betty-tv dot com
GCC host triplet: powerpc-apple-darwin8.5.0
GCC target triplet: arm-elf-unknown
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27016