Consider:

    void foo(int *x) { *x = 0; }

Compile with -Os -fomit-frame-pointer and you'll get something like this:

    movl 4(%esp), %eax
    movl $0, (%eax)

It would be 2 bytes shorter to instead load the constant 0 via an xor instruction into a scratch register, then store the scratch register into the memory location. Something like this:

    movl 4(%esp), %eax
    xor %edx, %edx
    movl %edx, (%eax)

ISTM this could easily be implemented with a peep2. I'm not well versed enough in x86 instruction timings to know if the xor sequence is going to generally be faster.

-- 
           Summary: GCC choosing poor code sequence for certain stores (x86)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: law at redhat dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
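
For reference, the 2-byte figure in the report can be checked by counting encoded bytes. The sketch below is illustrative only: the byte arrays are hand-assembled IA-32 encodings (assuming 32-bit mode with no prefixes), and the names store_imm, store_via_xor and bytes_saved are made up for this example, not taken from GCC. The common first instruction, movl 4(%esp), %eax, is the same in both sequences and is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-assembled IA-32 encodings (assumption: 32-bit mode, no prefixes). */

/* movl $0, (%eax): opcode C7 /0, ModRM 00, then a 4-byte immediate. */
static const unsigned char store_imm[] = {
    0xC7, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* xorl %edx, %edx followed by movl %edx, (%eax). */
static const unsigned char store_via_xor[] = {
    0x31, 0xD2,   /* xorl %edx, %edx   */
    0x89, 0x10    /* movl %edx, (%eax) */
};

/* Number of bytes saved by the xor-based sequence. */
static size_t bytes_saved(void)
{
    assert(sizeof store_imm >= sizeof store_via_xor);
    return sizeof store_imm - sizeof store_via_xor;
}
```

The saving comes entirely from dropping the 32-bit immediate. This says nothing about timing, and the xor form also needs a free scratch register and writes the flags, which any peep2 would have to take into account.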