Consider:

void foo(int *x)
{
  *x = 0;
}

Compile with -Os -fomit-frame-pointer and you'll get something like this:

        movl    4(%esp), %eax
        movl    $0, (%eax)

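It would be 2 bytes shorter to instead load the constant 0 via an xor
instruction into a scratch register, then store the scratch register into the
memory location (the store of the 32-bit immediate takes 6 bytes, while the
xor and the register store take 2 bytes each).  Something like this: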

        movl    4(%esp), %eax
        xorl    %edx, %edx
        movl    %edx, (%eax)


ISTM this could easily be implemented with a peephole2 pattern.

I'm not well versed enough in x86 instruction timings to know whether the xor
sequence is generally going to be faster.


-- 
           Summary: GCC choosing poor code sequence for certain stores (x86)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: law at redhat dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41505
