https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65847

            Bug ID: 65847
           Summary: SSE2 code for adding two structs is much worse at -O3
                    than at -O2
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jay.foad at gmail dot com

On x86_64 I get decent code at -O2:

$ cat zplus.c
typedef struct { double a, b; } Z;
Z zplus(Z x, Z y) { return (Z){ x.a + y.a, x.b + y.b }; }
$ gcc -O2 -S -o - zplus.c
...
zplus:
.LFB0:
        .cfi_startproc
        addsd   %xmm3, %xmm1
        addsd   %xmm2, %xmm0
        ret
        .cfi_endproc
.LFE0:
...

but awful code at -O3, where the two additions become a single addpd
surrounded by stack spills and reloads:

$ gcc -O3 -S -o - zplus.c
...
zplus:
.LFB0:
        .cfi_startproc
        movq    %xmm0, -40(%rsp)
        movq    %xmm1, -32(%rsp)
        movq    %xmm2, -56(%rsp)
        movq    %xmm3, -48(%rsp)
        movupd  -40(%rsp), %xmm1
        movupd  -56(%rsp), %xmm0
        addpd   %xmm0, %xmm1
        movaps  %xmm1, -72(%rsp)
        movq    -72(%rsp), %rdx
        movq    -64(%rsp), %rax
        movq    %rdx, -72(%rsp)
        movsd   -72(%rsp), %xmm0
        movq    %rax, -72(%rsp)
        movsd   -72(%rsp), %xmm1
        ret
...

I see similarly bad code from several versions of GCC, starting around
version 4.8:
gcc-4.8 (Ubuntu 4.8.3-12ubuntu3) 4.8.3
gcc (Ubuntu 4.9.1-16ubuntu6) 4.9.1
gcc (GCC) 6.0.0 20150422 (experimental)
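
Both listings above should compute the same result; the complaint is only
about code quality. A minimal sketch for confirming that, assuming nothing
beyond the zplus.c shown above (the helper name zplus_agrees is hypothetical
and not part of the original test case):

```c
#include <assert.h>

typedef struct { double a, b; } Z;

/* The function from the report, unchanged. */
Z zplus(Z x, Z y) { return (Z){ x.a + y.a, x.b + y.b }; }

/* Hypothetical helper: returns 1 if zplus() matches plain field-by-field
   addition for the given inputs, 0 otherwise. NaN inputs would compare
   unequal and are not handled here. */
int zplus_agrees(double xa, double xb, double ya, double yb)
{
    Z x = { xa, xb };
    Z y = { ya, yb };
    Z r = zplus(x, y);
    return r.a == xa + ya && r.b == xb + yb;
}
```

Calling zplus_agrees from a small main and building at both -O2 and -O3
should give 1 in each case, if the difference is purely one of code quality
as the listings suggest. The compiler may inline zplus into the helper, so
this checks results only, not the generated code for zplus itself.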
