https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097

            Bug ID: 115097
           Summary: Strange suboptimal codegen specifically at -O2 when
                    copying struct type
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: arthur.j.odwyer at gmail dot com
  Target Milestone: ---

// https://godbolt.org/z/G7qG4vvWb (C++ version)
// https://godbolt.org/z/fT793cznT (C version)

struct A { int a; short b; };
A test1(A& a) { return a; }
A test2(A&& a) { return a; }
A test3(const A& a) { return a; }
A test4(const A&& a) { return a; }

At -O1, they all have the same perfect codegen:

    test2(A&&):
        mov     rax, QWORD PTR [rdi]
        ret

At -O2, all-but-one of them have weird suboptimal (but correct) codegen:

    test2(A&&):
        movzx   edx, WORD PTR [rdi+4]
        mov     eax, DWORD PTR [rdi]
        sal     rdx, 32
        or      rax, rdx
        ret
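
To illustrate why both sequences are presumably acceptable, here is a minimal
sketch (not taken from the Godbolt links above). It assumes a little-endian
target where sizeof(A) == 8, i.e. a 4-byte int, a 2-byte short, and 2 bytes of
tail padding. The helper names load_whole, load_fields, and same_value_bits are
made up for illustration. The single 8-byte load also picks up the padding
bytes, while the field-by-field sequence zeroes them, so the two results are
only expected to agree on the low 48 bits.

```cpp
#include <cstdint>
#include <cstring>

struct A { int a; short b; };

// Assumption: 4-byte int, 2-byte short, 2 bytes of tail padding.
static_assert(sizeof(A) == 8, "sketch assumes A occupies 8 bytes");

// Mirrors the -O1 sequence: one 8-byte load of the whole object,
// padding bytes included.
std::uint64_t load_whole(const A& x) {
    std::uint64_t r;
    std::memcpy(&r, &x, sizeof r);
    return r;
}

// Mirrors the -O2 sequence: load each field separately, zero-extend,
// then shift and OR them together.
std::uint64_t load_fields(const A& x) {
    std::uint32_t lo;
    std::uint16_t hi;
    std::memcpy(&lo, &x.a, sizeof lo);
    std::memcpy(&hi, &x.b, sizeof hi);
    return static_cast<std::uint64_t>(lo) |
           (static_cast<std::uint64_t>(hi) << 32);
}

// Hypothetical helper: compares only the 48 bits that hold a and b
// (little-endian layout assumed); the top 16 bits are padding.
bool same_value_bits(const A& x) {
    const std::uint64_t mask = 0x0000FFFFFFFFFFFFull;
    return (load_whole(x) & mask) == (load_fields(x) & mask);
}
```

On such a target, same_value_bits(A{-1, -2}) should hold, since the two loads
can differ only in the padding bits.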

What's really weird is that it really is "all but one of them." The lexically
first function gets the good codegen, and all the subsequent ones get the
suboptimal codegen. If you comment out the good one, GCC picks another one to
make good. If you reorder the definitions, GCC's choice of which one to
optimize changes accordingly.

This behavior dates all the way back to GCC 5. In GCC 4.9.4, -O2 didn't have
this weird behavior; all four functions would just be optimal all the time.

The same symptom reproduces on x86-64, ARM64, and RISC-V (32 *and* 64). Since
it shows up on 32-bit RISC-V too, it doesn't seem to be limited to 64-bit
targets.

It also reproduces in C, with pointers instead of references.
It seems to have something to do with the signature of the enclosing function,
which is super weird: https://godbolt.org/z/KPac7bvTM
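
For the C reproduction mentioned above, the pointer-based variants might look
roughly like the following. This is a sketch under assumptions, not a copy of
the linked Godbolt page: the names test1p and test3p are hypothetical, and it
is written as C++ to match the snippet at the top (plain C would need
"struct A" or a typedef). Only two variants are shown because rvalue
references have no direct pointer analogue.

```cpp
struct A { int a; short b; };

// Pointer analogue of test1(A&): return a copy of the pointee.
A test1p(A* a) { return *a; }

// Pointer analogue of test3(const A&).
A test3p(const A* a) { return *a; }
```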
