https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097
            Bug ID: 115097
           Summary: Strange suboptimal codegen specifically at -O2 when
                    copying struct type
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: arthur.j.odwyer at gmail dot com
  Target Milestone: ---

// https://godbolt.org/z/G7qG4vvWb (C++ version)
// https://godbolt.org/z/fT793cznT (C version)

struct A { int a; short b; };
A test1(A& a) { return a; }
A test2(A&& a) { return a; }
A test3(const A& a) { return a; }
A test4(const A&& a) { return a; }

At -O1, they all have the same perfect codegen:

  test2(A&&):
        mov     rax, QWORD PTR [rdi]
        ret

At -O2, all-but-one of them have weird suboptimal (but correct) codegen:

  test2(A&&):
        movzx   edx, WORD PTR [rdi+4]
        mov     eax, DWORD PTR [rdi]
        sal     rdx, 32
        or      rax, rdx
        ret

What's really weird is that it really is "all but one of them." The lexically
first function will have good codegen, and then the subsequent ones will have
the suboptimal codegen. You can comment out the good one and watch GCC pick
another one to make good. You can reorder the definitions and see GCC's
decision of which one to optimize will change.

This behavior dates all the way back to GCC 5. In GCC 4.9.4, -O2 didn't have
this weird behavior; all four functions would just be optimal all the time.

The same symptom reproduces on x86-64, ARM64, and RISC-V (32 *and* 64). The
RISC-V result seems to indicate it's not specifically limited to 64-bit.

It also reproduces in C, with pointers instead of references.

It seems to have something to do with the signature of the enclosing function,
which is super weird:
https://godbolt.org/z/KPac7bvTM
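
For readers who want to see why the -O2 sequence is "suboptimal (but correct)", here is a minimal C++ sketch that emulates both instruction sequences from the report. It is an illustration, not anything taken from GCC. The helper names (load_whole, load_pieces, agree_on_value_bytes, check_example) are hypothetical, and the sketch assumes a little-endian target where sizeof(A) == 8 with two trailing padding bytes, as on x86-64. The single 8-byte load carries the padding bytes along, while the piecewise sequence zero-extends the short and so leaves bits 48..63 clear. Both agree on the six bytes that actually hold a and b.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct A { int a; short b; };

// Emulates `mov rax, QWORD PTR [rdi]`: one 8-byte load, padding included.
std::uint64_t load_whole(const A& x) {
    static_assert(sizeof(A) == 8, "sketch assumes the x86-64 layout of A");
    std::uint64_t r;
    std::memcpy(&r, &x, sizeof r);
    return r;
}

// Emulates the -O2 sequence: a 4-byte load of `a`, a zero-extended
// 2-byte load of `b` at offset 4, then shift-left-by-32 and OR.
std::uint64_t load_pieces(const A& x) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&x);
    std::uint32_t lo;
    std::uint16_t hi;
    std::memcpy(&lo, p, sizeof lo);       // mov   eax, DWORD PTR [rdi]
    std::memcpy(&hi, p + 4, sizeof hi);   // movzx edx, WORD PTR [rdi+4]
    return static_cast<std::uint64_t>(lo) |
           (static_cast<std::uint64_t>(hi) << 32);  // sal rdx, 32 ; or rax, rdx
}

// True when both loads agree on the low 48 bits, i.e. the bytes of a and b.
// Assumes little-endian byte order.
bool agree_on_value_bytes(const A& x) {
    const std::uint64_t value_mask = 0x0000FFFFFFFFFFFFull;
    return (load_whole(x) & value_mask) == load_pieces(x);
}

// Small self-check: fill the object (padding included) with 0xFF bytes,
// then store the members, and confirm the two loads agree where it matters.
void check_example() {
    A x;
    std::memset(&x, 0xFF, sizeof x);
    x.a = 0x12345678;
    x.b = 0x1234;
    assert(agree_on_value_bytes(x));
}
```

Calling check_example() from main should pass silently under the stated layout assumptions. The only bits on which the two sequences can differ are the padding bits, whose value a caller cannot rely on anyway, which is consistent with the report calling the longer sequence correct.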