https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104719
Jason Merrill <jason at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #12 from Jason Merrill <jason at gcc dot gnu.org> ---
(In reply to Vittorio Romeo from comment #9)
> - For the `operator[]` benchmark, when using `-Og` after applying
> `[[gnu::always_inline]]` to all the functions touched by the benchmark, we
> reduce the overhead from 34% to around 11%.

Quoting from your gist:

-Og, without `[[gnu::always_inline]]` on `operator[]`

    carray_squareop_mean        440 ns          439 ns            3
    vector_squareop_mean        661 ns          662 ns            3

-Og, with `[[gnu::always_inline]]` on `operator[]`

    vector_squareop_mean        494 ns          491 ns            3

Which looks significant...

...but I don't see this when I run your test myself; the vector_squareop
results (and the generated code) are unaffected by adding always_inline.  In
particular there are no calls to operator[].

    carray_squareop        547 ns          546 ns      1272023
    vector_squareop        591 ns          589 ns      1227215

carray:
        movslq  %edx, %rax
        leaq    (%rbx,%rax,4), %rcx
        movl    (%rcx), %eax
        addl    $1, %eax
        movl    %eax, (%rcx)
        movl    %eax, (%rcx)
        addl    $1, %edx

vector:
        movslq  %ecx, %rax
        salq    $2, %rax
        addq    (%rsp), %rax
        movl    (%rax), %esi
        leal    1(%rsi), %edx
        movl    %edx, (%rax)
        movl    %edx, (%rax)
        addl    $1, %ecx

It seems the main difference between the two is that the vector version needs
to keep loading the base pointer from the stack (%rsp) for some reason, rather
than keep it in a register %rbx.  This doesn't seem like an inlining issue at
all.

The double move in both versions is curious.
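For readers who do not have the gist at hand, here is a minimal sketch of the kind of loops the two listings appear to correspond to: an `int` index, a load, an increment, and a store back to the same element. This is an assumption reconstructed from the assembly, not the benchmark source; the names `bump_carray`, `bump_vector`, and `bump_vector_hoisted` are hypothetical. The third function copies `v.data()` into a local pointer, which might be one way to probe the base-pointer reload described above; whether it changes the code generated at `-Og` has not been checked here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the C-array case: increment every element
// through a raw pointer, using an int index as the assembly suggests.
void bump_carray(int* data, int n) {
    assert(data != nullptr || n == 0);
    for (int i = 0; i < n; ++i) {
        ++data[i];
    }
}

// Hypothetical stand-in for the vector case: the same loop, but each
// access goes through std::vector<int>::operator[].
void bump_vector(std::vector<int>& v) {
    const int n = static_cast<int>(v.size());
    for (int i = 0; i < n; ++i) {
        ++v[i];
    }
}

// Variant for experimentation only: read the base pointer once into a
// local before the loop instead of going through the vector each time.
void bump_vector_hoisted(std::vector<int>& v) {
    int* base = v.data();
    const int n = static_cast<int>(v.size());
    for (int i = 0; i < n; ++i) {
        ++base[i];
    }
}
```

Compiling each function with `g++ -Og -S` and comparing the loop bodies would show whether the `(%rsp)` reload appears in one variant and not the others.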