https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110879

            Bug ID: 110879
           Summary: Unnecessary reread from memory in a loop
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: palevichva at gmail dot com
  Target Milestone: ---

Created attachment 55678
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55678&action=edit
preprocessed file by g++ from revision dd2eb972a

I've found a strange optimization regression. The trunk version of g++
produces less optimal assembly than 13.2: it rereads the same memory
locations on every iteration of a loop. More specifically, it reloads the
vector's _M_finish and _M_end_of_storage fields from memory on every
push_back call, although this is not necessary. The released version 13.2
doesn't do that and just uses the values held in registers.

I'm compiling the following code:

    #include <vector>

    std::vector<int> f(std::size_t n) {
        std::vector<int> res;
        res.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            res.push_back(i*i);
        }
        return res;
    }

The main body of the loop looks like this:

~/.local/gcc/bin/g++ -S -fverbose-asm -O3 -std=c++20 pb.cpp

>.L41:
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/stl_construct.h:97: { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
>        movl    %r15d, (%rbx)   # _3, *prephitmp_51
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:119: ++this->_M_impl._M_finish;
>        addq    $4, %rbx        #, tmp135
>        movq    %rbx, 8(%rbp)   # tmp135, res_8(D)->D.35756._M_impl.D.35067._M_finish
>.L8:
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
>        addq    $1, %r13        #, i
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
>        cmpq    %r13, %r12      # i, n
>        je      .L1     #,
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
>        movq    8(%rbp), %rbx   # res_8(D)->D.35756._M_impl.D.35067._M_finish, prephitmp_51
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
>        movq    16(%rbp), %rax  # res_8(D)->D.35756._M_impl.D.35067._M_end_of_storage, pretmp_52
>.L16:
># pb.cpp:7: res.push_back(i*i);
>        movl    %r13d, %r15d    # i, _3
>        imull   %r13d, %r15d    # i, _3
># /home/scaiper/.local/gcc/include/c++/14.0.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
>        cmpq    %rax, %rbx      # pretmp_52, prephitmp_51
>        jne     .L41    #,

The same loop as produced by 13.2:

~/.local/gcc-13.2/bin/g++ -v -S -fverbose-asm -O3 -std=c++20 pb.cpp

>.L43:
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/stl_construct.h:97: { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
>        movl    %r12d, (%rcx)   # _3, *prephitmp_4
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/vector.tcc:119: ++this->_M_impl._M_finish;
>        addq    $4, %rcx        #, prephitmp_4
>        movq    %rcx, 8(%rbp)   # prephitmp_4, res_8(D)->D.35699._M_impl.D.35010._M_finish
>.L8:
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
>        addq    $1, %rbx        #, i
># pb.cpp:6: for (std::size_t i = 0; i < n; ++i) {
>        cmpq    %rbx, %r13      # i, n
>        je      .L1     #,
>.L18:
># pb.cpp:7: res.push_back(i*i);
>        movl    %ebx, %r12d     # i, _3
>        imull   %ebx, %r12d     # i, _3
># /home/scaiper/.local/gcc-13.2/include/c++/13.2.0/bits/vector.tcc:114: if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
>        cmpq    %r8, %rcx       # prephitmp_74, prephitmp_4
>        jne     .L43    #,

Note these two extra instructions in the first snippet:

    movq    8(%rbp), %rbx
    movq    16(%rbp), %rax

I've bisected this problem to commit dd2eb972a (libstdc++: Use RAII in
std::vector::_M_realloc_insert)
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=dd2eb972a5b063e10c83878d5c9336a818fa8291).

The commit itself doesn't look like the problem: the code before and after
it looks pretty much equivalent. But for some reason the compiler produces
a different result.

I'm using a version built from the aforementioned commit dd2eb972a:

Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c++ --disable-multilib --prefix=/home/scaiper/.local/gcc
gcc version 14.0.0 20230623 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-S' '-fverbose-asm' '-O3' '-std=c++20' '-shared-libgcc' '-mtune=generic' '-march=x86-64'

Comparing with 13.2:

Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c++ --disable-multilib --prefix=/home/scaiper/.local/gcc-13.2
gcc version 13.2.0 (GCC)
COLLECT_GCC_OPTIONS='-v' '-S' '-fverbose-asm' '-O3' '-std=c++20' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
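
For anyone comparing the two outputs, a variant with no per-iteration
capacity check may be a useful baseline when reading the assembly. The
sketch below is an illustration only and is not part of the attached test
case. f is the function from the report; f_indexed is a hypothetical name
introduced here. It assumes that value-initializing the storage up front is
acceptable, so it is not a drop-in replacement for reserve plus push_back,
and nothing here is claimed about which version is faster.

```cpp
#include <cstddef>
#include <vector>

// Function from the report, unchanged.
std::vector<int> f(std::size_t n) {
    std::vector<int> res;
    res.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        res.push_back(i*i);
    }
    return res;
}

// Hypothetical comparison variant (not from the report).
// The write pointer lives in a local variable, so the loop body has no
// _M_finish / _M_end_of_storage comparison at the source level.
// Assumption: value-initializing all n elements first is acceptable.
std::vector<int> f_indexed(std::size_t n) {
    std::vector<int> res(n);      // allocates and zero-fills n ints
    int* out = res.data();        // local pointer to the storage
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<int>(i * i);
    }
    return res;
}
```

Both functions return the same elements for any n where i*i fits in an int,
so the two can be compiled with the same flags as above and their loop
bodies compared side by side.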