https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104719
Bug ID: 104719
Summary: Use of `std::move` in libstdc++ leads to worsened debug performance
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: vittorio.romeo at outlook dot com
Target Milestone: ---

`std::accumulate` is defined as follows in `libstdc++`:

```
template<typename _InputIterator, typename _Tp>
  _GLIBCXX20_CONSTEXPR
  inline _Tp
  accumulate(_InputIterator __first, _InputIterator __last, _Tp __init)
  {
    // concept requirements
    __glibcxx_function_requires(_InputIteratorConcept<_InputIterator>)
    __glibcxx_requires_valid_range(__first, __last);

    for (; __first != __last; ++__first)
      __init = _GLIBCXX_MOVE_IF_20(__init) + *__first;
    return __init;
  }
```

Where `_GLIBCXX_MOVE_IF_20` is:

```
#if __cplusplus > 201703L
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// DR 2055. std::move in std::accumulate and other algorithms
# define _GLIBCXX_MOVE_IF_20(_E) std::move(_E)
#else
# define _GLIBCXX_MOVE_IF_20(_E) _E
#endif
```

When compiling a program using `std::accumulate` in debug mode, under `-Og`, there is a noticeable performance impact due to the presence of `std::move`.

- With `std::move`: https://quick-bench.com/q/h_M_AUs3pgBE3bYr82rsA1_VtjU
- Without `std::move`: https://quick-bench.com/q/ysis2b1CgIZkRsO2cqfjZm9Jkio

This performance degradation is one example of why many people (especially in the gamedev community) are not adopting standard library algorithms and modern C++ more widely.

Would it be possible to replace `std::move` calls internal to `libstdc++` with a cast, or some sort of compiler intrinsic? Or maybe mark `std::move` as "always inline" even without optimizations enabled?

Related issue for libc++: https://github.com/llvm/llvm-project/issues/53689
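To make the two suggestions above concrete, here is a minimal sketch of what they could look like in user code. It is not libstdc++ source: `SKETCH_MOVE`, `sketch_move`, `accumulate_with_cast`, `accumulate_with_always_inline` and `self_check` are hypothetical names introduced only for illustration, and the sketch assumes C++14 or later and a compiler that accepts the GNU `always_inline` attribute (GCC or Clang). Whether either variant actually recovers the lost performance would still need to be measured.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical macro (not part of libstdc++): performs the same cast that
// std::move performs, but as a plain expression, so no function call is
// involved regardless of the optimization level.
#define SKETCH_MOVE(...) \
    static_cast<std::remove_reference_t<decltype(__VA_ARGS__)>&&>(__VA_ARGS__)

// Hypothetical always-inline equivalent of std::move. The attribute asks the
// compiler to inline the call even when optimizations are disabled.
template <typename T>
[[gnu::always_inline]] constexpr std::remove_reference_t<T>&&
sketch_move(T&& t) noexcept
{
    return static_cast<std::remove_reference_t<T>&&>(t);
}

// Same loop as std::accumulate, using the cast-based macro.
template <typename InputIt, typename T>
T accumulate_with_cast(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first)
        init = SKETCH_MOVE(init) + *first;
    return init;
}

// Same loop as std::accumulate, using the always-inline function.
template <typename InputIt, typename T>
T accumulate_with_always_inline(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first)
        init = sketch_move(init) + *first;
    return init;
}

// Small sanity check: both variants yield the same type as std::move and
// compute the same results as a plain accumulation would.
inline void self_check()
{
    int x = 0;
    static_assert(std::is_same<decltype(SKETCH_MOVE(x)),
                               decltype(std::move(x))>::value,
                  "cast-based macro must yield the same type as std::move");
    static_assert(std::is_same<decltype(sketch_move(x)),
                               decltype(std::move(x))>::value,
                  "always-inline function must yield the same type as std::move");
    (void)x;

    const std::vector<int> ints{1, 2, 3, 4};
    assert(accumulate_with_cast(ints.begin(), ints.end(), 0) == 10);
    assert(accumulate_with_always_inline(ints.begin(), ints.end(), 0) == 10);

    const std::vector<std::string> words{"a", "b", "c"};
    assert(accumulate_with_cast(words.begin(), words.end(), std::string{}) == "abc");
}
```

Calling `self_check()` from a test program built with `-Og` or `-O0` exercises both variants. The macro form avoids a function call entirely, while the attribute form keeps the familiar function-call syntax and relies on the compiler honoring `always_inline`.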