https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93770
Bug ID: 93770 Summary: std::tuple::operator< sometimes does needless extra work Product: gcc Version: 9.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: gnu at kosak dot com Target Milestone: --- Hello, In the simple program below, tuple<Foo>::operator< calls the underlying Foo::operator< twice (10 < 9, then 9 < 10), even though only the first call is needed to get the correct answer. The same thing can happen with tuples of larger dimension: if the prefix (all but the final element) compares equal, but the final element compares greater-or-equal, we will do two calls to operator< on the final element even though only the first one is needed to get the right answer. I have an example, and a suggested fix. My example is a tuple of dimension 1: $ g++ -O2 qwe.cc && ./a.out Calculating 10 < 9 Calculating 9 < 10 !(t1 < t2) Source: =============================== #include <iostream> #include <tuple> struct Foo { explicit Foo(int data) : data_(data) {} int data_; }; bool operator<(const Foo &lhs, const Foo &rhs) { std::cout << "Calculating " << lhs.data_ << " < " << rhs.data_ << std::endl; return lhs.data_ < rhs.data_; } int main() { auto t1 = std::make_tuple(Foo(10)); auto t2 = std::make_tuple(Foo(9)); if (t1 < t2) { std::cout << "t1 < t2" << std::endl; } else { std::cout << "!(t1 < t2)" << std::endl; } } ======================== The culprit is this code in /usr/include/c++/9/tuple:L1388 template<typename _Tp, typename _Up, size_t __i, size_t __size> struct __tuple_compare { ... static constexpr bool __less(const _Tp& __t, const _Up& __u) { return bool(std::get<__i>(__t) < std::get<__i>(__u)) || (!bool(std::get<__i>(__u) < std::get<__i>(__t)) && __tuple_compare<_Tp, _Up, __i + 1, __size>::__less(__t, __u)); } }; The extra work happens at the final step of the recursion, when (__i + 1 == __size). In this case, due to template specialization, the right side of the && (namely __tuple_compare<_Tp, _Up, __i + 1, __size>::__less(__t, __u)) evaluates to the constant false regardless of __t and __u. Therefore the && is always false, and the left side of the && (namely !bool(std::get<__i>(__u) < std::get<__i>(__t)) is wasted effort. One easy fix would be to test for the base case explicitly, as in static constexpr bool __less(const _Tp& __t, const _Up& __u) { return bool(std::get<__i>(__t) < std::get<__i>(__u)) || (__i + 1 != __size // *** THIS LINE IS NEW *** && !bool(std::get<__i>(__u) < std::get<__i>(__t)) && __tuple_compare<_Tp, _Up, __i + 1, __size>::__less(__t, __u)); } $ g++ -v Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:hsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.2.1-9ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2)