https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108823

            Bug ID: 108823
           Summary: ranges::transform could be smarter with two sized
                    ranges
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

>From StackOverflow (https://stackoverflow.com/q/75464599/2069064):

#include <algorithm>
#include <ranges>
#include <vector>

#include <algorithm>
#include <ranges>
#include <vector>

std::vector<int> fn1(std::vector<int> u, std::vector<int> const& v) {
    #ifdef RANGES
    std::ranges::transform(u, v, u.begin(), std::plus<int>{});
    #else
    std::transform(u.begin(), u.end(), v.begin(), u.begin(), std::plus<int>{});
    #endif
    return u;
}

Without RANGES defined, this generates vectorized code. With RANGES, it does
not. These aren't exactly identical, since without RANGES the function is UB if
u.size() > v.size() while with RANGES it's fine - but the RANGES implementation
is still suboptimal.

Rewriting the RANGES impl to:

    auto sz = std::min(u.size(), v.size());
    std::ranges::transform(
        std::ranges::subrange(u.begin(), u.begin() + sz),
        std::ranges::subrange(v.begin(), std::unreachable_sentinel),
        u.begin(),
        std::plus<int>{});

gets the code to vectorize again. This is probably because the loop is simply:

»···for (; __first1 != __last1 && __first2 != __last2;
»···     ++__first1, (void)++__first2, ++__result)

The two conditions probably throws the optimizer, where if the algorithm is
written as a single condition (as the rewrite reduces to, since __first2 !=
__last2 becomes true), it's easier to optimize.
  • [Bug libstdc++/108823] New: ran... barry.revzin at gmail dot com via Gcc-bugs

Reply via email to