https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108823
Bug ID: 108823 Summary: ranges::transform could be smarter with two sized ranges Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: barry.revzin at gmail dot com Target Milestone: --- >From StackOverflow (https://stackoverflow.com/q/75464599/2069064): #include <algorithm> #include <ranges> #include <vector> #include <algorithm> #include <ranges> #include <vector> std::vector<int> fn1(std::vector<int> u, std::vector<int> const& v) { #ifdef RANGES std::ranges::transform(u, v, u.begin(), std::plus<int>{}); #else std::transform(u.begin(), u.end(), v.begin(), u.begin(), std::plus<int>{}); #endif return u; } Without RANGES defined, this generates vectorized code. With RANGES, it does not. These aren't exactly identical, since without RANGES the function is UB if u.size() > v.size() while with RANGES it's fine - but the RANGES implementation is still suboptimal. Rewriting the RANGES impl to: auto sz = std::min(u.size(), v.size()); std::ranges::transform( std::ranges::subrange(u.begin(), u.begin() + sz), std::ranges::subrange(v.begin(), std::unreachable_sentinel), u.begin(), std::plus<int>{}); gets the code to vectorize again. This is probably because the loop is simply: »···for (; __first1 != __last1 && __first2 != __last2; »··· ++__first1, (void)++__first2, ++__result) The two conditions probably throws the optimizer, where if the algorithm is written as a single condition (as the rewrite reduces to, since __first2 != __last2 becomes true), it's easier to optimize.