When #pragma GCC unroll is processed in
tree-cfg.cc:replace_loop_annotate_in_block, we set both the loop->unroll
field (which is currently streamed out and back in during LTO) and the
cfun->has_unroll flag.

cfun->has_unroll, however, is not currently streamed during LTO, so this
patch attempts to recover it by setting it on any function containing a
loop with loop->unroll > 1.  Prior to this patch, loops marked with
#pragma GCC unroll that would be unrolled by RTL loop2_unroll in a
non-LTO compilation didn't get unrolled under LTO.

As per the comment in the PR, a more conservative fix might explicitly
stream out cfun->has_unroll and stream it back in again, but this patch
is simpler and I can't currently see a reason against inferring the
value of the flag like this (comments welcome).

gcc/ChangeLog:

	PR libstdc++/116140
	* lto-streamer-in.cc (input_cfg): Set fn->has_unroll if fn
	contains a loop with requested unrolling.

gcc/testsuite/ChangeLog:

	PR libstdc++/116140
	* g++.dg/ext/pragma-unroll-lambda-lto.C: New test.
---
 gcc/lto-streamer-in.cc                        |  2 ++
 .../g++.dg/ext/pragma-unroll-lambda-lto.C     | 32 +++++++++++++++++++
 2 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
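
For reviewers who want the idea in isolation, here is a minimal standalone
sketch of inferring the flag at stream-in time.  It is not GCC code:
ToyLoop, ToyFunction, write_loops and read_loops are hypothetical names
invented for this illustration, and the "stream" is just a vector of
integers.  The only part taken from the patch is the "unroll > 1" test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the per-loop and per-function data.
struct ToyLoop {
  std::int64_t safelen;
  std::int64_t unroll;  // values above 1 request unrolling, as in the patch
};

struct ToyFunction {
  std::vector<ToyLoop> loops;
  bool has_unroll = false;  // function-level summary flag, never written out
};

// Write only the per-loop fields; the function-level flag is omitted,
// which models the situation described in the commit message.
std::vector<std::int64_t> write_loops(const ToyFunction &fn) {
  std::vector<std::int64_t> out;
  out.push_back(static_cast<std::int64_t>(fn.loops.size()));
  for (const ToyLoop &l : fn.loops) {
    out.push_back(l.safelen);
    out.push_back(l.unroll);
  }
  return out;
}

// Read the per-loop fields back and recompute the summary flag from them.
ToyFunction read_loops(const std::vector<std::int64_t> &in) {
  assert(!in.empty());
  ToyFunction fn;
  std::size_t pos = 0;
  const std::int64_t n = in[pos++];
  for (std::int64_t i = 0; i < n; ++i) {
    assert(pos + 1 < in.size());
    ToyLoop l;
    l.safelen = in[pos++];
    l.unroll = in[pos++];
    if (l.unroll > 1)
      fn.has_unroll = true;  // inferred rather than streamed
    fn.loops.push_back(l);
  }
  return fn;
}
```

In this toy model, a function with one loop whose unroll value is 4 comes
back from read_loops (write_loops (fn)) with has_unroll set, while a
function whose loops all have unroll <= 1 comes back with it clear.  The
alternative mentioned above, streaming the flag explicitly, would add one
more value to the written data instead of recomputing it.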

diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index 2e592be8082..93877065d86 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1136,6 +1136,8 @@ input_cfg (class lto_input_block *ib, class data_in *data_in,
       /* Read OMP SIMD related info. */
       loop->safelen = streamer_read_hwi (ib);
       loop->unroll = streamer_read_hwi (ib);
+      if (loop->unroll > 1)
+        fn->has_unroll = true;
       loop->owned_clique = streamer_read_hwi (ib);
       loop->dont_vectorize = streamer_read_hwi (ib);
       loop->force_vectorize = streamer_read_hwi (ib);
diff --git a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
new file mode 100644
index 00000000000..144c4c32692
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
@@ -0,0 +1,32 @@
+// { dg-do link { target c++11 } }
+// { dg-options "-O2 -flto -fdump-rtl-loop2_unroll" }
+
+#include <cstdlib>
+
+template<typename Iter, typename Pred>
+inline Iter
+my_find(Iter first, Iter last, Pred pred)
+{
+#pragma GCC unroll 4
+  while (first != last && !pred(*first))
+    ++first;
+  return first;
+}
+
+__attribute__((noipa))
+short *use_find(short *p)
+{
+  auto pred = [](short x) { return x == 42; };
+  return my_find(p, p + 1024, pred);
+}
+
+int main(void)
+{
+  short a[1024];
+  for (int i = 0; i < 1024; i++)
+    a[i] = rand ();
+
+  return use_find (a) - a;
+}
+
+// { dg-final { scan-ltrans-rtl-dump-times "Unrolled loop 3 times" 1 "loop2_unroll" } }