> -----Original Message----- > From: Richard Biener <[email protected]> > Sent: 04 December 2025 07:56 > To: Tamar Christina <[email protected]> > Cc: [email protected]; nd <[email protected]>; [email protected] > Subject: Re: [PATCH][vect]: don't hoist conditional loads above their > condition > [PR122868] > > On Tue, 2 Dec 2025, Tamar Christina wrote: > > > The example in the PR > > > > #include <vector> > > > > std::vector<bool> x, y; > > int main() { return x == y; } > > > > now vectorizes but the attributes on std::vector indicate that the vector is > > aligned to the natural vector alignment. In C this is equivalent to the > > testcase > > > > int f (int a[12], int b[12], int n) > > { > > a = __builtin_assume_aligned (a, 16); > > b = __builtin_assume_aligned (b, 16); > > for (int i = 0; i < n; i++) > > { > > if (b[i] == 0) > > return 0; > > if (a[0] > b[i]) > > return 1; > > } > > return 2; > > } > > > > Here the load a[0] is loop invariant, and the vectorizer hoists this out of > > the > > loop into the pre-header. For early break this isn't safe to do as a[0] is > > conditionally valid based on the conditions in the block preceding it. As > > such > > we need some guarantee that the load is valid before we can hoist it or the > load > > needs to be unconditional (e.g. in the loop header block). > > > > Conceptually alignment peeling can provide this guarantee since making it > > through the prologue means the invariant value was loaded at least once > and so > > we know the address is valid. At the moment however there's no real > defined > > order between how GCC inserts conditions in the pre-header, so having tried > to > > change the order a few times the load always ends up before the prologue. > So > > for now I marked it as a missed optimization. > > > > Since we still can hoist invariant loads if in the header, I didn't change > > LOOP_VINFO_NO_DATA_DEPENDENCIES since that would be global and > instead I > > modified the usage site of LOOP_VINFO_NO_DATA_DEPENDENCIES. > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu > > -m32, -m64 and no issues. > > > > Pushed to master. > > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > PR tree-optimization/122868 > > * tree-vect-stmts.cc (vectorizable_load): Don't hoist loop invariant > > conditional loads unless in header. > > > > gcc/testsuite/ChangeLog: > > > > PR tree-optimization/122868 > > * gcc.dg/vect/vect-early-break_140-pr122868_1.c: New test. > > * gcc.dg/vect/vect-early-break_140-pr122868_2.c: New test. > > * gcc.dg/vect/vect-early-break_140-pr122868_3.c: New test. > > * gcc.dg/vect/vect-early-break_140-pr122868_4.c: New test. > > > > --- > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_1.c > b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_1.c > > new file mode 100644 > > index > 0000000000000000000000000000000000000000..80264bd4f31c85d3ea > ca11430c7edeabcb635296 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_1.c > > @@ -0,0 +1,39 @@ > > +/* { dg-add-options vect_early_break } */ > > +/* { dg-require-effective-target vect_sizes_16B_8B } */ > > +/* { dg-require-effective-target vect_early_break_hw } */ > > +/* { dg-require-effective-target vect_int } */ > > + > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ > > + > > +#include "tree-vect.h" > > + > > +__attribute__ ((noipa)) > > +int f (int a[12], int b[12], int n) > > +{ > > +#ifdef __arm__ > > + a = __builtin_assume_aligned (a, 8); > > + b = __builtin_assume_aligned (b, 8); > > +#else > > + a = __builtin_assume_aligned (a, 16); > > + b = __builtin_assume_aligned (b, 16); > > +#endif > > + for (int i = 0; i < n; i++) > > + { > > + if (b[i] == 0) > > + return 0; > > + if (a[0] > b[i]) > > + return 1; > > + } > > + return 2; > > +} > > + > > +int main () > > +{ > > + check_vect (); > > + > > + int *a = 0; > > + int b[12] = {0}; > > + return f (a, b, 10); > > +} > > + > > +/* { dg-final { scan-tree-dump "not hoisting invariant load due to early > break" "vect" } } */ > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_2.c > b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_2.c > > new file mode 100644 > > index > 0000000000000000000000000000000000000000..90222fcffd7c98a4187 > 053326cd6f88bfd2bcb63 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_2.c > > @@ -0,0 +1,31 @@ > > +/* { dg-add-options vect_early_break } */ > > +/* { dg-require-effective-target vect_early_break_hw } */ > > +/* { dg-require-effective-target vect_int } */ > > + > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ > > + > > +#include "tree-vect.h" > > + > > +__attribute__ ((noipa)) > > +int f (int a[12], int b[12], int n) > > +{ > > + for (int i = 0; i < n; i++) > > + { > > + if (b[i] == 0) > > + return 0; > > + if (a[0] > b[i]) > > + return 1; > > + } > > + return 2; > > +} > > + > > +int main () > > +{ > > + check_vect (); > > + > > + int *a = 0; > > + int b[12] = {0}; > > + return f (a, b, 10); > > +} > > + > > +/* { dg-final { scan-tree-dump-times "not hoisting invariant load due to > early break" 0 "vect" { xfail *-*-* } } } */ > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_3.c > b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_3.c > > new file mode 100644 > > index > 0000000000000000000000000000000000000000..670804f8ce537a1381 > 714a44e4b1d42b66ed6b61 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_3.c > > @@ -0,0 +1,39 @@ > > +/* { dg-add-options vect_early_break } */ > > +/* { dg-require-effective-target vect_sizes_16B_8B } */ > > +/* { dg-require-effective-target vect_early_break_hw } */ > > +/* { dg-require-effective-target vect_int } */ > > + > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ > > + > > +#include "tree-vect.h" > > + > > +__attribute__ ((noipa)) > > +int f (int a[12], int b[12], int n) > > +{ > > +#ifdef __arm__ > > + a = __builtin_assume_aligned (a, 8); > > + b = __builtin_assume_aligned (b, 8); > > +#else > > + a = __builtin_assume_aligned (a, 16); > > + b = __builtin_assume_aligned (b, 16); > > +#endif > > + for (int i = 0; i < n; i++) > > + { > > + if (a[0] > b[i]) > > + return 0; > > + if (b[i] == 0) > > + return 1; > > + } > > + return 2; > > +} > > + > > +int main () > > +{ > > + check_vect (); > > + > > + int a[12] = {1}; > > + int b[12] = {0}; > > + return f (a, b, 10); > > +} > > + > > +/* { dg-final { scan-tree-dump-times "not hoisting invariant load due to > early break" 0 "vect" } } */ > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_4.c > b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_4.c > > new file mode 100644 > > index > 0000000000000000000000000000000000000000..de2aff287f4fa146ef8c > b7e476f63a877e51fedf > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_140-pr122868_4.c > > @@ -0,0 +1,31 @@ > > +/* { dg-add-options vect_early_break } */ > > +/* { dg-require-effective-target vect_early_break_hw } */ > > +/* { dg-require-effective-target vect_int } */ > > + > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */ > > + > > +#include "tree-vect.h" > > + > > +__attribute__ ((noipa)) > > +int f (int a[12], int b[12], int n) > > +{ > > + for (int i = 0; i < n; i++) > > + { > > + if (a[0] > b[i]) > > + return 0; > > + if (b[i] == 0) > > + return 0; > > + } > > + return 2; > > +} > > + > > +int main () > > +{ > > + check_vect (); > > + > > + int a[12] = {1}; > > + int b[12] = {0}; > > + return f (a, b, 10); > > +} > > + > > +/* { dg-final { scan-tree-dump-times "not hoisting invariant load due to > early break" 0 "vect" } } */ > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > > index > 1d7e50afcde1096d5598b43ab8d49454eb68385b..a47bbd3345b1e291d0d > 3ae571cf5666b66b02706 100644 > > --- a/gcc/tree-vect-stmts.cc > > +++ b/gcc/tree-vect-stmts.cc > > @@ -9880,6 +9880,34 @@ vectorizable_load (vec_info *vinfo, > > transform time. */ > > bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) > > && !nested_in_vect_loop); > > + > > + /* It is unsafe to hoist a conditional load over the conditions that > > make > > + it valid. When early break this means that any invariant load can't be > > + hoisted unless it's in the loop header or if we know something else > has > > + verified the load is valid to do. Alignment peeling would do this > > + since getting through the prologue means the load was done at least > > + once and so the vector main body is free to hoist it. However today > > + GCC will hoist the load above the PFA loop. As such that makes it > > + still invalid and so we can't allow it today. */ > > + auto stmt_bb > > + = gimple_bb (STMT_VINFO_STMT ( > > + vect_orig_stmt (SLP_TREE_SCALAR_STMTS > (slp_node)[0]))); > > + if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo) > > + && !DR_SCALAR_KNOWN_BOUNDS (dr_info) > > + && stmt_bb != loop->header) > > + { > > + if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > > + && dump_enabled_p ()) > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > > + "not hoisting invariant load due to early break" > > + "constraints\n"); > > + else if (dump_enabled_p ()) > > + dump_printf_loc (MSG_NOTE, vect_location, > > + "not hoisting invariant load due to early break" > > + "constraints\n"); > > + hoist_p = false; > > + } > > + > > bool uniform_p = true; > > for (stmt_vec_info sinfo : SLP_TREE_SCALAR_STMTS (slp_node)) > > { > > I'd do the above in this loop as you are assuming that all invariant > loads in the group behave the same which isn't guaranteed, esp. > the DR_SCALAR_KNOWN_BOUNDS check which means dr_info is also not > the same across members in the SLP node.
Ack. I had thought about it when I added it but I convinced myself it was fine because I had expected the invariant loops to be hoisted as a group or not. I didn't take into account they could have different known offsets and that one could be out of bounds. Will update. Thanks, Tamar > > Richard. > > > > > > > > > -- > Richard Biener <[email protected]> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg)
