The following fixes wrong-code when combining outer loop vectorization
with a permuted SLP access in the inner loop. In that case a wrong gap
adjustment is applied to the IV increment, which shows up on GCN.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
PR tree-optimization/115640
* tree-vect-stmts.cc (vectorizable_load): With an inner
loop SLP access do not apply a gap adjustment.
---
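For illustration only, here is a minimal C sketch of the loop shape the
description refers to: an outer loop that is a vectorization candidate, and
an inner loop that reads a contiguous pair of elements in swapped order (a
permuted access). This is not the testcase from the PR. The names
accumulate, run_example, N, out, b and aa are hypothetical, and whether GCC
actually picks outer loop vectorization with SLP for it depends on target
and flags.

```c
#include <assert.h>

#define N 4

/* Hypothetical kernel, not the PR's testcase.  out[] holds N pairs of
   values; b[] holds N pairs that are read in swapped order.  */
static void
accumulate (double out[2 * N], const double b[2 * N], double aa[N][N])
{
  for (int i = 0; i < N; i++)      /* Outer loop: vectorization candidate.  */
    for (int j = 0; j < N; j++)    /* Inner loop.  */
      {
        /* The two lanes read b[2*j+1] and b[2*j]: the access is
           contiguous in the inner loop, but used with permutation {1, 0}.  */
        out[2 * i] += b[2 * j + 1] * aa[i][j];
        out[2 * i + 1] += b[2 * j] * aa[i][j];
      }
}

/* Fill small inputs, run the kernel and leave the result in out[].  */
static void
run_example (double out[2 * N])
{
  double aa[N][N];
  double b[2 * N];

  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      aa[i][j] = i + j + 1;

  for (int k = 0; k < 2 * N; k++)
    {
      b[k] = k + 1;
      out[k] = 0.0;
    }

  accumulate (out, b, aa);

  /* Scalar reference value for the first lane, computed by hand.  */
  assert (out[0] == 60.0);
}
```

A miscompiled vector version of such a loop would be expected to differ
from these scalar results, which is why a runtime comparison is a natural
check for this kind of wrong-code.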
gcc/tree-vect-stmts.cc | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 1fa92a0dc13..9697b8ca39c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10597,9 +10597,14 @@ vectorizable_load (vec_info *vinfo,
whole group, not only the number of vector stmts the
permutation result fits in. */
unsigned scalar_lanes = SLP_TREE_LANES (slp_node);
- if (slp_perm
- && (group_size != scalar_lanes
- || !multiple_p (nunits, group_size)))
+ if (nested_in_vect_loop)
+ /* We do not support grouped accesses in a nested loop,
+ instead the access is contiguous but it might be
+ permuted. No gap adjustment is needed though. */
+ vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
+ else if (slp_perm
+ && (group_size != scalar_lanes
+ || !multiple_p (nunits, group_size)))
{
/* We don't yet generate such SLP_TREE_LOAD_PERMUTATIONs for
variable VF; see vect_transform_slp_perm_load. */
--
2.35.3