The following fixes wrong-code when using outer loop vectorization and an inner loop SLP access with permutation. A wrong adjustment to the IV increment is then applied on GCN.
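
For illustration only, here is a minimal sketch of the kind of loop nest described above. It is not the testcase from the PR. All names (kernel, count_mismatches, self_check, N, M) are hypothetical, and whether GCC actually picks outer loop vectorization with a permuted inner loop SLP load for it depends on the target, the flags and the cost model. The two accumulators read the two lanes of a contiguous inner loop access in swapped order, and a scalar recomputation is used to detect wrong code.

```c
#include <assert.h>

#define N 64  /* outer loop trip count (hypothetical) */
#define M 8   /* inner loop trip count (hypothetical) */

static int in[2 * (N + M)];
static int out[2 * N];

/* The i loop is the candidate for outer loop vectorization.  The inner
   j loop reads a contiguous two-element group with its lanes swapped,
   which is the sort of access an SLP load permutation would describe.  */
__attribute__ ((noinline)) void
kernel (void)
{
  for (int i = 0; i < N; i++)
    {
      int s0 = 0, s1 = 0;
      for (int j = 0; j < M; j++)
        {
          s0 += in[2 * (i + j) + 1];
          s1 += in[2 * (i + j)];
        }
      out[2 * i] = s0;
      out[2 * i + 1] = s1;
    }
}

/* Fill the input, run the kernel and recompute every result with
   volatile accumulators so the reference stays scalar.  Returns the
   number of mismatching output elements.  */
static int
count_mismatches (void)
{
  for (int k = 0; k < 2 * (N + M); k++)
    in[k] = 3 * k + 1;

  kernel ();

  int bad = 0;
  for (int i = 0; i < N; i++)
    {
      volatile int r0 = 0, r1 = 0;
      for (int j = 0; j < M; j++)
        {
          r0 = r0 + in[2 * (i + j) + 1];
          r1 = r1 + in[2 * (i + j)];
        }
      bad += (out[2 * i] != r0) + (out[2 * i + 1] != r1);
    }
  return bad;
}

/* Aborts if the kernel produced a wrong result.  */
void
self_check (void)
{
  assert (count_mismatches () == 0);
}
```

Calling self_check () from main and building at -O3 gives a simple runtime check of this shape. A passing run does not show that the affected code path was exercised.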

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

	PR tree-optimization/115640
	* tree-vect-stmts.cc (vectorizable_load): With an inner loop SLP
	access do not apply a gap adjustment.
---
 gcc/tree-vect-stmts.cc | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 1fa92a0dc13..9697b8ca39c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10597,9 +10597,14 @@ vectorizable_load (vec_info *vinfo,
                  whole group, not only the number of vector stmts the
                  permutation result fits in.  */
               unsigned scalar_lanes = SLP_TREE_LANES (slp_node);
-              if (slp_perm
-                  && (group_size != scalar_lanes
-                      || !multiple_p (nunits, group_size)))
+              if (nested_in_vect_loop)
+                /* We do not support grouped accesses in a nested loop,
+                   instead the access is contiguous but it might be
+                   permuted.  No gap adjustment is needed though.  */
+                vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
+              else if (slp_perm
+                       && (group_size != scalar_lanes
+                           || !multiple_p (nunits, group_size)))
                 {
                   /* We don't yet generate such SLP_TREE_LOAD_PERMUTATIONs for
                      variable VF; see vect_transform_slp_perm_load.  */
-- 
2.35.3