There are some SVE testsuite fails when forcing SLP because costing
prevents VLA vectors from being used as we add permute cost for
the VEC_PERM nodes that are part of a SLP load-lanes node.  The
permutes only exist for representational reasons and pessimize SLP
vs non-SLP so the following makes sure to cost them as zero.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

        * tree-vect-slp.cc (vectorizable_slp_permutation_1): Return
        zero for the permute nodes part of load-lanes.
---
 gcc/tree-vect-slp.cc | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 38f4c41b8db..1240dac3d62 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -10567,13 +10567,15 @@ vectorizable_slp_permutation_1 (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   /* Load-lanes permute.  This permute only acts as a forwarder to
      select the correct vector def of the load-lanes load which
      has the permuted vectors in its vector defs like
-     { v0, w0, r0, v1, w1, r1 ... } for a ld3.  */
+     { v0, w0, r0, v1, w1, r1 ... } for a ld3.  All costs are
+     accounted for in the costing for the actual load so we
+     return zero here.  */
   if (node->ldst_lanes)
     {
       gcc_assert (children.length () == 1);
       if (!gsi)
        /* This is a trivial op always supported.  */
-       return 1;
+       return 0;
       slp_tree child = children[0];
       unsigned vec_idx = (SLP_TREE_LANE_PERMUTATION (node)[0].second
                          / SLP_TREE_LANES (node));
@@ -10583,7 +10585,7 @@ vectorizable_slp_permutation_1 (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
          tree def = SLP_TREE_VEC_DEFS (child)[i * vec_num  + vec_idx];
          node->push_vec_def (def);
        }
-      return 1;
+      return 0;
     }
 
   /* Set REPEATING_P to true if the permutations are cylical wrt UNPACK_FACTOR
-- 
2.43.0

Reply via email to