I don't understand how synth-mult works, but it does introduce
multiple uses of a reduction variable, which will ultimately
fail vectorization (or ICE with a pending change).  So avoid
applying the pattern.  I've tried to do so selectively, possibly
preserving pattern-matching x * 4 as x << 2.

So basically a single replacement stmt should be OK, likewise something like

 tem = -x;
 tem = tem << 3;
 res = -tem;

so that only a single use of 'x' remains.  Even using x + x for x * 2
when x << 1 isn't possible does not work.  So if only power-of-two
multiplications will work I can probably make a condition on that.
Will we synthesize any more complex chain that is still appropriate?
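
To make the single-use vs. multiple-use distinction concrete, here is a
minimal C sketch.  The function names and the multipliers 8, 7 and 2 are
purely illustrative assumptions and are not taken from the failing
testcase; unsigned arithmetic is used only so the shifts and negations
are well defined in plain C.

```c
#include <assert.h>

/* x * 8 via negate / shift / negate: 'x' is read exactly once, so a
   reduction operand would keep its single use.  */
static unsigned
mul8_single_use (unsigned x)
{
  unsigned tem = -x;
  tem = tem << 3;
  return -tem;
}

/* x * 7 as (x << 3) - x: 'x' is read twice, which is the kind of
   multiple use described above.  */
static unsigned
mul7_two_uses (unsigned x)
{
  unsigned tem = x << 3;
  return tem - x;
}

/* x * 2 as x + x (what a synthesized shift would look like without
   vector shift support): again two reads of 'x'.  */
static unsigned
mul2_add (unsigned x)
{
  return x + x;
}

/* Hypothetical helper: check that each chain computes the intended
   product for one input.  */
static void
check_synth (unsigned x)
{
  assert (mul8_single_use (x) == x * 8);
  assert (mul7_two_uses (x) == x * 7);
  assert (mul2_add (x) == x * 2);
}
```

All three chains compute the right value; the difference that matters
here is only how many times 'x' appears as an operand.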

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any comments?

Thanks,
Richard.

        * tree-vect-patterns.cc (vect_synth_mult_by_constant): Avoid
        the pattern in cases that introduce multiple uses of reduction
        operands.
---
 gcc/tree-vect-patterns.cc | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 3fffcac4b3a..fae4b393dff 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -4303,6 +4303,10 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, tree val,
   /* Targets that don't support vector shifts but support vector additions
      can synthesize shifts that way.  */
   bool synth_shift_p = !vect_supportable_shift (vinfo, LSHIFT_EXPR, multtype);
+  if (synth_shift_p
+      /* Any multiple use of the reduction operand will break it.  */
+      && vect_is_reduction (stmt_vinfo))
+    return NULL;
 
   HOST_WIDE_INT hwval = tree_to_shwi (val);
   /* Use MAX_COST here as we don't want to limit the sequence on rtx costs.
@@ -4333,7 +4337,12 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, tree val,
   if (alg.op[0] == alg_zero)
     accumulator = build_int_cst (multtype, 0);
   else
-    accumulator = op;
+    {
+      /* Any multiple use of the reduction operand will break it.  */
+      if (vect_is_reduction (stmt_vinfo))
+       return NULL;
+      accumulator = op;
+    }
 
   bool needs_fixup = (variant == negate_variant)
                      || (variant == add_variant);
-- 
2.43.0
