On Wed, 27 Aug 2025, Jakub Jelinek wrote:

> On Wed, Aug 27, 2025 at 02:48:33PM +0200, Richard Biener wrote:
> > I don't understand how synth-mult works, but it does introduce
> > multiple uses of a reduction variable which will ultimatively
> > fail vectorization (or ICE with a pending change).  So avoid
> > applying the pattern.  I've tried to do so selectively, possibly
> > preserving pattern-matching x * 4 as x << 2.
> > 
> > So basically a single replacement stmt should be OK, likewise sth like
> > 
> >  tem = -x;
> >  tem = tem << 3;
> >  res = -tem;
> > 
> > so a single use of 'x' remains.   Even using x + x for x * 2 when
> > x << 1 isn't possible does not work.  So if only pow2 mults will
> > work I can probably make a condition on that.  Will we synthesize
> > any more complex appropriate chain?
> > 
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > 
> > Any comments?
> > 
> > Thanks,
> > Richard.
> > 
> >     * tree-vect-patterns.cc (vect_synth_mult_by_constant): Avoid
> >     in cases that introduce multiple uses of reduction operands.
> > ---
> >  gcc/tree-vect-patterns.cc | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index 3fffcac4b3a..fae4b393dff 100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -4303,6 +4303,10 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree 
> > op, tree val,
> >    /* Targets that don't support vector shifts but support vector additions
> >       can synthesize shifts that way.  */
> >    bool synth_shift_p = !vect_supportable_shift (vinfo, LSHIFT_EXPR, 
> > multtype);
> > +  if (synth_shift_p
> > +      /* Any multiple use of the reduction operand will break it.  */
> > +      && vect_is_reduction (stmt_vinfo))
> > +    return NULL;
> >  
> >    HOST_WIDE_INT hwval = tree_to_shwi (val);
> >    /* Use MAX_COST here as we don't want to limit the sequence on rtx costs.
> > @@ -4333,7 +4337,12 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree 
> > op, tree val,
> >    if (alg.op[0] == alg_zero)
> >      accumulator = build_int_cst (multtype, 0);
> >    else
> > -    accumulator = op;
> > +    {
> > +      /* Any multiple use of the reduction operand will break it.  */
> > +      if (vect_is_reduction (stmt_vinfo))
> > +   return NULL;
> > +      accumulator = op;
> > +    }
> >  
> >    bool needs_fixup = (variant == negate_variant)
> >                   || (variant == add_variant);
> 
> That seems to be too conservative to me.
> Wouldn't it be better instead of both chunks do something like:
>   if (vect_is_reduction (stmt_vinfo))
>     {
>       int op_uses = alg.op[0] != alg_zero;
>       for (int i = 1; i < alg.ops; i++)
>       switch (alg.op[i])
>         {
>         case alg_add_t_m2:
>         case alg_sub_t_m2:
>           if (synth_shift_p && alg.log[i])
>             return NULL;
>           else
>             op_uses++;
>           break;
>         case alg_add_t2_m:
>         case alg_sub_t2_m:
>           op_uses++;
>           break;
>         default:
>           break;
>         }
>       if (variant == add_variant)
>       op_uses++;
>       if (op_uses > 1)
>       return NULL;
>     }
> This basically counts how many times the later code will use
> the op operand (with synth_shift_p just returning NULL immediately
> because it is for alg.log[i] more than 1).

Ah, thanks for the write-up how to compute the number of uses.

I'll test the following.

Richard.

>From 12f7ce0b2a7b12eeadffee6049d733d46ef484dd Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Wed, 27 Aug 2025 14:40:37 +0200
Subject: [PATCH] Avoid mult pattern if that will break reduction constraints
To: gcc-patches@gcc.gnu.org

synth-mult introduces multiple uses of a reduction variable
in some cases which will ultimatively fail vectorization (or ICE
with a pending change).  So avoid applying the pattern in such case.

        * tree-vect-patterns.cc (vect_synth_mult_by_constant): Avoid
        in cases that introduce multiple uses of reduction operands.

Co-authored-by: Jakub Jelinek <ja...@redhat.com>
---
 gcc/tree-vect-patterns.cc | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 3fffcac4b3a..d5c81be6e9d 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -4303,6 +4303,8 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, 
tree val,
   /* Targets that don't support vector shifts but support vector additions
      can synthesize shifts that way.  */
   bool synth_shift_p = !vect_supportable_shift (vinfo, LSHIFT_EXPR, multtype);
+  if (synth_shift_p)
+    return NULL;
 
   HOST_WIDE_INT hwval = tree_to_shwi (val);
   /* Use MAX_COST here as we don't want to limit the sequence on rtx costs.
@@ -4314,6 +4316,35 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, 
tree val,
   if (!possible)
     return NULL;
 
+  if (vect_is_reduction (stmt_vinfo))
+    {
+      int op_uses = alg.op[0] != alg_zero;
+      for (int i = 1; i < alg.ops; i++)
+       switch (alg.op[i])
+         {
+         case alg_add_t_m2:
+         case alg_sub_t_m2:
+           if (synth_shift_p && alg.log[i])
+             return NULL;
+           else
+             op_uses++;
+           break;
+         case alg_add_t2_m:
+         case alg_sub_t2_m:
+           op_uses++;
+           break;
+         default:
+           break;
+         }
+      if (variant == add_variant)
+       op_uses++;
+      /* When we'll synthesize more than a single use of the reduction
+        operand the reduction constraints are violated.  Avoid this
+        situation.  */
+      if (op_uses > 1)
+       return NULL;
+    }
+
   if (!target_supports_mult_synth_alg (&alg, variant, vectype, synth_shift_p))
     return NULL;
 
-- 
2.43.0

Reply via email to