Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7

Richard Biener Mon, 24 Nov 2025 06:18:04 -0800

On Mon, Nov 24, 2025 at 3:01 PM Li, Pan2 <[email protected]> wrote:
>
> Hi Richard,
>
> The pre-refactor of this series completed, do you prefer to continue the v2 
> now or defer to the next stage 1?
> Both works for me, and thanks a lot.


You can send v2 now.

Richard.

> Pan
>
> -----Original Message-----
> From: Li, Pan2 <[email protected]>
> Sent: Saturday, November 1, 2025 10:33 AM
> To: Richard Biener <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; 
> [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> Liu, Hongtao <[email protected]>; [email protected]
> Subject: RE: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
>
> Thanks Richard for comments.
>
> > Thanks and sorry for the delay...
>
> Never mind, I see you were busy for middle-end vectorization changes recently.
>
> > OK, so this is basically widen + mult vs. pre-widen + widen_mult vs. 
> > widen_mult.
> > IMO this would be perfect for a widening_mult helper?
>
> Sure thing, seems also benefits other SAT_MUL pattern(s) introduced in 
> previous too. So I may try to refactor that first.
>
> > I see.   That's odd.  In fact I'd say we'd want to transform
> > (nop_convert (bit_{ior,and,xor}:c (convert @0) @1))
> > to move the conversion inside like
>
> > (for op (bit_ior bit_and bit_xor)
> >  (simplify
> >   (nop_convert (op:c (convert @0) @1))
> >   (op:type (convert @0) (convert @1))))
>
> That make much sense to me, and we have similar scenario for the SAT_MUL 
> pattern in previous, will refactor this first.
>
> Pan
>
> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: Friday, October 31, 2025 9:37 PM
> To: Li, Pan2 <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; 
> [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> Liu, Hongtao <[email protected]>; [email protected]
> Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
>
> On Tue, Oct 28, 2025 at 3:31 AM Li, Pan2 <[email protected]> wrote:
> >
> > Thanks Richard for comments.
> >
> > > (match
> > > (widening_mult @0 @1)
> > > (widen_mult @0 @1))
> > > (match
> > > (widening_mult @0 @1)
> > > (mult (convert @0) (convert @1))
> > >  (if (it is widening))
> >
> > I see, that wrap the convert part into another helper predicate. That make 
> > sense to me, and will have a try soon.
> > I think the difference comes from the the uint128_t/uint64_t has different 
> > gimple compares to the rest.
> >
> > 1. WT uint128_t, NT uint64_t, we have gimple as below.
> >
> >   8   │   uint128_t x;
> > 19   │   x_8 = a_6(D) w* b_7(D);
> >
> > The NT (aka uint64_t) doesn't need to convert when widen_mul to uint128_t.
> >
> > 2. WT uint128_t, NT uint32_t, as well as the rest combination if WT > NT.
> >
> >     8   │   uint128_t x;
> >   21   │   _12 = (unsigned long) a_6(D);
> >   22   │   _13 = (unsigned long) b_7(D);
> >   23   │   x_8 = _12 w* _13;
> >
> > The NT (aka uint32_t) need to convert to uint64_t before widen_mul to 
> > uint128_t.
> >
> > In previous, I found it is not easy to merge the 2 case into one expression.
> > Thus, add one pattern for case 1 and the other for case 2.
>
> OK, so this is basically widen + mult vs. pre-widen + widen_mult vs. 
> widen_mult.
> IMO this would be perfect for a widening_mult helper?
>
> > > Now for the question - part of the pattern is
> > >  (convert? (bit_ior (negate (convert from-bool)) (convert (widening_mult 
> > > ...)))
> > > I wonder whether you've seen the outer conversion being moved inside the 
> > > IOR?
> > > At least I wonder why there's the (convert ...) around the widening_mult 
> > > and
> > > why that's not conditional?  The textual description of the pattern 
> > > mentions
> > > (NT)x | (NT)overflow_p, so the result should already be in NT, no
> > > outer conversion
> > > required?  That said, (NT)((WT)x | (WT)overflow_p) would be equally valid,
> > > but we fold that inside the IOR?  But not always?
> >
> > Some of the cases introduces signed type for some intermediate result, like 
> > WT is uint16_t, NT is uint8_t.
> >
> >   11   │   signed char _3;            // we have signed char from _3 to _6.
> >   12   │   signed char _4;
> >   13   │   signed char _5;
> >   14   │   signed char _6;
> >   15   │   uint8_t _11;
> >   16   │
> >   17   │   <bb 2> [local count: 1073741824]:
> >   18   │   _1 = (short unsigned int) a_7(D);
> >   19   │   _2 = (short unsigned int) b_8(D);
> >   20   │   x_9 = _1 * _2;
> >   21   │   overflow_p_10 = x_9 > 255;
> >   22   │   _3 = (signed char) overflow_p_10;
> >   23   │   _4 = -_3;
> >   24   │   _5 = (signed char) x_9;
> >   25   │   _6 = _4 | _5;
> >   26   │   _11 = (uint8_t) _6;
>
> I see.   That's odd.  In fact I'd say we'd want to transform
> (nop_convert (bit_{ior,and,xor}:c (convert @0) @1))
> to move the conversion inside like
>
> (for op (bit_ior bit_and bit_xor)
>  (simplify
>   (nop_convert (op:c (convert @0) @1))
>   (op:type (convert @0) (convert @1))))
>
> >   27   │   return _11;
> >
> > When we take a look for WT is uint64_t, NT is uint32_t, we will have 
> > unsigned.
> >
> >   11   │   unsigned int _3;
> >   12   │   unsigned int _4;
> >   13   │   unsigned int _5;
> > ...
> >   26   │   _6 = _4 | _5;
> >
> > The outer convert with mark '?' is introduced to match both of the above 2 
> > cases.
>
> Thanks and sorry for the delay...
> Richard.
>
> > Pan
> >
> >
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: Monday, October 27, 2025 9:14 PM
> > To: Li, Pan2 <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; 
> > [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> > Liu, Hongtao <[email protected]>; [email protected]
> > Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
> >
> > On Sun, Oct 26, 2025 at 11:58 AM Li, Pan2 <[email protected]> wrote:
> > >
> > > Hi Richard,
> > >
> > > I had a try for 2 parts those days.
> > >
> > > 1. Remove outer convert?, it will make the pattern failed to match as the 
> > > type is signed
> > >      without that convert, and the types_match (type, @0, @1) to be false.
> > > 2.  Merge 2 into one pattern with (convert?@4 @0) but it will also result 
> > > in
> > >      pattern match fail(when NT is uint32_t and WT is uint64_t).
> > >      Because the types_match (type, capture[0], capture[1]) will be 
> > > false, as type is SI and capture is DI.
> > >
> > > So I think it is possible to separate the common part into a helper 
> > > pattern like unsigned_sat_mul_helper.
> > > And make the real unsigned_integer_sat_mul(@0, @1) for the first one, and 
> > > unsigned_integer_sat_mul((convert @0), (convert @1)
> > > for the second one.
> > > Then we don't need to duplicate most of the patterns up to a point. How 
> > > do you think of it? Thanks a lot.
> >
> > Maybe we can first get one of the patterns to a form I understand (the
> > one with the many conversions)?
> > Splitting out parts might only obfuscate things.  It would be maybe
> > nice to be able to merge different
> > alternatives within the same (match ..) sharing both the name and the
> > conditions, like
> >
> > (match (foo @0 @1)
> >  (convert (mult @0 @1))
> >  (widen_mult @0 @1)
> >  (if (...)))
> >
> > If you can structure the two patterns in a way that the (if ...) part
> > is identical that would help.
> >
> > Some question:
> >
> > +   (convert?
> > +    (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
> > +            (convert (mult_op:c@3 (convert@4 @0) (convert@5 @1)))))
> >
> > So I get that we want a widening multiplication but that's either a 
> > widen_mult
> > already or not.  We might want to have a
> >
> > (match
> >  (widening_mult @0 @1)
> >  (widen_mult @0 @1))
> > (match
> >  (widening_mult @0 @1)
> >  (mult (convert @0) (convert @1))
> >  (if (it is widening))
> >
> > to help merging.  Now for the question - part of the pattern is
> >
> >   (convert? (bit_ior (negate (convert from-bool)) (convert (widening_mult 
> > ...)))
> >
> > I wonder whether you've seen the outer conversion being moved inside the 
> > IOR?
> > At least I wonder why there's the (convert ...) around the widening_mult and
> > why that's not conditional?  The textual description of the pattern mentions
> > (NT)x | (NT)overflow_p, so the result should already be in NT, no
> > outer conversion
> > required?  That said, (NT)((WT)x | (WT)overflow_p) would be equally valid,
> > but we fold that inside the IOR?  But not always?
> >
> > Thanks,
> > Richard.
> >
> > > Pan
> > >
> > > -----Original Message-----
> > > From: Li, Pan2
> > > Sent: Friday, October 24, 2025 9:52 AM
> > > To: 'Richard Biener' <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected]; 
> > > [email protected]; [email protected]; Chen, Ken 
> > > <[email protected]>; Liu, Hongtao <[email protected]>; 
> > > [email protected]
> > > Subject: RE: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
> > >
> > > Thanks Richard for comments.
> > >
> > > Assume we talk about the form 7 as below:
> > >
> > > #define DEF_SAT_U_MUL_FMT_7(NT, WT)             \
> > > NT __attribute__((noinline))                    \
> > > sat_u_mul_##NT##_from_##WT##_fmt_7 (NT a, NT b) \
> > > {                                               \
> > >   WT x = (WT)a * (WT)b;                         \
> > >   NT max = -1;                                  \
> > >   bool overflow_p = x > (WT)(max);              \
> > >   return -(NT)(overflow_p) | (NT)x;             \
> > > }
> > >
> > > > So why is matching the conversion necessary at all, that is, why is 
> > > > (mult @0 @1)
> > > > not sufficient here?
> > >
> > > Because there are many different types combinations, unsigned SAT_MUL 
> > > need the help of a wider type.
> > > For example:
> > >
> > > If NT is uint32_t, WT is uint128_t, we need the convert before widen_mul:
> > >   21   │   _12 = (unsigned long) a_6(D);
> > >   22   │   _13 = (unsigned long) b_7(D);
> > >   23   │   x_8 = _12 w* _13;
> > >
> > > If NT is uint64_t, WT is uint128_t, we don't have the convert.
> > >   19   │   x_8 = a_6(D) w* b_7(D);
> > >
> > > > You are not looking at the types of @0 or @1 at
> > > > all, AFAICS
> > > > one could be 'char' and one could be 'short'?
> > >
> > > The type is restricted by (if (types_match (type, @0, @1)). Aka the 
> > > result must have the same type as @0 and @1.
> > >
> > > > Captures on optional nodes are "merged" to the operand, so for 
> > > > (convert?@4 @0)
> > > > when there is no conversion @4 == @0.  That is it behaves as if there 
> > > > were two
> > > > captures at the place of @0.
> > >
> > > I see, sounds like alias here, thanks for the explanation.
> > >
> > > > But isn't _4 | _5 still a sat_u_mul-8-u8 but with a conversion to the
> > > > desired type?
> > > > That is, I wonder if even when the conversion is missing, the
> > > > saturated operation
> > > > can be matched by using (original-type)SAT_UXYZ (...), implying the 
> > > > SAT_UXYZ
> > > > produces a uint8_t (in this case)?  Not sure I phrased this in an
> > > > understandable way ;)
> > >
> > > > I'll note that the outer conversion is poorly constrained given the 
> > > > conversion
> > > > around the multiplication/inside the negate is what determines the
> > > > semantics of it
> > > > and that's not constrained either?
> > >
> > > For NT is uint8_t and WT is uint128_t, we have the final convert from 
> > > gimple. The bigger types mentioned in above
> > > don't have such convert.  The _11 and _6 should be almost the same, maybe 
> > > nop_convert here is good enough if it failed
> > > to match that.  I will have a try and keep you posted.
> > >
> > >   14   │   signed char _6;
> > >   15   │   uint8_t _11;
> > >  ...
> > >   26   │   _3 = (signed char) overflow_p_10;
> > >   27   │   _4 = -_3;
> > >   28   │   _5 = (signed char) x_9;
> > >   29   │   _6 = _4 | _5;
> > >   30   │   _11 = (uint8_t) _6;
> > >   31   │   return _11;
> > >
> > > Pan
> > >
> > >
> > > -----Original Message-----
> > > From: Richard Biener <[email protected]>
> > > Sent: Thursday, October 23, 2025 7:08 PM
> > > To: Li, Pan2 <[email protected]>
> > > Cc: [email protected]; [email protected]; [email protected]; 
> > > [email protected]; [email protected]; Chen, Ken 
> > > <[email protected]>; Liu, Hongtao <[email protected]>; 
> > > [email protected]
> > > Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
> > >
> > > On Wed, Oct 22, 2025 at 2:21 PM Li, Pan2 <[email protected]> wrote:
> > > >
> > > > Thanks Richard for comments.
> > > >
> > > > > I'm a bit confused as to this for and the explicit widen_mult case in 
> > > > > the 2nd
> > > > > pattern below.  In fact, the patterns look similar enough to me and I 
> > > > > wonder
> > > > > whether the consumer can handle the outer conversion and the 
> > > > > conversion
> > > > > of the leafs of the multiplication?  The conversion of the leafs 
> > > > > doesn't have
> > > > > any constraints, nor is it obvious why there's an outer conversion 
> > > > > around the
> > > > > IOR.
> > > >
> > > > The two pattern looks similar up to a point. I separate them into 2 
> > > > mostly comes from
> > > > the inner convert about mult, aka (mult (convert@4 @0) (convert@5 @1)). 
> > > > Some diff
> > > > type combination need to convert while others are not.
> > > >
> > > > I tried to leverage something like (mult (convert?@4 @0) (convert?@5 
> > > > @1)) to put
> > > > them together. But it may has additional sematics because add '?' after 
> > > > convert will
> > > > cover (mult (convert@4 @0) @1) which is not expected.
> > >
> > > So why is matching the conversion necessary at all, that is, why is (mult 
> > > @0 @1)
> > > not sufficient here?  You are not looking at the types of @0 or @1 at
> > > all, AFAICS
> > > one could be 'char' and one could be 'short'?
> > >
> > > > Another problem about add '?' after convert is capture @4/@5, if there 
> > > > is no need
> > > > to convert, I am not sure how to take care of the @4 from the “with“” 
> > > > scope because it
> > > > is optional. The code acts on capture @4/@5 is not correct when there 
> > > > is no convert.
> > >
> > > Captures on optional nodes are "merged" to the operand, so for 
> > > (convert?@4 @0)
> > > when there is no conversion @4 == @0.  That is it behaves as if there 
> > > were two
> > > captures at the place of @0.
> > >
> > > > I may have another try to merge them into one, but is there any 
> > > > suggestion here?
> > > >
> > > > > outer conversion around the
> > > > > IOR.
> > > >
> > > > Some type(s) may need to do a convert to the final type, like 
> > > > sat_u_mul-8-u8-from-u32,
> > > > The seq may look like blow.
> > > >
> > > >   25   │   _6 = _4 | _5;
> > > >   26   │   _11 = (uint8_t) _6;
> > > >   27   │   return _11;
> > >
> > > But isn't _4 | _5 still a sat_u_mul-8-u8 but with a conversion to the
> > > desired type?
> > > That is, I wonder if even when the conversion is missing, the
> > > saturated operation
> > > can be matched by using (original-type)SAT_UXYZ (...), implying the 
> > > SAT_UXYZ
> > > produces a uint8_t (in this case)?  Not sure I phrased this in an
> > > understandable way ;)
> > >
> > > I'll note that the outer conversion is poorly constrained given the 
> > > conversion
> > > around the multiplication/inside the negate is what determines the
> > > semantics of it
> > > and that's not constrained either?
> > >
> > > Richard.
> > >
> > > > Pan
> > > >
> > > > -----Original Message-----
> > > > From: Richard Biener <[email protected]>
> > > > Sent: Wednesday, October 22, 2025 5:00 PM
> > > > To: Li, Pan2 <[email protected]>
> > > > Cc: [email protected]; [email protected]; 
> > > > [email protected]; [email protected]; [email protected]; Chen, 
> > > > Ken <[email protected]>; Liu, Hongtao <[email protected]>; 
> > > > [email protected]
> > > > Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
> > > >
> > > > On Mon, Oct 20, 2025 at 3:10 PM <[email protected]> wrote:
> > > > >
> > > > > From: Pan Li <[email protected]>
> > > > >
> > > > > This patch would like to try to match the the unsigned
> > > > > SAT_MUL form 7, aka below:
> > > > >
> > > > >   #define DEF_SAT_U_MUL_FMT_7(NT, WT)             \
> > > > >   NT __attribute__((noinline))                    \
> > > > >   sat_u_mul_##NT##_from_##WT##_fmt_7 (NT a, NT b) \
> > > > >   {                                               \
> > > > >     WT x = (WT)a * (WT)b;                         \
> > > > >     NT max = -1;                                  \
> > > > >     bool overflow_p = x > (WT)(max);              \
> > > > >     return -(NT)(overflow_p) | (NT)x;             \
> > > > >   }
> > > > >
> > > > > while WT is uint128_t, uint64_t, uint32_t and uint16_t, and
> > > > > NT is uint64_t, uint32_t, uint16_t or uint8_t.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         * match.pd: Add pattern for SAT_MUL form 5 include
> > > > >         mul and widen_mul.
> > > > >
> > > > > Signed-off-by: Pan Li <[email protected]>
> > > > > ---
> > > > >  gcc/match.pd | 35 +++++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 35 insertions(+)
> > > > >
> > > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > > index bfc51e6579a..0f55a82d989 100644
> > > > > --- a/gcc/match.pd
> > > > > +++ b/gcc/match.pd
> > > > > @@ -3749,6 +3749,41 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > >        bool widen_mult_p = prec * 2 == widen_prec;
> > > > >       }
> > > > >       (if (c2_is_type_precision_p && widen_mult_p)))))
> > > > > + (for mult_op (mult widen_mult)
> > > >
> > > > I'm a bit confused as to this for and the explicit widen_mult case in 
> > > > the 2nd
> > > > pattern below.  In fact, the patterns look similar enough to me and I 
> > > > wonder
> > > > whether the consumer can handle the outer conversion and the conversion
> > > > of the leafs of the multiplication?  The conversion of the leafs 
> > > > doesn't have
> > > > any constraints, nor is it obvious why there's an outer conversion 
> > > > around the
> > > > IOR.
> > > >
> > > > > +  (match (unsigned_integer_sat_mul @0 @1)
> > > > > +   (convert?
> > > > > +    (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
> > > > > +            (convert (mult_op:c@3 (convert@4 @0) (convert@5 @1)))))
> > > > > +   (if (types_match (type, @0, @1))
> > > > > +    (with
> > > > > +     {
> > > > > +      unsigned prec = TYPE_PRECISION (type);
> > > > > +      unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> > > > > +      unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4));
> > > > > +      unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5));
> > > > > +
> > > > > +      wide_int max = wi::mask (prec, false, widen_prec);
> > > > > +      bool c2_is_max_p = wi::eq_p (wi::to_wide (@2), max);
> > > > > +      bool widen_mult_p = mult_op == WIDEN_MULT_EXPR && cvt4_prec == 
> > > > > cvt5_prec
> > > > > +       && widen_prec == cvt5_prec * 2;
> > > > > +      bool mult_p = mult_op == MULT_EXPR && cvt4_prec == cvt5_prec
> > > > > +       && cvt4_prec == widen_prec && widen_prec > prec;
> > > > > +     }
> > > > > +     (if (c2_is_max_p && (mult_p || widen_mult_p)))))))
> > > > > + (match (unsigned_integer_sat_mul @0 @1)
> > > > > +  (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
> > > > > +          (convert (widen_mult:c@3 @0 @1)))
> > > > > +  (if (types_match (type, @0, @1))
> > > > > +   (with
> > > > > +    {
> > > > > +     unsigned prec = TYPE_PRECISION (type);
> > > > > +     unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> > > > > +
> > > > > +     wide_int max = wi::mask (prec, false, widen_prec);
> > > > > +     bool c2_is_max_p = wi::eq_p (wi::to_wide (@2), max);
> > > > > +     bool widen_mult_p = prec * 2 == widen_prec;
> > > > > +    }
> > > > > +    (if (c2_is_max_p && widen_mult_p)))))
> > > > >  )
> > > > >
> > > > >  /* The boundary condition for case 10: IMM = 1:
> > > > > --
> > > > > 2.43.0
> > > > >

Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7

Reply via email to