Thanks Richard. >> + (convert (widen_mult:c@3 (convert@4 (convert @0)) >> So I added 2 times convert to match above sematics. Do you indicate that we >> only need one convert here? > Yes, the above extra widening looks odd to me.
Remove the inner convert of (convert @0)) will make the pattern fail to match. The after_dom_children will meet MUL_EXPR first and then the convert_mult_to_widen will introduce the convert, as well as wide_mult. The 220t.fab1 doesn't have wide_mult, so I think we need to move the sat_mul before widening_mul pass to get ride of the Inner convert up to a point. 20 │ _1 = (__int128 unsigned) a_8(D); 21 │ _2 = (__int128 unsigned) b_9(D); 22 │ x_10 = _1 * _2; 23 │ _3 = x_10 >> 16; 24 │ hi_11 = (uint16_t) _3; 25 │ _4 = hi_11 != 0; 26 │ _14 = (signed short) _4; 27 │ _5 = -_14; 28 │ lo.0_6 = (signed short) x_10; 29 │ _7 = _5 | lo.0_6; 30 │ _12 = (uint16_t) _7; The 2 times covert will be present from 272t.optmized two. Pan -----Original Message----- From: Richard Biener <richard.guent...@gmail.com> Sent: Tuesday, September 2, 2025 10:01 PM To: Li, Pan2 <pan2...@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp....@gmail.com; Chen, Ken <ken.c...@intel.com>; Liu, Hongtao <hongtao....@intel.com> Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul On Tue, Sep 2, 2025 at 3:35 PM Li, Pan2 <pan2...@intel.com> wrote: > > Thanks Richard for comments. > > > Are the inner (convert ..) around @0 and @1 necessary? The whole > > thing as written seems to match (long)(short)x * (long)(unsigned int)y > > for 'char' x and y? > > The gimple after wide-mul has 2 times convert similar as below. > > 21 <bb 2> [local count: 1073741824]: > 22 _1 = (__int128 unsigned) a_8(D); > 23 _2 = (__int128 unsigned) b_9(D); > 24 _34 = (unsigned long) _1; > 25 _33 = (unsigned long) _2; > 26 x_10 = _34 w* _33; > 27 _3 = x_10 >> 64; > 28 hi_11 = (uint64_t) _3; > 29 lo_12 = (uint64_t) x_10; > > + (convert (widen_mult:c@3 (convert@4 (convert @0)) > > So I added 2 times convert to match above sematics. Do you indicate that we > only need one convert here? Yes, the above extra widening looks odd to me. > > I also wonder if there isn't any TYPE_UNSIGNED check > > missing below given the comment mentions unsigned WT. > > The outer checked the type with TYPE_UNSIGNED, and inner checked the > types_match (type, @0, @1)). > So I think this ensure it only works for unsigned. Ah, OK. > 3184 /* Saturation add for unsigned integer. */ > 3185 (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)) > > Pan > > -----Original Message----- > From: Richard Biener <richard.guent...@gmail.com> > Sent: Tuesday, September 2, 2025 8:31 PM > To: Li, Pan2 <pan2...@intel.com> > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp....@gmail.com; Chen, Ken <ken.c...@intel.com>; > Liu, Hongtao <hongtao....@intel.com> > Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for > widen-mul > > On Mon, Sep 1, 2025 at 6:57 AM <pan2...@intel.com> wrote: > > > > From: Pan Li <pan2...@intel.com> > > > > This patch would like to try to match the the unsigned > > SAT_MUL form 4, aka below: > > > > #define DEF_SAT_U_MUL_FMT_5(NT, WT) \ > > NT __attribute__((noinline)) \ > > sat_u_mul_##NT##_from_##WT##_fmt_5 (NT a, NT b) \ > > { \ > > WT x = (WT)a * (WT)b; \ > > NT hi = x >> (sizeof(NT) * 8); \ > > NT lo = (NT)x; \ > > return lo | -!!hi; \ > > } > > > > while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or uint64_t. > > > > gcc/ChangeLog: > > > > * match.pd: Add pattern for SAT_MUL form 5. > > * tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children): > > Try match pattern for IOR. > > > > Signed-off-by: Pan Li <pan2...@intel.com> > > --- > > gcc/match.pd | 25 +++++++++++++++++++++++++ > > gcc/tree-ssa-math-opts.cc | 1 + > > 2 files changed, 26 insertions(+) > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > index b1d7a3a1b73..4a3587017d1 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -3660,6 +3660,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > bool c2_is_type_precision_p = c2 == prec; > > } > > (if (widen_prec > prec && c2_is_type_precision_p && c4_is_max_p))))) > > + (match (unsigned_integer_sat_mul @0 @1) > > + /* SAT_U_MUL (X, Y) = { > > + WT x = (WT)a * (WT)b; > > + NT hi = x >> (sizeof(NT) * 8); > > + NT lo = (NT)x; > > + return lo | -!!hi; > > + } while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or > > uint64_t. */ > > + (convert1? > > + (bit_ior (convert? (negate (convert (ne (convert (rshift @3 > > INTEGER_CST@2)) > > + integer_zerop)))) > > + (convert (widen_mult:c@3 (convert@4 (convert @0)) > > + (convert@5 (convert @1)))))) > > Are the inner (convert ..) around @0 and @1 necessary? The whole > thing as written seems to match (long)(short)x * (long)(unsigned int)y > for 'char' x and y? I also wonder if there isn't any TYPE_UNSIGNED check > missing below given the comment mentions unsigned WT. > > > + (if (types_match (type, @0, @1)) > > + (with > > + { > > + unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3)); > > + unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4)); > > + unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5)); > > + bool widen_mult_p = cvt4_prec == cvt5_prec && widen_prec == > > cvt5_prec * 2; > > + > > + unsigned c2 = tree_to_uhwi (@2); > > + unsigned prec = TYPE_PRECISION (type); > > + bool c2_is_type_precision_p = c2 == prec; > > + } > > + (if (widen_mult_p && c2_is_type_precision_p))))) > > ) > > > > /* The boundary condition for case 10: IMM = 1: > > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > > index bfad4cf5c99..fcbb752a5e3 100644 > > --- a/gcc/tree-ssa-math-opts.cc > > +++ b/gcc/tree-ssa-math-opts.cc > > @@ -6509,6 +6509,7 @@ math_opts_dom_walker::after_dom_children (basic_block > > bb) > > break; > > > > case BIT_IOR_EXPR: > > + match_unsigned_saturation_mul (&gsi, as_a<gassign *> (stmt)); > > match_saturation_add_with_assign (&gsi, as_a<gassign *> > > (stmt)); > > match_unsigned_saturation_trunc (&gsi, as_a<gassign *> > > (stmt)); > > /* fall-through */ > > -- > > 2.43.0 > >