> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: 02 February 2026 18:12
> To: Tamar Christina <[email protected]>
> Cc: [email protected]; nd <[email protected]>
> Subject: RE: [PATCH]middle-end: use inner variable when determining
> deferred FMA order [PR123898]
>
> On Mon, 2 Feb 2026, Tamar Christina wrote:
>
> > > > +/* If ARG is a view_convert that only changes the sign then strip the
> outer
> > > > + conversion away. It does not strip conversions recursively.
> > > > Otherwise
> > > > + return ARG. */
> > > > +
> > > > +static tree
> > > > +strip_nop_view_converts (tree arg)
> > > > +{
> > > > + if (TREE_CODE (arg) != SSA_NAME)
> > > > + return arg;
> > > > +
> > > > + gimple *assign = SSA_NAME_DEF_STMT (arg);
> > > > + tree lhs_type, var;
> > > > + if (gimple_assign_cast_p (assign)
> > > > + && (var = gimple_assign_rhs1 (assign))
> > > > + && TREE_CODE (var) != SSA_NAME
> > > > + && (var = TREE_OPERAND (var, 0))
> > >
> > > This only works for V_C_Es, does the same issue not exist for
> > > scalar NOP_EXPR conversions?
> > >
> >
> > It also strips NOP_EXPR since gimple_assign_cast_p checks for
> > CONVERT_EXPR_CODE_P, VIEW_CONVERT_EXPR and FIX_TRUNC_EXPR
> > but we'd do the stripping only if there's no precision or class change.
>
> It does not, since you have TREE_CODE (var) != SSA_NAME but for
> NOP_EXPR this will be a unary gassign, not single (V_C_E is
> tcc_reference).
Ah! Apologies I had forgotten that NOP_EXPR is a unary gassing.
>
> You possibly might want to use gimple_extract_op instead which
> should be able to see through this detail (hopefully, unchecked).
Will try.
Thanks,
Tamar
>
>
> > > > + && (lhs_type = TREE_TYPE (var))
> > >
> > > var_type would be less odd, not sure why you need a temporary here,
> > > either. Indenting? For such purpose I usually write
> > >
> > > && ((lhs_type = TREE_TYPE (var), true)
> > >
> > > to indicate there's not really any non-NULL test necessary.
> > >
> >
> > Ack. Updated patch:
> >
> > --
> >
> > If we defer an FMA creation the code tries to determine the order of the
> > operands before deferring. To do this it compares the operands against the
> > result expression (which should contain the multiplication expression).
> >
> > However the multiply might be wrapped in a conversion. This change has us
> strip
> > one level of conversion (the most that convert_mult_to_fma) supports
> handling
> > and only then do the comparison.
> >
> > We cannot strip ops[0] and ops[1] and store them stripped since after the
> > deferrence, if we create an FMA we need to know the original types and
> > convert_mult_to_fma handles the conversions during FMA creation anyway.
> >
> > There's probably a similar helper to strip_nop_view_converts but I couldn't
> > find one, since many of the stripping helpers are recursive or don't support
> > stripping VIEW_CONVERTS.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/123897
> > * tree-ssa-math-opts.cc (strip_nop_view_converts): New.
> > (convert_mult_to_fma): Use it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/123897
> > * gcc.target/aarch64/sve/pr123898.c: New test.
> > -- inline copy of patch --
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> > new file mode 100644
> > index
> 0000000000000000000000000000000000000000..a5741d5058e81b8c34
> bcfacca5f456bc015308a1
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-O2 --param avoid-fma-max-bits=512 -
> march=armv9-a -msve-vector-bits=256 -fdump-tree-widening_mul" } */
> > +
> > +typedef __attribute__((__vector_size__(32))) char A;
> > +typedef __attribute__((__vector_size__(32))) signed char D;
> > +
> > +A c;
> > +char x;
> > +
> > +void
> > +foo(D d)
> > +{
> > + d *= x;
> > + c += (A)d;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "\.FMA" 1 "widening_mul" } } */
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index
> b1fa38a832af889972913553d75305a9420b7b53..b8fcb43ae2a1b0e4e11d
> 055eeae6f5d907e6b7e3 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -3353,6 +3353,28 @@ last_fma_candidate_feeds_initial_phi
> (fma_deferring_state *state,
> > return false;
> > }
> >
> > +/* If ARG is a convert that only changes the sign then strip the outer
> > + conversion away. It does not strip conversions recursively. Otherwise
> > + return ARG. */
> > +
> > +static tree
> > +strip_nop_view_converts (tree arg)
> > +{
> > + if (TREE_CODE (arg) != SSA_NAME)
> > + return arg;
> > +
> > + gimple *assign = SSA_NAME_DEF_STMT (arg);
> > + tree var;
> > + if (gimple_assign_cast_p (assign)
> > + && (var = gimple_assign_rhs1 (assign))
> > + && TREE_CODE (var) != SSA_NAME
> > + && (var = TREE_OPERAND (var, 0))
> > + && tree_nop_conversion_p (TREE_TYPE (var), TREE_TYPE (arg)))
> > + return var;
> > +
> > + return arg;
> > +}
> > +
> > /* Combine the multiplication at MUL_STMT with operands MULOP1 and
> MULOP2
> > with uses in additions and subtractions to form fused multiply-add
> > operations. Returns true if successful and MUL_STMT should be removed.
> > @@ -3613,11 +3635,11 @@ convert_mult_to_fma (gimple *mul_stmt, tree
> op1, tree op2,
> > {
> > gcc_checking_assert (!state->m_initial_phi);
> > gphi *phi;
> > - if (ops[0] == result)
> > + if (strip_nop_view_converts (ops[0]) == result)
> > phi = result_of_phi (ops[1]);
> > else
> > {
> > - gcc_assert (ops[1] == result);
> > + gcc_assert (strip_nop_view_converts (ops[1]) == result);
> > phi = result_of_phi (ops[0]);
> > }
> >
> >
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)