On Tue, 3 Feb 2026, Tamar Christina wrote:
> If we defer an FMA creation the code tries to determine the order of the
> operands before deferring. To do this it compares the operands against the
> result expression (which should contain the multiplication expression).
>
> However the multiply might be wrapped in a conversion. This change has us
> strip
> one level of conversion (the most that convert_mult_to_fma) supports handling
> and only then do the comparison.
>
> We cannot strip ops[0] and ops[1] and store them stripped since after the
> deferrence, if we create an FMA we need to know the original types and
> convert_mult_to_fma handles the conversions during FMA creation anyway.
>
> There's probably a similar helper to strip_nop_view_converts but I couldn't
> find one, since many of the stripping helpers are recursive or don't support
> stripping VIEW_CONVERTS.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR tree-optimization/123897
> * tree-ssa-math-opts.cc (strip_nop_view_converts): New.
> (convert_mult_to_fma): Use it.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/123897
> * gcc.target/aarch64/sve/pr123898.c: New test.
>
> ---
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..a5741d5058e81b8c34bcfacca5f456bc015308a1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr123898.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 --param avoid-fma-max-bits=512
> -march=armv9-a -msve-vector-bits=256 -fdump-tree-widening_mul" } */
> +
> +typedef __attribute__((__vector_size__(32))) char A;
> +typedef __attribute__((__vector_size__(32))) signed char D;
> +
> +A c;
> +char x;
> +
> +void
> +foo(D d)
> +{
> + d *= x;
> + c += (A)d;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\.FMA" 1 "widening_mul" } } */
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index
> b1fa38a832af889972913553d75305a9420b7b53..8990fa232bac304d09f170464b0b67ec62ee0b81
> 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -118,6 +118,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-ssa-math-opts.h"
> #include "dbgcnt.h"
> #include "cfghooks.h"
> +#include "gimple-match.h"
>
> /* This structure represents one basic block that either computes a
> division, or is a common dominator for basic block that compute a
> @@ -3353,6 +3354,26 @@ last_fma_candidate_feeds_initial_phi
> (fma_deferring_state *state,
> return false;
> }
>
> +/* If ARG is a convert that only changes the sign then strip the outer
> + conversion away. It does not strip conversions recursively. Otherwise
> + return ARG. */
> +
> +static tree
> +strip_nop_view_converts (tree arg)
> +{
> + if (TREE_CODE (arg) != SSA_NAME)
> + return arg;
> +
> + gimple *assign = SSA_NAME_DEF_STMT (arg);
> + gimple_match_op res_op;
> + if (gimple_assign_cast_p (assign)
> + && gimple_extract_op (assign, &res_op)
Sorry to be nit-picky, but I think you should elide gimple_assign_cast_p
here and after gimple_extract_op check
(CONVERT_EXPR_P (res_op.op) || res_op.op == VIEW_CONVERT_EXPR)
I'll note that gimple_assign_cast_p also allows FIX_TRUNC_EXPR
(but not FLOAT_EXPR). So IMO it's one of the "worst" predicates
we have.
OK with the above change.
Richard.
> + && tree_nop_conversion_p (TREE_TYPE (res_op.ops[0]), TREE_TYPE (arg)))
> + return res_op.ops[0];
> +
> + return arg;
> +}
> +
> /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2
> with uses in additions and subtractions to form fused multiply-add
> operations. Returns true if successful and MUL_STMT should be removed.
> @@ -3613,11 +3634,11 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree
> op2,
> {
> gcc_checking_assert (!state->m_initial_phi);
> gphi *phi;
> - if (ops[0] == result)
> + if (strip_nop_view_converts (ops[0]) == result)
> phi = result_of_phi (ops[1]);
> else
> {
> - gcc_assert (ops[1] == result);
> + gcc_assert (strip_nop_view_converts (ops[1]) == result);
> phi = result_of_phi (ops[0]);
> }
>
>
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)