https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122749
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
A few other examples (though not regressions):
```
#include <arm_sve.h>
svint32_t f0(svint32_t a, svint32_t b, svint32_t c)
{
  return (a * b) + c;
}
svint32_t f(svuint32_t a, svuint32_t b, svint32_t c)
{
  return (svint32_t)(a * b) + c;
}
svint32_t f1(svint32_t a, svint32_t b, svint32_t c)
{
  svuint32_t aa = (svuint32_t)a;
  svuint32_t bb = (svuint32_t)b;
  return (svint32_t)(aa * bb) + c;
}
svuint32_t f2(svint32_t a, svint32_t b, svuint32_t c)
{
  return (svuint32_t)(a * b) + c;
}
```
f0 works as expected.
Note the above is about FMA but COND_FMA is similar.
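As a scalar illustration (not SVE code, and only an assumption about why f, f1 and f2 are interesting): the sign-changing casts in those functions do not alter any bits, so the low 32 bits of the multiply-add come out the same as in f0. The helper names below are hypothetical and exist only for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a*b + c with the operands read as signed
   (scalar stand-in for f0).  The 64-bit intermediate avoids
   signed-overflow UB.  */
static uint32_t madd_signed_bits(int32_t a, int32_t b, int32_t c)
{
  int64_t wide = (int64_t)a * (int64_t)b + (int64_t)c;
  return (uint32_t)wide;  /* conversion to unsigned is modulo 2^32 */
}

/* The same operands reinterpreted as unsigned first
   (scalar stand-in for the casts in f, f1 and f2).  */
static uint32_t madd_unsigned_bits(int32_t a, int32_t b, int32_t c)
{
  return (uint32_t)a * (uint32_t)b + (uint32_t)c;  /* wraps modulo 2^32 */
}

/* Small self-check: both readings give identical result bits.  */
static void self_check(void)
{
  assert(madd_signed_bits(-3, 7, 5) == madd_unsigned_bits(-3, 7, 5));
  assert(madd_signed_bits(INT32_MIN, INT32_MIN, -1)
         == madd_unsigned_bits(INT32_MIN, INT32_MIN, -1));
}
```

If that reading is right, the signedness of the intermediate types should not matter for whether a multiply-add can be used.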
Though I wonder whether, for the unconditional FMA, we could just have the
pattern which combine/fwprop tries:
(set (reg:VNx4SI 127 [ vect_x_12.16 ])
(plus:VNx4SI (mult:VNx4SI (reg:VNx4SI 174 [ vect__4.12_45 ])
(reg:VNx4SI 119 [ vect_vec_iv_.13 ]))
(reg:VNx4SI 127 [ vect_x_12.16 ])))
fwprop does try it too:
```
propagating insn 8 into insn 9, replacing:
(set (reg:VNx4SI 107 [ _6 ])
(plus:VNx4SI (reg:VNx4SI 108 [ _1 ])
(reg/v:VNx4SI 106 [ cD.14130 ])))
failed to match this instruction:
(set (reg:VNx4SI 107 [ _6 ])
(plus:VNx4SI (mult:VNx4SI (reg/v:VNx4SI 104 [ aD.14128 ])
(reg/v:VNx4SI 105 [ bD.14129 ]))
(reg/v:VNx4SI 106 [ cD.14130 ])))
```
That pattern would need an extra clobber for the scratch and would then be split.
This will at least give us the FMA, but it will use an always-true predicate
instead of the predicate of the loop, which might be OK ...