[Bug tree-optimization/95060] vfnmsub132ps is not generated with -ffast-math

2021-08-10 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zamazan4ik at tut dot by

--- Comment #7 from Andrew Pinski  ---
*** Bug 91250 has been marked as a duplicate of this bug. ***

2020-05-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

2020-05-13 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:c0c39a765b0714aed36fced6fbba452a6619acb0

commit r11-350-gc0c39a765b0714aed36fced6fbba452a6619acb0
Author: Jakub Jelinek 
Date:   Wed May 13 11:21:02 2020 +0200

Fold single imm use of a FMA if it is a negation [PR95060]

match.pd already has simplifications for negation of a FMA (FMS, FNMA,
FNMS) call if it is single use, but when the widening_mul pass discovers
FMAs, nothing folds the statements anymore.

So, the following patch adjusts the widening_mul pass to handle that.

I had to adjust quite a lot of tests, because they contain nested FMAs
(one FMA feeding another one) and the patch results in some (equivalent)
changes in the chosen instructions.  Previously, the negation of one FMA's
result would result in the dependent FMA being adjusted for the negation,
but now the first FMA is adjusted instead.

2020-05-13  Jakub Jelinek  

PR tree-optimization/95060
* tree-ssa-math-opts.c (convert_mult_to_fma_1): Fold a NEGATE_EXPR
if it is the single use of the FMA internal builtin.

* gcc.target/i386/avx512f-pr95060.c: New test.
* gcc.target/i386/fma_double_1.c: Adjust expected insn counts.
* gcc.target/i386/fma_double_2.c: Likewise.
* gcc.target/i386/fma_double_3.c: Likewise.
* gcc.target/i386/fma_double_4.c: Likewise.
* gcc.target/i386/fma_double_5.c: Likewise.
* gcc.target/i386/fma_double_6.c: Likewise.
* gcc.target/i386/fma_float_1.c: Likewise.
* gcc.target/i386/fma_float_2.c: Likewise.
* gcc.target/i386/fma_float_3.c: Likewise.
* gcc.target/i386/fma_float_4.c: Likewise.
* gcc.target/i386/fma_float_5.c: Likewise.
* gcc.target/i386/fma_float_6.c: Likewise.
* gcc.target/i386/l_fma_double_1.c: Likewise.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.
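
The "equivalent changes" for nested FMAs described in the commit message can be pictured with a small sketch. Everything here is an illustrative assumption: the function names are hypothetical, the expression -(a*b + c) * d + e is a made-up example rather than one taken from the adjusted tests, and plain (unfused) arithmetic stands in for the fused operations.

```c
#include <assert.h>

/* Negation absorbed by the dependent operation (the older choice):
   t = FMA (a, b, c);  r = FNMA (t, d, e).  */
float
negate_in_second (float a, float b, float c, float d, float e)
{
  float t = a * b + c;          /* FMA:  a*b + c     */
  return -(t * d) + e;          /* FNMA: -(t*d) + e  */
}

/* Negation absorbed by the first operation (the newer choice):
   t = FNMS (a, b, c);  r = FMA (t, d, e).  */
float
negate_in_first (float a, float b, float c, float d, float e)
{
  float t = -(a * b) - c;       /* FNMS: -(a*b) - c  */
  return t * d + e;             /* FMA:  t*d + e     */
}

/* Both orderings compute -(a*b + c) * d + e.  Small integer inputs keep
   every intermediate value exactly representable.  */
void
check_nested_equivalence (void)
{
  assert (negate_in_second (2.0f, 3.0f, 4.0f, 5.0f, 6.0f)
          == negate_in_first (2.0f, 3.0f, 4.0f, 5.0f, 6.0f));
}
```

Either form needs two fused operations, which is consistent with the tests only needing their expected instruction counts adjusted.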

2020-05-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Created attachment 48515
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48515&action=edit
gcc11-pr95060.patch

Untested fix.

2020-05-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |11.0
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-05-12
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Target|                            |x86_64-*-* i?86-*-*

--- Comment #3 from Richard Biener  ---
FMA generation already folds the FMA stmt:

  if (cond)
    fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
                                           op2, addop, else_value);
  else
    fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
  gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
  gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
                                                               use_stmt));
  gsi_replace (&gsi, fma_stmt, true);
  /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
     regardless of where the negation occurs.  */
  gimple *orig_stmt = gsi_stmt (gsi);
  if (fold_stmt (&gsi, follow_all_ssa_edges))
    {
      if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
        gcc_unreachable ();
      update_stmt (gsi_stmt (gsi));
    }

but not the negate it feeds, since with -ffast-math the canonical form
seems to be -((a[i] * b[i]) + c[i]) (reassoc does this).

float r[8], a[8], b[8], c[8];

void
test_fnms (void)
{
  for (int i = 0; i < 8; i++)
    r[i] = -((a[i] * b[i]) + c[i]);
}

would be an alternative testcase, not handled without -ffast-math either.

I'd suggest folding the single-use stmt of fma_stmt's lhs, if any
[and if it is a negate].
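
A minimal harness around the alternative testcase, as a sketch: it only checks that the kernel's result matches the -(a*b) - c form that an FNMS operation computes. The helper names count_fnms_mismatches and self_check and the input values are hypothetical additions, not part of the report.

```c
#include <assert.h>

#define N 8

float r[N], a[N], b[N], c[N];

/* Alternative testcase from the comment above.  */
void
test_fnms (void)
{
  for (int i = 0; i < N; i++)
    r[i] = -((a[i] * b[i]) + c[i]);
}

/* Hypothetical helper: fill the inputs with small integers (so every
   intermediate value is exactly representable), run the kernel, and
   compare each element against -(a*b) - c.  Returns the number of
   mismatching elements.  */
int
count_fnms_mismatches (void)
{
  int bad = 0;

  for (int i = 0; i < N; i++)
    {
      a[i] = (float) (i + 1);
      b[i] = (float) (2 * i - 3);
      c[i] = (float) (5 - i);
    }

  test_fnms ();

  for (int i = 0; i < N; i++)
    if (r[i] != -(a[i] * b[i]) - c[i])
      bad++;

  return bad;
}

/* Hypothetical self-check: aborts if any element disagrees.  */
void
self_check (void)
{
  assert (count_fnms_mismatches () == 0);
}
```

To see whether the negation is folded into the fused operation, one could compile test_fnms with optimization and an FMA-capable target option (for example -O3 -mfma; these flags are an assumption, not taken from the report) and look for a vfnmsub instruction in the generated assembly.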

2020-05-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I see the needed simplifiers in match.pd:
  (simplify
   (negate (fmas@3 @0 @1 @2))
   (if (single_use (@3))
    (IFN_FNMS @0 @1 @2)))
but perhaps the problem is that there is no forwprop after widening_mul that
would perform that optimization?
So, when widening_mul matches some FMA, shall it check whether the result of
the IFN_{FMA,FMS,FNMA,FNMS} it created is used by a negation and, if so, try
to gimple_fold it?
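
The effect of such negation simplifiers can be summarized as a small table: negating the result of one FMA flavour gives another flavour on the same operands. The sketch below is only an arithmetic illustration; the enum and function names are hypothetical, and plain unfused arithmetic stands in for the internal functions.

```c
#include <assert.h>

/* The four fused multiply-add flavours named in the thread.  */
enum fma_kind { K_FMA, K_FMS, K_FNMA, K_FNMS };

/* Unfused reference semantics of each flavour (illustration only).  */
float
eval_kind (enum fma_kind k, float a, float b, float c)
{
  switch (k)
    {
    case K_FMA:  return a * b + c;
    case K_FMS:  return a * b - c;
    case K_FNMA: return -(a * b) + c;
    case K_FNMS: return -(a * b) - c;
    }
  return 0.0f;
}

/* Flavour that computes the negation of K's result on the same
   operands, e.g. -(a*b + c) == -(a*b) - c, so FMA maps to FNMS.  */
enum fma_kind
negated_kind (enum fma_kind k)
{
  switch (k)
    {
    case K_FMA:  return K_FNMS;
    case K_FMS:  return K_FNMA;
    case K_FNMA: return K_FMS;
    case K_FNMS: return K_FMA;
    }
  return k;
}

/* Check the table on one set of exactly representable inputs.  */
void
check_negation_table (void)
{
  for (int k = K_FMA; k <= K_FNMS; k++)
    assert (eval_kind (negated_kind ((enum fma_kind) k), 2.0f, 3.0f, 4.0f)
            == -eval_kind ((enum fma_kind) k, 2.0f, 3.0f, 4.0f));
}
```

The single_use condition in the quoted pattern matters because, if the un-negated result had other users, the original operation would have to stay and the fold would add an operation instead of removing the negation.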

2020-05-11 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95060

Uroš Bizjak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #1 from Uroš Bizjak  ---
Related to PR86999.