[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #14 from Richard Biener  ---
Should be fixed now.

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:ac68f904fe31baf80fa53218f1d8ee033bd8c79b

commit r13-2113-gac68f904fe31baf80fa53218f1d8ee033bd8c79b
Author: Richard Biener 
Date:   Thu Aug 18 11:10:30 2022 +0200

middle-end/106617 - fix fold_binary_op_with_conditional_arg pattern issue

Now that we have parts of fold_binary_op_with_conditional_arg duplicated
in match.pd and are using ! to take or throw away the result we have to
be careful to not have both implementations play games which each other,
causing quadratic behavior.  In particular the match.pd implementation
requires both arms to simplify while the fold-const.cc is happy with
just one arm simplifying (something we cannot express in match.pd).

The fix is to simply not enable the match.pd pattern for GENERIC.

PR middle-end/106617
* match.pd ((a ? b : c) > d -> a ? (b > d) : (c > d)): Fix
guard, disable on GENERIC to not cause quadratic behavior
with the fold-const.cc implementation and the use of !

* gcc.dg/pr106617.c: New testcase.

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

--- Comment #12 from Richard Biener  ---
Btw, just for reference the original looks like

 fc_cpu_order = ( __builtin_constant_p(( __builtin_constant_p(nr_cpu_ids) ? (
((nr_cpu_ids) == 1) ? 1 : (1UL << (( __builtin_constant_p((nr_cpu_ids) - 1) ? (
__builtin_constant_p((nr_cpu_ids) - 1) ? ( ((nr_cpu_ids) - 1) < 2 ? 0 :
((nr_cpu_ids) - 1) & (1ULL << 63) ? 63 : ((nr_cpu_ids) - 1) & (1ULL << 62) ? 62
: ((nr_cpu_ids) - 1) & (1ULL << 61) ? 61 : ((nr_cpu_ids) - 1) & (1ULL << 60) ?
60 : ((nr_cpu_ids) - 1) & (1ULL << 59) ? 59 : ((nr_cpu_ids) - 1) & (1ULL << 58)
? 58 : ((nr_cpu_ids) - 1) & (1ULL << 57) ? ...

so some fancy ceil_log2.

If you take a simplified

unsigned long fc_setup_exch_mgr(int nr_cpu_ids_m1)
{
  return nr_cpu_ids_m1)) & (1<<21) ? 21
  : ((nr_cpu_ids_m1)) & (1<<20) ? 20
  : ((nr_cpu_ids_m1)) & (1<<19) ? 19
  : ((nr_cpu_ids_m1)) & (1<<18) ? 18
  : ((nr_cpu_ids_m1)) & (1<<17) ? 17
  : ((nr_cpu_ids_m1)) & (1<<16) ? 16
  : ((nr_cpu_ids_m1)) & (1<<15) ? 15
  : ((nr_cpu_ids_m1)) & (1<<14) ? 14
  : ((nr_cpu_ids_m1)) & (1<<13) ? 13
  : ((nr_cpu_ids_m1)) & (1<<12) ? 12
  : ((nr_cpu_ids_m1)) & (1<<11) ? 11  /* THIS */
  : ((nr_cpu_ids_m1)) & (1<<9)  ? 9
  : ((nr_cpu_ids_m1)) & (1<<8)  ? 8
  : ((nr_cpu_ids_m1)) & (1<<7)  ? 7
  : ((nr_cpu_ids_m1)) & (1<<6)  ? 6
  : ((nr_cpu_ids_m1)) & (1<<5)  ? 5
  : ((nr_cpu_ids_m1)) & (1<<4)  ? 4
  : ((nr_cpu_ids_m1)) & (1<<3) ? 3
  : 0)) != 11;
}

I see

./cc1 -quiet t2.c -fdump-tree-original-folding; grep 'Applying'
t2.c.005t.original  | sort -u ; grep 'Applying' t2.c.005t.original  | wc -l
Applying pattern match.pd:4778, generic-match.cc:95676
Applying pattern match.pd:5809, generic-match.cc:24025
16383

but when I just remove the marked line it becomes

./cc1 -quiet t2.c -fdump-tree-original-folding; grep 'Applying'
t2.c.005t.original  | sort -u ; grep 'Applying' t2.c.005t.original  | wc -l
Applying pattern match.pd:4778, generic-match.cc:95676
Applying pattern match.pd:5809, generic-match.cc:24025
34

hmm, and the issue is probably that we have this pattern both in
fold-const.cc and match.pd and thus when the pattern applies recursively,
like for

 (a ? (b ? d : e) : f) > g

and the original fold implementation would simplify because one of the
braches simplifes:

  /* Check that we have simplified at least one of the branches.  */
  if (!TREE_CONSTANT (arg) && !TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs))
return NULL_TREE;

so it goes all the way, recursively to "simple".  But then the match.pd
variant rejects the result because it is EXPR_P for the lhs (but not
for the rhs).  We always first try the match.pd path but then will
try fold_binary_op_with_conditional_arg anyway which makes this effectively
at least quadratic.

This was all done to avoid regressing the COND_EXPR gimple IL refactoring.
One possibility to avoid the recursion with fold-const is to disable the
match.pd pattern for GENERIC.

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-16 Thread pivanov at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

--- Comment #11 from pivanov at hotmail dot com ---
(In reply to Martin Liška from comment #9)
> Started with r13-322-g7f04b0d786e13ff5.

I do confirm that after reverting this commit there is no more hanging when
compiling (bug 106642)

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

Andrew Pinski  changed:

   What|Removed |Added

 CC||pivanov at hotmail dot com

--- Comment #10 from Andrew Pinski  ---
*** Bug 106642 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org

[Bug tree-optimization/106617] [13 Regression] gcc is very slow at ternary expressions since r13-322-g7f04b0d786e13ff5

2022-08-16 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106617

Martin Liška  changed:

   What|Removed |Added

Summary|[13 Regression] gcc is very |[13 Regression] gcc is very
   |slow at ternary |slow at ternary expressions
   |expressions,|since
   ||r13-322-g7f04b0d786e13ff5
 CC||marxin at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #9 from Martin Liška  ---
Started with r13-322-g7f04b0d786e13ff5.