[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-06 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org ---
Fixed.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-05 Thread pthaugen at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #16 from Pat Haugen pthaugen at gcc dot gnu.org ---
I tried the patch from Comment 15 and was able to build/run the benchmark
successfully.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-05 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #17 from Jan Hubicka hubicka at gcc dot gnu.org ---
Author: hubicka
Date: Thu Feb  6 07:39:24 2014
New Revision: 207529

URL: http://gcc.gnu.org/viewcvs?rev=207529root=gccview=rev
Log:

PR middle-end/60013
* ipa-inline-analysis.c (compute_bb_predicates): Ensure monotonicity
of the dataflow.
* gcc.dg/pr60013.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/pr60013.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-inline-analysis.c
trunk/gcc/testsuite/ChangeLog


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #12 from Jakub Jelinek jakub at gcc dot gnu.org ---
Slightly more reduced testcase for -O2:
typedef long int jmp_buf[64];
extern int _setjmp (jmp_buf) __attribute__ ((__nothrow__));
struct S { int a, b, c; };
extern struct S *baz (struct S *);
static jmp_buf j;

static inline int
bar (int b, int d)
{
  return (b  d)  0;
}

struct S *
foo (int a, struct S *b, struct S *c, struct S *d)
{
  if (b-a == 0)
{
  switch (a)
{
case 8:
  return baz (b);
case 7:
  bar (b-c, c-b);
  return 0;
case 6:
case 5:
case 4:
  return baz (c);
case 3:
case 2:
  return baz (d);
}
  return 0;
}
  if (b-a == 1)
{
  if (baz (c))
return c;
  else if (_setjmp (j))
baz (b);
}
  return 0;
}


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #13 from Jakub Jelinek jakub at gcc dot gnu.org ---
Ok, here is what happens on the reduced testcase.  We have 9 dynamic conditions
and 14 basic blocks.  Let's use letters for the dynamic conditions:
A (op1[ref offset: 0] == 0)
B (op1[ref offset: 0] != 0)
C (op0 = 2)
D (op0 = 3)
E (op0 = 4)
F (op0 = 6)
G (op0 == 8)
H (op1[ref offset: 0] == 1)
I (op1[ref offset: 0] != 1)

Most of the basic blocks have only a single predecessor and aren't part of any
kind of loop, so they have fixed set of predicates after first iteration, and
bb13 has one pred edge (the default: edge of switch) with true predicate, so it
is always true in the end, so the only really interesting basic blocks are
bb10, bb11 (ABNORMAL_DISPATCHER) and bb12.  The preds of bb10 are bb11 and bb9,
preds of bb11 are bb4, bb5, bb6 and bb12 and pred of bb12 is bb10.  The
predicates for
the bbs that really don't change after first iterations are:
bb4 GA
bb5 FEA
bb6 DCA
bb9 HB
and the problematic basic blocks are always processed in increasing bb-index,
i.e. bb10, bb11, bb12.  So bb10 get's HB.  Then bb11 is processed, bb12 still
has no predicates computed, and the DNF (GA)|(FEA)|(DCA)|(HB) is being
converted into CNF, but there is a cap on the number of s (8), thus we don't
compute the whole CNF (which would need to have 12 M|N|O|P style operands, 4
M|N|O style operands and 1 M|N style operand, i.e. in total 17?), but we pick
just 8 operands from this (the ones with largest values of the bitmask), so
(H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B).
Next time bb12 is processed and it's predicate is set to HB.
On the next big round, nothing but these 3 bbs change, bb10 is set to
the same predicate as bb11 previously, the oring of HB predicate into this
doesn't change anything, either H or B condition is already in all the
predicates.  Then bb11 is processed, and unlike previous case there is now one
additional executable edge with HB predicate (the same as from bb8), but this
time the computed (incomplete) set of predicate is almost like the previous
one, but does contain additionally (H|A) term and (G|E|C|B) term fell out, as
there was no space for it.  Then bb12 again copies bb10 predicate, the one
without (H|A) and with (G|E|C|B).  Then next time bb10 is processed again,
and (HB) is being ored with the current bb11 predicate, including (H|A),
excluding (G|E|C|B), the result is the same as that bb11 predicate.  Then bb11
is processed and as a result get's the old big predicate, the one with
(G|E|C|B) term and without (H|A) and that is the start of the oscillation.

The interesting thing is e.g. that the
(H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B)
ored with (HB) gives the
(H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(H|A)(G|F|D|B)(G|F|C|B)(G|E|D|B)
result, while (HB) ored with
(H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B)
gives the second operand.

So, IMHO easiest fix would be just to punt whenever we need during or_predicate
more than 8 terms - turn it into a true_predicate, then we will get guaranteed
termination, the drawback would be that in some cases we could have more true
predicates than previously.

Or we could e.g. always compute the or in infinite precision (or say with say
256 terms and punt into true predicate if we reach that) and only at the end of
processing some particular bb pick some subset of at most 8 bitmasks some way
(either prefer the highest values of the bitmask ever appearing there, or say
prefer terms that were already present among the predicates on the bb
previously (if any) and only pick the highest values otherwise, etc.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #14 from Jakub Jelinek jakub at gcc dot gnu.org ---
Important thing I forgot to mention is the order or predecessors, bb10 has
first bb9 and then bb11, bb11 has first bb4, then bb12, then bb5, bb6, bb8.

The oscillation is there really just because of the first and second processing
of bb11, where first time we've ored
{0x100,4,0}, bb12 wasn't processed yet, {0x80,0x40,4,0}, {0x20,0x10,4,0},
{0x200,8,0} and the second time:
{0x100,4,0}, {0x200,8,0}, {0x80,0x40,4,0}, {0x20,0x10,4,0}, {0x200,8,0}
{0x100,4,0}
So the result was first time
{0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0},
second time {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0}.

{0x200,8,0} | {0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0}
gives the second operand.
{0x100,4,0} | {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0}
gives also the second operand and oring into it {0x80,0x40,4,0},
{0x20,0x10,4,0}
and {0x200,8,0} doesn't change anything.
{0x200,8,0} | {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0}
gives the second operand.
{0x100,4,0} | {0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0}
gives the second operand and oring into it {0x80,0x40,4,0}, {0x20,0x10,4,0}
and {0x200,8,0} doesn't change anything.

The oscillation is there because we always get the one before previous
iteration's predicate of bb10 through bb12.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #15 from Jakub Jelinek jakub at gcc dot gnu.org ---
Untested patch to turn things into true predicate when running out of space is:
--- gcc/ipa-inline-analysis.c.jj2014-02-03 08:53:12.0 +0100
+++ gcc/ipa-inline-analysis.c2014-02-05 08:01:57.093878954 +0100
@@ -310,7 +310,7 @@ add_clause (conditions conditions, struc
   if (false_predicate_p (p))
 return;

-  /* No one should be sily enough to add false into nontrivial clauses.  */
+  /* No one should be silly enough to add false into nontrivial clauses.  */
   gcc_checking_assert (!(clause  (1  predicate_false_condition)));

   /* Look where to insert the clause.  At the same time prune out
@@ -372,9 +372,13 @@ add_clause (conditions conditions, struc
 }


-  /* We run out of variants.  Be conservative in positive direction.  */
+  /* We run out of variants.  Conservatively turn clause into true
+ predicate.  */
   if (i2 == MAX_CLAUSES)
-return;
+{
+  p-clause[0] = 0;
+  return;
+}
   /* Keep clauses in decreasing order. This makes equivalence testing easy. 
*/
   p-clause[i2 + 1] = 0;
   if (insert_here = 0)
@@ -412,6 +416,8 @@ and_predicates (conditions conditions,
 {
   gcc_checking_assert (i  MAX_CLAUSES);
   add_clause (conditions, out, p2-clause[i]);
+  if (true_predicate_p (out))
+break;
 }
   return out;
 }
@@ -459,6 +465,8 @@ or_predicates (conditions conditions,
   {
 gcc_checking_assert (i  MAX_CLAUSES  j  MAX_CLAUSES);
 add_clause (conditions, out, p-clause[i] | p2-clause[j]);
+if (true_predicate_p (out))
+  return out;
   }
   return out;
 }
@@ -1035,7 +1043,7 @@ inline_node_removal_hook (struct cgraph_
   memset (info, 0, sizeof (inline_summary_t));
 }

-/* Remap predicate P of former function to be predicate of duplicated
functoin.
+/* Remap predicate P of former function to be predicate of duplicated
function.
POSSIBLE_TRUTHS is clause of possible truths in the duplicated node,
INFO is inline summary of the duplicated node.  */

and fixes the testcase.  Yet another possibility would be to keep
add_clause/{and,or}_predicates as is and just add comment that the oscillation
is possible due to using only a subset of the predicates, and during
compute_bb_predicates just wait until all basic blocks have bb-aux set, then
do a couple of iterations of the main loop after that and finally either just
terminate, or change the oscillating bb's into true predicate.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-03 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P1


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-02 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #9 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 32014
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32014action=edit
A patch

I am testing this patch.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-02 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #32014|0   |1
is obsolete||

--- Comment #10 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 32015
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32015action=edit
An updated patch


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-02 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #11 from H.J. Lu hjl.tools at gmail dot com ---
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00035.html


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #2 from H.J. Lu hjl.tools at gmail dot com ---
Still unfixed with r207387.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #32006|0   |1
is obsolete||

--- Comment #3 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 32011
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32011action=edit
A smaller testcase

It hangs at -O2 on Linux/x86-64.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Seems it is oscillating, but I don't see anything wrong from quick look at the
IL.  So, if the algorithm doesn't have guaranteed termination, it should just
have some maximum number of iteration and give up if it doesn't terminate till
then.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #5 from H.J. Lu hjl.tools at gmail dot com ---
This in compute_bb_predicates

 while (!done)
{
  done = true;
  FOR_EACH_BB_FN (bb, my_function)
{
  struct predicate p = false_predicate ();
  edge e;
  edge_iterator ei;
  FOR_EACH_EDGE (e, ei, bb-preds)
{
  if (e-src-aux)
{
  struct predicate this_bb_predicate
= *(struct predicate *) e-src-aux;
  if (e-aux)
this_bb_predicate
  = and_predicates (summary-conds, this_bb_predicate,
(struct predicate *) e-aux);
  p = or_predicates (summary-conds, p, this_bb_predicate);
  if (true_predicate_p (p))
break;
}
}
  if (false_predicate_p (p))
gcc_assert (!bb-aux);
  else
{
  if (!bb-aux)
{
  done = false;

Set to false on

(gdb) call debug_bb (bb)
bb 13:
ABNORMAL_DISPATCHER (0);

Should it call get_abnormal_succ_dispatcher to check here?

  bb-aux = pool_alloc (edge_predicate_pool);
  *((struct predicate *) bb-aux) = p; 
}
  else if (!predicates_equal_p (p, (struct predicate *) bb-aux))
{
  done = false;
  *((struct predicate *) bb-aux) = p; 
}
}
}
}

goes into an infinite loop.


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #6 from H.J. Lu hjl.tools at gmail dot com ---
Like this:

diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 9a4c6df..991a10b 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -1881,9 +1881,12 @@ compute_bb_predicates (struct cgraph_node *node,
 {
   if (!bb-aux)
 {
-  done = false;
-  bb-aux = pool_alloc (edge_predicate_pool);
-  *((struct predicate *) bb-aux) = p;
+  if (!get_abnormal_succ_dispatcher (bb))
+{
+  done = false;
+  bb-aux = pool_alloc (edge_predicate_pool);
+  *((struct predicate *) bb-aux) = p;
+}
 }
   else if (!predicates_equal_p (p, (struct predicate *) bb-aux))
 {


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #32011|0   |1
is obsolete||

--- Comment #7 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 32012
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32012action=edit
Even smaller testcase


[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231

2014-02-01 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013

--- Comment #8 from Jakub Jelinek jakub at gcc dot gnu.org ---
That doesn't look correct, that will not clear done on any basic block that has
abnormal successor with ABNORMAL_DISPATCHER, which is a lot of basic blocks. 
If the algorithm isn't supposed to follow abnormal edges, then it shouldn't
follow them, not have some hack for the abnormal dispatcher.