[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org --- Fixed.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #16 from Pat Haugen pthaugen at gcc dot gnu.org --- I tried the patch from Comment 15 and was able to build/run the benchmark successfully.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #17 from Jan Hubicka hubicka at gcc dot gnu.org --- Author: hubicka Date: Thu Feb 6 07:39:24 2014 New Revision: 207529 URL: http://gcc.gnu.org/viewcvs?rev=207529root=gccview=rev Log: PR middle-end/60013 * ipa-inline-analysis.c (compute_bb_predicates): Ensure monotonicity of the dataflow. * gcc.dg/pr60013.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/pr60013.c Modified: trunk/gcc/ChangeLog trunk/gcc/ipa-inline-analysis.c trunk/gcc/testsuite/ChangeLog
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #12 from Jakub Jelinek jakub at gcc dot gnu.org --- Slightly more reduced testcase for -O2: typedef long int jmp_buf[64]; extern int _setjmp (jmp_buf) __attribute__ ((__nothrow__)); struct S { int a, b, c; }; extern struct S *baz (struct S *); static jmp_buf j; static inline int bar (int b, int d) { return (b d) 0; } struct S * foo (int a, struct S *b, struct S *c, struct S *d) { if (b-a == 0) { switch (a) { case 8: return baz (b); case 7: bar (b-c, c-b); return 0; case 6: case 5: case 4: return baz (c); case 3: case 2: return baz (d); } return 0; } if (b-a == 1) { if (baz (c)) return c; else if (_setjmp (j)) baz (b); } return 0; }
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #13 from Jakub Jelinek jakub at gcc dot gnu.org --- Ok, here is what happens on the reduced testcase. We have 9 dynamic conditions and 14 basic blocks. Let's use letters for the dynamic conditions: A (op1[ref offset: 0] == 0) B (op1[ref offset: 0] != 0) C (op0 = 2) D (op0 = 3) E (op0 = 4) F (op0 = 6) G (op0 == 8) H (op1[ref offset: 0] == 1) I (op1[ref offset: 0] != 1) Most of the basic blocks have only a single predecessor and aren't part of any kind of loop, so they have fixed set of predicates after first iteration, and bb13 has one pred edge (the default: edge of switch) with true predicate, so it is always true in the end, so the only really interesting basic blocks are bb10, bb11 (ABNORMAL_DISPATCHER) and bb12. The preds of bb10 are bb11 and bb9, preds of bb11 are bb4, bb5, bb6 and bb12 and pred of bb12 is bb10. The predicates for the bbs that really don't change after first iterations are: bb4 GA bb5 FEA bb6 DCA bb9 HB and the problematic basic blocks are always processed in increasing bb-index, i.e. bb10, bb11, bb12. So bb10 get's HB. Then bb11 is processed, bb12 still has no predicates computed, and the DNF (GA)|(FEA)|(DCA)|(HB) is being converted into CNF, but there is a cap on the number of s (8), thus we don't compute the whole CNF (which would need to have 12 M|N|O|P style operands, 4 M|N|O style operands and 1 M|N style operand, i.e. in total 17?), but we pick just 8 operands from this (the ones with largest values of the bitmask), so (H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B). Next time bb12 is processed and it's predicate is set to HB. On the next big round, nothing but these 3 bbs change, bb10 is set to the same predicate as bb11 previously, the oring of HB predicate into this doesn't change anything, either H or B condition is already in all the predicates. Then bb11 is processed, and unlike previous case there is now one additional executable edge with HB predicate (the same as from bb8), but this time the computed (incomplete) set of predicate is almost like the previous one, but does contain additionally (H|A) term and (G|E|C|B) term fell out, as there was no space for it. Then bb12 again copies bb10 predicate, the one without (H|A) and with (G|E|C|B). Then next time bb10 is processed again, and (HB) is being ored with the current bb11 predicate, including (H|A), excluding (G|E|C|B), the result is the same as that bb11 predicate. Then bb11 is processed and as a result get's the old big predicate, the one with (G|E|C|B) term and without (H|A) and that is the start of the oscillation. The interesting thing is e.g. that the (H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B) ored with (HB) gives the (H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(H|A)(G|F|D|B)(G|F|C|B)(G|E|D|B) result, while (HB) ored with (H|G|F|D)(H|G|F|C)(H|G|E|D)(H|G|E|C)(G|F|D|B)(G|F|C|B)(G|E|D|B)(G|E|C|B) gives the second operand. So, IMHO easiest fix would be just to punt whenever we need during or_predicate more than 8 terms - turn it into a true_predicate, then we will get guaranteed termination, the drawback would be that in some cases we could have more true predicates than previously. Or we could e.g. always compute the or in infinite precision (or say with say 256 terms and punt into true predicate if we reach that) and only at the end of processing some particular bb pick some subset of at most 8 bitmasks some way (either prefer the highest values of the bitmask ever appearing there, or say prefer terms that were already present among the predicates on the bb previously (if any) and only pick the highest values otherwise, etc.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #14 from Jakub Jelinek jakub at gcc dot gnu.org --- Important thing I forgot to mention is the order or predecessors, bb10 has first bb9 and then bb11, bb11 has first bb4, then bb12, then bb5, bb6, bb8. The oscillation is there really just because of the first and second processing of bb11, where first time we've ored {0x100,4,0}, bb12 wasn't processed yet, {0x80,0x40,4,0}, {0x20,0x10,4,0}, {0x200,8,0} and the second time: {0x100,4,0}, {0x200,8,0}, {0x80,0x40,4,0}, {0x20,0x10,4,0}, {0x200,8,0} {0x100,4,0} So the result was first time {0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0}, second time {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0}. {0x200,8,0} | {0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0} gives the second operand. {0x100,4,0} | {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0} gives also the second operand and oring into it {0x80,0x40,4,0}, {0x20,0x10,4,0} and {0x200,8,0} doesn't change anything. {0x200,8,0} | {0x3a0,0x390,0x360,0x350,0x204,0x1a8,0x198,0x168,0} gives the second operand. {0x100,4,0} | {0x3a0,0x390,0x360,0x350,0x1a8,0x198,0x168,0x158,0} gives the second operand and oring into it {0x80,0x40,4,0}, {0x20,0x10,4,0} and {0x200,8,0} doesn't change anything. The oscillation is there because we always get the one before previous iteration's predicate of bb10 through bb12.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #15 from Jakub Jelinek jakub at gcc dot gnu.org --- Untested patch to turn things into true predicate when running out of space is: --- gcc/ipa-inline-analysis.c.jj2014-02-03 08:53:12.0 +0100 +++ gcc/ipa-inline-analysis.c2014-02-05 08:01:57.093878954 +0100 @@ -310,7 +310,7 @@ add_clause (conditions conditions, struc if (false_predicate_p (p)) return; - /* No one should be sily enough to add false into nontrivial clauses. */ + /* No one should be silly enough to add false into nontrivial clauses. */ gcc_checking_assert (!(clause (1 predicate_false_condition))); /* Look where to insert the clause. At the same time prune out @@ -372,9 +372,13 @@ add_clause (conditions conditions, struc } - /* We run out of variants. Be conservative in positive direction. */ + /* We run out of variants. Conservatively turn clause into true + predicate. */ if (i2 == MAX_CLAUSES) -return; +{ + p-clause[0] = 0; + return; +} /* Keep clauses in decreasing order. This makes equivalence testing easy. */ p-clause[i2 + 1] = 0; if (insert_here = 0) @@ -412,6 +416,8 @@ and_predicates (conditions conditions, { gcc_checking_assert (i MAX_CLAUSES); add_clause (conditions, out, p2-clause[i]); + if (true_predicate_p (out)) +break; } return out; } @@ -459,6 +465,8 @@ or_predicates (conditions conditions, { gcc_checking_assert (i MAX_CLAUSES j MAX_CLAUSES); add_clause (conditions, out, p-clause[i] | p2-clause[j]); +if (true_predicate_p (out)) + return out; } return out; } @@ -1035,7 +1043,7 @@ inline_node_removal_hook (struct cgraph_ memset (info, 0, sizeof (inline_summary_t)); } -/* Remap predicate P of former function to be predicate of duplicated functoin. +/* Remap predicate P of former function to be predicate of duplicated function. POSSIBLE_TRUTHS is clause of possible truths in the duplicated node, INFO is inline summary of the duplicated node. */ and fixes the testcase. Yet another possibility would be to keep add_clause/{and,or}_predicates as is and just add comment that the oscillation is possible due to using only a subset of the predicates, and during compute_bb_predicates just wait until all basic blocks have bb-aux set, then do a couple of iterations of the main loop after that and finally either just terminate, or change the oscillating bb's into true predicate.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 Richard Biener rguenth at gcc dot gnu.org changed: What|Removed |Added Priority|P3 |P1
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #9 from H.J. Lu hjl.tools at gmail dot com --- Created attachment 32014 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32014action=edit A patch I am testing this patch.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 H.J. Lu hjl.tools at gmail dot com changed: What|Removed |Added Attachment #32014|0 |1 is obsolete|| --- Comment #10 from H.J. Lu hjl.tools at gmail dot com --- Created attachment 32015 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32015action=edit An updated patch
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #11 from H.J. Lu hjl.tools at gmail dot com --- A patch is posted at http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00035.html
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #2 from H.J. Lu hjl.tools at gmail dot com --- Still unfixed with r207387.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 H.J. Lu hjl.tools at gmail dot com changed: What|Removed |Added Attachment #32006|0 |1 is obsolete|| --- Comment #3 from H.J. Lu hjl.tools at gmail dot com --- Created attachment 32011 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32011action=edit A smaller testcase It hangs at -O2 on Linux/x86-64.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org --- Seems it is oscillating, but I don't see anything wrong from quick look at the IL. So, if the algorithm doesn't have guaranteed termination, it should just have some maximum number of iteration and give up if it doesn't terminate till then.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #5 from H.J. Lu hjl.tools at gmail dot com --- This in compute_bb_predicates while (!done) { done = true; FOR_EACH_BB_FN (bb, my_function) { struct predicate p = false_predicate (); edge e; edge_iterator ei; FOR_EACH_EDGE (e, ei, bb-preds) { if (e-src-aux) { struct predicate this_bb_predicate = *(struct predicate *) e-src-aux; if (e-aux) this_bb_predicate = and_predicates (summary-conds, this_bb_predicate, (struct predicate *) e-aux); p = or_predicates (summary-conds, p, this_bb_predicate); if (true_predicate_p (p)) break; } } if (false_predicate_p (p)) gcc_assert (!bb-aux); else { if (!bb-aux) { done = false; Set to false on (gdb) call debug_bb (bb) bb 13: ABNORMAL_DISPATCHER (0); Should it call get_abnormal_succ_dispatcher to check here? bb-aux = pool_alloc (edge_predicate_pool); *((struct predicate *) bb-aux) = p; } else if (!predicates_equal_p (p, (struct predicate *) bb-aux)) { done = false; *((struct predicate *) bb-aux) = p; } } } } goes into an infinite loop.
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #6 from H.J. Lu hjl.tools at gmail dot com --- Like this: diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c index 9a4c6df..991a10b 100644 --- a/gcc/ipa-inline-analysis.c +++ b/gcc/ipa-inline-analysis.c @@ -1881,9 +1881,12 @@ compute_bb_predicates (struct cgraph_node *node, { if (!bb-aux) { - done = false; - bb-aux = pool_alloc (edge_predicate_pool); - *((struct predicate *) bb-aux) = p; + if (!get_abnormal_succ_dispatcher (bb)) +{ + done = false; + bb-aux = pool_alloc (edge_predicate_pool); + *((struct predicate *) bb-aux) = p; +} } else if (!predicates_equal_p (p, (struct predicate *) bb-aux)) {
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 H.J. Lu hjl.tools at gmail dot com changed: What|Removed |Added Attachment #32011|0 |1 is obsolete|| --- Comment #7 from H.J. Lu hjl.tools at gmail dot com --- Created attachment 32012 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=32012action=edit Even smaller testcase
[Bug middle-end/60013] [4.9 Regression] Build of 176.gcc from CPU2000 loops in cc1 starting with r207231
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60013 --- Comment #8 from Jakub Jelinek jakub at gcc dot gnu.org --- That doesn't look correct, that will not clear done on any basic block that has abnormal successor with ABNORMAL_DISPATCHER, which is a lot of basic blocks. If the algorithm isn't supposed to follow abnormal edges, then it shouldn't follow them, not have some hack for the abnormal dispatcher.