[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #13 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 1 Oct 2013, law at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
>
> --- Comment #12 from Jeffrey A. Law <law at redhat dot com> ---
> Re: Not creating loops with multiple entries, no doubt that's bad.  It
> would be nice, however, to expose loop nesting.  I.e., prior to threading
> it looks like one big fugly loop.  A bit of threading can sometimes
> expose a reasonable nested loop structure.

True, still detecting the multiple entry case is important.

> I haven't thought much about what additional updating we'd need for
> that, but it's in the back of my mind.  Right now we're supposed to be
> rejecting these jump threading requests, but some might be sliding
> through.

The loop machinery can do most of it itself nowadays (remove no longer
existing loops and discover new loops).  But as we are preserving more
and more metadata attached to loops, it is important to keep a loop
identifiable by the same loop header, or if that changes, adjust that
loop manually.

Grepping for "fix_loop_structure: removing loop" and
"flow_loops_find: discovered new loop" in the -details dumps can show
suspicious drop-and-rediscover-loop-with-different-header events.

> Re: peeling/unrolling.  Given that we don't iterate DOM anymore, there's
> little risk of completely peeling/unrolling the loop, except for loops
> which just iterate 1-3 times and are relatively small (we do have size
> growth limits).  But our heuristics for when to peel vs leave it alone
> are trivial at best and could use some significant improvement.

Yeah, I think it's a matter of cost model adjustments.

Richard.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 30 Sep 2013, law at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
>
> --- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
> Yes, threading is rotating the loop in interesting ways -- I was going
> to look at that independently of the correctness issue.
>
> One of the things I've noticed as I've been laying down some
> infrastructure for the FSA optimization is much of the work Zdenek did
> to prevent threading through loop headers and such isn't working as
> well as we'd like.

The basic rule should be that threading through loop headers is ok if

 1) it doesn't end up creating loops with multiple entries,
 2) it doesn't effectively unroll the loop.  The size constraints in
    the threading cost model should put a reasonable limit on this, but
    as we are threading multiple times we eventually ended up peeling
    N iterations.

1) is most important, as we cannot handle loops with multiple entries
at all.  Peeling all loops N times is of course equally bad.

Richard.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Tue Oct  1 07:41:10 2013
New Revision: 203054

URL: http://gcc.gnu.org/viewcvs?rev=203054&root=gcc&view=rev
Log:
2013-10-01  Richard Biener  rguent...@suse.de

    PR tree-optimization/58553
    * tree-loop-distribution.c (struct partition_s): Add niter member.
    (classify_partition): Populate niter member for the partition
    and properly identify whether the relevant store happens before
    or after the loop exit.
    (generate_memset_builtin): Use niter member from the partition.
    (generate_memcpy_builtin): Likewise.

    * gcc.dg/torture/pr58553.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr58553.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-loop-distribution.c
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #12 from Jeffrey A. Law <law at redhat dot com> ---
Re: Not creating loops with multiple entries, no doubt that's bad.  It would be nice, however, to expose loop nesting.  I.e., prior to threading it looks like one big fugly loop.  A bit of threading can sometimes expose a reasonable nested loop structure.

I haven't thought much about what additional updating we'd need for that, but it's in the back of my mind.  Right now we're supposed to be rejecting these jump threading requests, but some might be sliding through.

Re: peeling/unrolling.  Given that we don't iterate DOM anymore, there's little risk of completely peeling/unrolling the loop, except for loops which just iterate 1-3 times and are relatively small (we do have size growth limits).  But our heuristics for when to peel vs leave it alone are trivial at best and could use some significant improvement.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2013-09-30
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.

Jeff, the niter code returns the number of latch executions, so adding one is correct if the memset happens before the latch.  The code doesn't try to distinguish the two cases, but it now has to, after I removed the restriction of distributing only loops with exactly a single basic block.

I'll fix it.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase that also fails on x86_64:

    #define MAX_LENGTH 96
    #define SEQUENCE_LENGTH 31

    static struct {
      char buf[MAX_LENGTH + 1];
    } u1, u2;

    extern void abort (void);

    int main ()
    {
      int i;
      char c;

      u1.buf[0] = '\0';
      for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++)
        {
          u1.buf[i] = 'a';
          if (c >= 'A' + SEQUENCE_LENGTH)
            c = 'A';
          u2.buf[i] = c;
        }
      if (u1.buf[MAX_LENGTH] != '\0')
        abort ();

      return 0;
    }

jump threading rotates the loop in interesting ways.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
Yes, threading is rotating the loop in interesting ways -- I was going to look at that independently of the correctness issue.

One of the things I've noticed as I've been laying down some infrastructure for the FSA optimization is much of the work Zdenek did to prevent threading through loop headers and such isn't working as well as we'd like.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #1 from jgreenhalgh at gcc dot gnu.org ---
Created attachment 30918
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30918&action=edit
Output of dom1
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #2 from Jeffrey A. Law <law at redhat dot com> ---
James.  Look in the .ldist dump.  In particular look at that memset call.  We're writing off the end of the structure.

Now to walk backwards and figure out why :-)
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |58554

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This sounds like bug 58554.
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #4 from Jeffrey A. Law <law at redhat dot com> ---
Andrew.  Yes it does.  I've never looked at the ldist code, but the dump seems a bit strange:

    Analyzing # of iterations of loop 3
      exit condition [1, + , 1](no_overflow) != 96
      bounds on difference of bases: 95 ... 95
      result:
        # of iterations 95, bounded by 95

    __builtin_memset (&MEM[(void *)&u1 + 1B], 97, 96);

So it determined the right iteration count but mucked up the count in the call to memset ?!?  Weird
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pthaugen at gcc dot gnu.org

--- Comment #5 from Jeffrey A. Law <law at redhat dot com> ---
*** Bug 58554 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE
[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|DUPLICATE                   |---