[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-10-02 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #13 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 1 Oct 2013, law at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
>
> --- Comment #12 from Jeffrey A. Law <law at redhat dot com> ---
> Re: Not creating loops with multiple entries, no doubt that's bad.
>
> It would be nice however, to expose loop nesting.  ie, prior to threading it
> looks like one big fugly loop.  A bit of threading can sometimes expose a
> reasonable nested loop structure.

True, still detecting the multiple entry case is important.

>   I haven't thought much about what
> additional updating we'd need for that, but it's in the back of my mind.  Right
> now we're supposed to be rejecting these jump threading requests, but some
> might be sliding through.

The loop machinery can do most of it itself nowadays (remove no longer
existing loops and discover new loops).  But as we are preserving more
and more metadata attached to loops it is important to keep a loop
identifiable by the same loop header, or if that changes, adjust that
loop manually.

Grepping for "fix_loop_structure: removing loop" and
"flow_loops_find: discovered new loop" in the -details dumps can show
suspicious drop-and-rediscover-loop-with-different-header events.

> Re: peeling/unrolling.  Given that we don't iterate DOM anymore, there's little
> risk of completely peeling/unrolling the loop, except for loops which just
> iterate 1-3 times and are relatively small (we do have size growth limits).
> But our heuristics for when to peel vs leave it alone are trivial at best and
> could use some significant improvement.

Yeah, I think it's a matter of cost model adjustments.

Richard.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-10-01 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 30 Sep 2013, law at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553
>
> --- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
> Yes, threading is rotating the loop in interesting ways -- I was going to
> look at that independently of the correctness issue.
>
> One of the things I've noticed as I've been laying down some infrastructure for
> the FSA optimization is much of the work Zdenek did to prevent threading
> through loop headers and such isn't working as well as we'd like.

The basic rule should be that threading through loop headers is ok
if
  1) it doesn't end up creating loops with multiple entries,
  2) it doesn't effectively unroll the loop.  The size constraints
     in the threading cost model should put up a reasonable limit here,
     but as we thread multiple times we can eventually end up peeling
     N iterations.

1) is most important as we cannot handle loops with multiple entries
at all.  Peeling all loops N times is of course equally bad.

Richard.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-10-01 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Tue Oct  1 07:41:10 2013
New Revision: 203054

URL: http://gcc.gnu.org/viewcvs?rev=203054&root=gcc&view=rev
Log:
2013-10-01  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/58553
        * tree-loop-distribution.c (struct partition_s): Add niter member.
        (classify_partition): Populate niter member for the partition
        and properly identify whether the relevant store happens before
        or after the loop exit.
        (generate_memset_builtin): Use niter member from the partition.
        (generate_memcpy_builtin): Likewise.

        * gcc.dg/torture/pr58553.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr58553.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-loop-distribution.c


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-10-01 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-10-01 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #12 from Jeffrey A. Law <law at redhat dot com> ---
Re: Not creating loops with multiple entries, no doubt that's bad.

It would be nice however, to expose loop nesting.  ie, prior to threading it
looks like one big fugly loop.  A bit of threading can sometimes expose a
reasonable nested loop structure.   I haven't thought much about what
additional updating we'd need for that, but it's in the back of my mind.  Right
now we're supposed to be rejecting these jump threading requests, but some
might be sliding through.

Re: peeling/unrolling.  Given that we don't iterate DOM anymore, there's little
risk of completely peeling/unrolling the loop, except for loops which just
iterate 1-3 times and are relatively small (we do have size growth limits). 
But our heuristics for when to peel vs leave it alone are trivial at best and
could use some significant improvement.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-30 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2013-09-30
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.  Jeff, the niter code returns the number of latch executions, so adding
one is correct if the memset happens before the latch (the code doesn't try
to distinguish the two cases, but it now has to, after I removed the
restriction of distributing only loops with exactly one basic block).

I'll fix it.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-30 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase that also fails on x86_64:

#define MAX_LENGTH 96
#define SEQUENCE_LENGTH 31

static struct {
  char buf[MAX_LENGTH + 1];
} u1, u2;

extern void abort (void);

int main ()
{
  int i;
  char c;

  u1.buf[0] = '\0';
  for (i = 0, c = 'A'; i < MAX_LENGTH; i++, c++)
    {
      u1.buf[i] = 'a';
      if (c >= 'A' + SEQUENCE_LENGTH)
        c = 'A';
      u2.buf[i] = c;
    }
  if (u1.buf[MAX_LENGTH] != '\0')
    abort ();

  return 0;
}

Jump threading rotates the loop in interesting ways.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-30 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-30 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #8 from Jeffrey A. Law <law at redhat dot com> ---
Yes, threading is rotating the loop in interesting ways -- I was going to
look at that independently of the correctness issue. 

One of the things I've noticed as I've been laying down some infrastructure for
the FSA optimization is much of the work Zdenek did to prevent threading
through loop headers and such isn't working as well as we'd like.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread jgreenhalgh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #1 from jgreenhalgh at gcc dot gnu.org ---
Created attachment 30918
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30918&action=edit
Output of dom1


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #2 from Jeffrey A. Law <law at redhat dot com> ---
James.  Look in the .ldist dump.  In particular look at that memset call. 
We're writing off the end of the structure.  Now to walk backwards and figure
out why  :-)


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |58554

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This sounds like bug 58554.


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #4 from Jeffrey A. Law <law at redhat dot com> ---
Andrew.  Yes it does.  I've never looked at the ldist code, but the dump seems
a bit strange:
Analyzing # of iterations of loop 3
  exit condition [1, + , 1](no_overflow) != 96
  bounds on difference of bases: 95 ... 95
  result:
# of iterations 95, bounded by 95

  __builtin_memset (MEM[(void *)u1 + 1B], 97, 96);


So it determined the right iteration count but mucked up the count in the call
to memset ?!?  Weird


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pthaugen at gcc dot gnu.org

--- Comment #5 from Jeffrey A. Law <law at redhat dot com> ---
*** Bug 58554 has been marked as a duplicate of this bug. ***


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|DUPLICATE                   |---