Re: [PATCH] Improve DOM's optimization of control statements

2015-10-07 Thread Jeff Law

On 10/05/2015 03:02 AM, Richard Biener wrote:

On Fri, Oct 2, 2015 at 9:30 PM, Jeff Law <l...@redhat.com> wrote:

On 10/02/2015 05:15 AM, Renlin Li wrote:


Hi Jeff,

Your patch causes an ICE regression.
The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
aarch64-none-elf target when compiling the test case with '-Os' flag.

A quick check shows, the cfg has been changed, but the loop information
is not updated. Thus the information about the number of basic block in
a loop is not reliable.

Could you please have a look?


As I mentioned, when we collapse a conditional inside a loop, we may change
the # of nodes in a loop which edges are exit edges and possibly other
stuff.  So we need to mark loops as needing fixups.

Verified this fixes the aarch64-elf regression and did a bootstrap &
regression test on x86_64-linux-gnu.

Installed on the trunk.

jeff

commit 992d281b2d1ba53a49198db44fee92a505e16f5d
Author: Jeff Law <l...@tor.usersys.redhat.com>
Date:   Fri Oct 2 15:22:04 2015 -0400

 Re: [PATCH] Improve DOM's optimization of control statements

 * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
 fixups.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3f7561a..e541df3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-10-02  Jeff Law  <l...@redhat.com>
+
+   * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
+   fixups.
+
  2015-10-02  Uros Bizjak  <ubiz...@gmail.com>

 * system.h (ROUND_UP): New macro definition.
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a8b7038..d940816 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1843,6 +1843,12 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator
si,
   /* Delete threads that start at BB.  */
   remove_jump_threads_starting_at (bb);

+ /* If BB is in a loop, then removing an outgoing edge from BB
+may cause BB to move outside the loop, changes in the
+loop exit edges, etc.  So note that loops need fixing.  */
+ if (bb_loop_depth (bb) > 0)
+   loops_state_set (LOOPS_NEED_FIXUP);
+


I would rather do this in remove_ctrl_stmt_and_useless_edges and only
if taken_edge is a loop exit.  loop fixup is a pretty big hammer which
we should avoid at all cost.

So please try to be more specific on the cases you invoke it.
What's probably the most interesting is we don't actually have 
EDGE_LOOP_EXIT set in DOM.   So we can't use that, but that also implies 
it doesn't need updating.  Thankfully, there's an alternate test we can 
use instead.


And after more ponderings, I'm pretty sure the only case that's of 
concern right now is nodes moving out of the loop, which only happens 
when we have a BB with a loop exit edge and we delete the *other* edges. 
 We've got simple latches & preheaders and I don't think we destroy 
that property.   Both properties also simplify the things we need look for.


So I've minimized use of the hammer to just the case where blocks are 
going to be moving out of the loop.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu and 
verified the ARM test which originally spurred this change continues to 
work.


Installed on the trunk.

Jeff
* tree-ssa-dom.c (optimize_stmt): Don't set LOOPS_NEED_FIXUP here.
* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges): Do it
here instead.  Tighten test to avoid setting LOOPS_NEED_FIXUP 
unnecessarily.

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 941087d..38cceff 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1848,12 +1848,6 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si,
  FOR_EACH_EDGE (e, ei, bb->succs)
remove_jump_threads_including (e);
 
- /* If BB is in a loop, then removing an outgoing edge from BB
-may cause BB to move outside the loop, changes in the
-loop exit edges, etc.  So note that loops need fixing.  */
- if (bb_loop_depth (bb) > 0)
-   loops_state_set (LOOPS_NEED_FIXUP);
-
  /* Now clean up the control statement at the end of
 BB and remove unexecutable edges.  */
  remove_ctrl_stmt_and_useless_edges (bb, taken_edge->dest);
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 26b199b..e426c1d 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -300,6 +300,17 @@ remove_ctrl_stmt_and_useless_edges (basic_block bb, 
basic_block dest_bb)
   else
ei_next ();
 }
+
+  /* If the remaining edge is a loop exit, there must have
+ a removed edge that was not a loop exit.
+
+ In that case BB and possibly other blocks were previously
+ in the loop, but are now outside the loop.  Thus, we need
+ to update the loop structures. 

Re: [PATCH] Improve DOM's optimization of control statements

2015-10-07 Thread Andreas Schwab
On powerpc:

FAIL: gcc.c-torture/compile/pr52073.c   -O2  (internal compiler error)

/daten/gcc/gcc-20151006/gcc/testsuite/gcc.c-torture/compile/pr52073.c:6:1: 
internal compiler error: in duplicate_thread_path, at 
tree-ssa-threadupdate.c:2446.
0x10874ab7 duplicate_thread_path.
../../gcc/tree-ssa-threadupdate.c:2445.
0x10874ab7 thread_through_all_blocks(bool).
../../gcc/tree-ssa-threadupdate.c:2632.
0x107b0ea7 execute.
../../gcc/tree-ssa-dom.c:622.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Improve DOM's optimization of control statements

2015-10-07 Thread Jeff Law

On 10/07/2015 04:03 PM, Andreas Schwab wrote:

On powerpc:

FAIL: gcc.c-torture/compile/pr52073.c   -O2  (internal compiler error)

/daten/gcc/gcc-20151006/gcc/testsuite/gcc.c-torture/compile/pr52073.c:6:1: 
internal compiler error: in duplicate_thread_path, at 
tree-ssa-threadupdate.c:2446.
0x10874ab7 duplicate_thread_path.
../../gcc/tree-ssa-threadupdate.c:2445.
0x10874ab7 thread_through_all_blocks(bool).
../../gcc/tree-ssa-threadupdate.c:2632.
0x107b0ea7 execute.
../../gcc/tree-ssa-dom.c:622.

Update your tree.  This should have been fixed by:
2015-10-06  Jeff Law  

PR tree-optimization/67816

Jeff



Re: [PATCH] Improve DOM's optimization of control statements

2015-10-06 Thread Jeff Law

On 10/05/2015 03:02 AM, Richard Biener wrote:

+ /* If BB is in a loop, then removing an outgoing edge from BB
+may cause BB to move outside the loop, changes in the
+loop exit edges, etc.  So note that loops need fixing.  */
+ if (bb_loop_depth (bb) > 0)
+   loops_state_set (LOOPS_NEED_FIXUP);
+


I would rather do this in remove_ctrl_stmt_and_useless_edges and only
if taken_edge is a loop exit.  loop fixup is a pretty big hammer which
we should avoid at all cost.
remove_ctrl_stmt_and_useless_edges is used internally when duplicating 
blocks for jump threading, so we'd need to audit for that case as well.




So please try to be more specific on the cases you invoke it.
I'd pondered this and was still working through whether or not 
triggering only when one of the outgoing edges was "interesting" from a 
loop structure standpoint.  I hadn't tested that yet and wanted to get 
the regression resolved, hence the large hammer.



jeff



Re: [PATCH] Improve DOM's optimization of control statements

2015-10-05 Thread Richard Biener
On Fri, Oct 2, 2015 at 9:30 PM, Jeff Law <l...@redhat.com> wrote:
> On 10/02/2015 05:15 AM, Renlin Li wrote:
>>
>> Hi Jeff,
>>
>> Your patch causes an ICE regression.
>> The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
>> aarch64-none-elf target when compiling the test case with '-Os' flag.
>>
>> A quick check shows, the cfg has been changed, but the loop information
>> is not updated. Thus the information about the number of basic block in
>> a loop is not reliable.
>>
>> Could you please have a look?
>
> As I mentioned, when we collapse a conditional inside a loop, we may change
> the # of nodes in a loop which edges are exit edges and possibly other
> stuff.  So we need to mark loops as needing fixups.
>
> Verified this fixes the aarch64-elf regression and did a bootstrap &
> regression test on x86_64-linux-gnu.
>
> Installed on the trunk.
>
> jeff
>
> commit 992d281b2d1ba53a49198db44fee92a505e16f5d
> Author: Jeff Law <l...@tor.usersys.redhat.com>
> Date:   Fri Oct 2 15:22:04 2015 -0400
>
> Re: [PATCH] Improve DOM's optimization of control statements
>
> * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
> fixups.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3f7561a..e541df3 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-10-02  Jeff Law  <l...@redhat.com>
> +
> +   * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
> +   fixups.
> +
>  2015-10-02  Uros Bizjak  <ubiz...@gmail.com>
>
> * system.h (ROUND_UP): New macro definition.
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index a8b7038..d940816 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -1843,6 +1843,12 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator
> si,
>   /* Delete threads that start at BB.  */
>   remove_jump_threads_starting_at (bb);
>
> + /* If BB is in a loop, then removing an outgoing edge from BB
> +may cause BB to move outside the loop, changes in the
> +loop exit edges, etc.  So note that loops need fixing.  */
> + if (bb_loop_depth (bb) > 0)
> +   loops_state_set (LOOPS_NEED_FIXUP);
> +

I would rather do this in remove_ctrl_stmt_and_useless_edges and only
if taken_edge is a loop exit.  loop fixup is a pretty big hammer which
we should avoid at all cost.

So please try to be more specific on the cases you invoke it.

Thanks,
Richard.

>   /* Now clean up the control statement at the end of
>  BB and remove unexecutable edges.  */
>   remove_ctrl_stmt_and_useless_edges (bb, taken_edge->dest);
>


Re: [PATCH] Improve DOM's optimization of control statements

2015-10-02 Thread Renlin Li

Hi Jeff,

Your patch causes an ICE regression.
The test case is " gcc.c-torture/compile/pr27087.c", I observed it on 
aarch64-none-elf target when compiling the test case with '-Os' flag.


A quick check shows, the cfg has been changed, but the loop information 
is not updated. Thus the information about the number of basic block in 
a loop is not reliable.


Could you please have a look?

Regards,
Renlin

On 30/09/15 21:28, Jeff Law wrote:

Until now DOM has had to be very conservative with handling control
statements with known conditions.  This as been an unfortunate side
effect of the interaction between removing edges and recycling names via
the SSA_NAME manager.

Essentially DOM would have to leave control statements alone.  So you'd
see stuff like

if (0 == 0)

left around by DOM.  The jump threader would thankfully come along and
optimize that as a jump thread.  But that's terribly inefficient, not to
mention it creates unnecessary churn in the CFG and SSA_NAMEs.

By optimizing that directly in DOM, including removing whatever edges
are not executable, we no longer have to rely on jump threading to
handle that case.  Less churn in the CFG & SSA_NAMEs.   There's also
some chance for secondary optimizations with fewer edges left in the CFG
for DOM to consider.

Unfortunately, the churn caused by jump threading made it excessively
difficult to analyze before/after dumps.  Sadly, you can have the same
code, but if the SSA_NAMEs have changed, that impacts coalescing as we
leave SSA.  Churn in the CFG changes labels/jumps, often without
changing the actual structure, etc.

I did some tests with valgrind to evaluate branching behaviour
before/after effects on the resulting code and those effects were tiny,
in the I doubt you could measure them range.  That was expected since
what we're really doing here is just capturing the optimization earlier.

I had a couple more tests, but they were lost in a bit of idiocy.  The
test included is the one I had a second copy of lying around.

Bootstrapped and regression tested on x86_64-linux-gnu.  Installed on
the trunk.

Jeff




Re: [PATCH] Improve DOM's optimization of control statements

2015-10-02 Thread Jeff Law

On 10/02/2015 05:15 AM, Renlin Li wrote:

Hi Jeff,

Your patch causes an ICE regression.
The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
aarch64-none-elf target when compiling the test case with '-Os' flag.

A quick check shows, the cfg has been changed, but the loop information
is not updated. Thus the information about the number of basic block in
a loop is not reliable.

Could you please have a look?

Yup.  Will do.   Thanks for letting me know.

jeff



Re: [PATCH] Improve DOM's optimization of control statements

2015-10-02 Thread Jeff Law

On 10/02/2015 05:15 AM, Renlin Li wrote:

Hi Jeff,

Your patch causes an ICE regression.
The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
aarch64-none-elf target when compiling the test case with '-Os' flag.

A quick check shows, the cfg has been changed, but the loop information
is not updated. Thus the information about the number of basic block in
a loop is not reliable.

Could you please have a look?
Appears to be pretty simple.  If we collapse a conditional inside a 
loop, then we need to set the loop state as needing fixups.


In this specific case we have a conditional where one path eventually 
leads back to the loop latch block, the other path exits the loop.  We 
statically determine the conditional will always take us to the loop exit.


That has the side effect of making the block with the collapsed 
conditional no longer part of the loop, it's actually part of the exit 
path.  That changes the number of nodes in the loop, what edge(s) are 
the exit path(s) and possibly other stuff.


I'm running a fix through testing now.

Thanks,
jeff



Re: [PATCH] Improve DOM's optimization of control statements

2015-10-02 Thread Jeff Law

On 10/02/2015 05:15 AM, Renlin Li wrote:

Hi Jeff,

Your patch causes an ICE regression.
The test case is " gcc.c-torture/compile/pr27087.c", I observed it on
aarch64-none-elf target when compiling the test case with '-Os' flag.

A quick check shows, the cfg has been changed, but the loop information
is not updated. Thus the information about the number of basic block in
a loop is not reliable.

Could you please have a look?
As I mentioned, when we collapse a conditional inside a loop, we may 
change the # of nodes in a loop which edges are exit edges and possibly 
other stuff.  So we need to mark loops as needing fixups.


Verified this fixes the aarch64-elf regression and did a bootstrap & 
regression test on x86_64-linux-gnu.


Installed on the trunk.

jeff
commit 992d281b2d1ba53a49198db44fee92a505e16f5d
Author: Jeff Law <l...@tor.usersys.redhat.com>
Date:   Fri Oct 2 15:22:04 2015 -0400

    Re: [PATCH] Improve DOM's optimization of control statements

* tree-ssa-dom.c (optimize_stmt): Note when loop structures need
fixups.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3f7561a..e541df3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-10-02  Jeff Law  <l...@redhat.com>
+
+   * tree-ssa-dom.c (optimize_stmt): Note when loop structures need
+   fixups.
+
 2015-10-02  Uros Bizjak  <ubiz...@gmail.com>
 
* system.h (ROUND_UP): New macro definition.
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index a8b7038..d940816 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1843,6 +1843,12 @@ optimize_stmt (basic_block bb, gimple_stmt_iterator si,
  /* Delete threads that start at BB.  */
  remove_jump_threads_starting_at (bb);
 
+ /* If BB is in a loop, then removing an outgoing edge from BB
+may cause BB to move outside the loop, changes in the
+loop exit edges, etc.  So note that loops need fixing.  */
+ if (bb_loop_depth (bb) > 0)
+   loops_state_set (LOOPS_NEED_FIXUP);
+
  /* Now clean up the control statement at the end of
 BB and remove unexecutable edges.  */
  remove_ctrl_stmt_and_useless_edges (bb, taken_edge->dest);