>
> I was thinking on only testing against param_max_unswitch_insns?
> Given we're only unswitching on innermost predicates but we
> try to accurately cost remaining stmts on each side we could
> improve the estimate by considering the outer loop to be
> duplicated but not the innermost one, so check
>
> candidate_size - innermost_loop_size > (unsigned)
> param_max_unswitch_insns
>
> ? That said, your estimate_loop_insns is quadratic in loop depth
> (but that's bound to param_max_unswitch_depth and with this new
> size check, so possibly OK ...). In your patch you can at least
> avoid one estimate_loop_insns () call, with my proposal you'd hoist
> innermost_loop_size and keep one.
Changed and yes, *candidate_size - innermost_loop_size* is more clear.
> There is of course the possibility that the outer loops contain
> the same predicates (and we'd properly cost that I think), not
> sure how likely and how important that is to handle (in extreme
> there'll be no actual code duplication but the loop control).
> Thus instead of restricting outer loop here we could handle
> this in tree_unswitch_single_loop and handle the case
> when predicate == 0 by recursing to loop->inner (possibly
> only when we blowed budged - we'll have to restore the
> original budget then, of course).
That's a larger restructuring which could be independent of this patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
When unswitching predicates from the innermost loop, hoisting the
unswitch out to an outer loop duplicates only the outer-loop bodies;
the innermost loop is shared. Estimate the cost as
candidate_size - innermost_size and stop selecting an outer loop once
that exceeds param_max_unswitch_insns. innermost_size is hoisted out
of the walk so estimate_loop_insns is called once per level.
gcc/ChangeLog:
* tree-ssa-loop-unswitch.cc (estimate_loop_insns): New function.
(init_loop_unswitch_info): Do not select an outer loop for
unswitching when the duplicated outer-body size exceeds
param_max_unswitch_insns.
gcc/testsuite/ChangeLog:
* gcc.dg/loop-unswitch-19.c: New test.
---
gcc/testsuite/gcc.dg/loop-unswitch-19.c | 43 +++++++++++++++++++++++
gcc/tree-ssa-loop-unswitch.cc | 46 ++++++++++++++++++++++---
2 files changed, 84 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-19.c
diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-19.c
b/gcc/testsuite/gcc.dg/loop-unswitch-19.c
new file mode 100644
index 00000000000..4c84e226580
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/loop-unswitch-19.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-unswitch-optimized" } */
+
+extern volatile int sink;
+
+void
+foo (int flag, int n, int m)
+{
+ for (int i = 0; i < n; ++i)
+ {
+ /* Pad the outer loop to make it noticeably larger than the inner. */
+ sink += i;
+ sink += 2;
+ sink += 3;
+ sink += 4;
+ sink += 5;
+ sink += 6;
+ sink += 7;
+ sink += 8;
+ sink += 9;
+ sink += 10;
+ sink += 11;
+ sink += 12;
+ sink += 13;
+ sink += 14;
+ sink += 15;
+ sink += 16;
+ sink += 17;
+ sink += 18;
+ sink += 19;
+ sink += 20;
+
+ for (int j = 0; j < m; ++j)
+ if (flag)
+ sink += j;
+ }
+}
+
+/* We should still unswitch, but stay on the inner loop because the
+ outer-only body (the part that would be duplicated by hoisting the
+ unswitching) exceeds param_max_unswitch_insns. */
+/* { dg-final { scan-tree-dump "unswitching loop" "unswitch" } } */
+/* { dg-final { scan-tree-dump-not "unswitching outer loop" "unswitch" } } */
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index 96680bc64ea..644753dd7a8 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -250,6 +250,21 @@ set_predicates_for_bb (basic_block bb,
vec<unswitch_predicate *> predicates)
bb_predicates->safe_push (predicates);
}
+/* Estimate number of instructions in LOOP using eni_size_weights. */
+
+static unsigned
+estimate_loop_insns (class loop *loop)
+{
+ unsigned insns = 0;
+ basic_block *body = get_loop_body (loop);
+ for (unsigned i = 0; i < loop->num_nodes; i++)
+ for (gimple_stmt_iterator gsi = gsi_start_bb (body[i]);
+ !gsi_end_p (gsi); gsi_next (&gsi))
+ insns += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);
+ free (body);
+ return insns;
+}
+
/* Initialize LOOP information reused during the unswitching pass.
Return total number of instructions in the loop. Adjusts LOOP to
the outermost loop all candidates are invariant in. */
@@ -262,13 +277,35 @@ init_loop_unswitch_info (class loop *&loop,
unswitch_predicate *&hottest,
basic_block *bbs = get_loop_body (loop);
- /* Unswitch only nests with no sibling loops. */
+ /* Unswitch only nests with no sibling loops. Since predicates come from
+ the innermost loop, only the outer-loop bodies get duplicated by hoisting
+ the unswitching out; the innermost loop is shared in the cost estimate.
*/
class loop *outer_loop = loop;
unsigned max_depth = param_max_unswitch_depth;
+ unsigned innermost_size = estimate_loop_insns (loop);
while (loop_outer (outer_loop)->num != 0
- && !loop_outer (outer_loop)->inner->next
- && --max_depth != 0)
- outer_loop = loop_outer (outer_loop);
+ && !loop_outer (outer_loop)->inner->next)
+ {
+ if (--max_depth == 0)
+ break;
+
+ class loop *candidate = loop_outer (outer_loop);
+ unsigned candidate_size = estimate_loop_insns (candidate);
+
+ if (candidate_size - innermost_size
+ > (unsigned) param_max_unswitch_insns)
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, find_loop_location (loop),
+ "Not unswitching outer loop, duplicated size %u "
+ "exceeds max-unswitch-insns %u\n",
+ candidate_size - innermost_size,
+ (unsigned) param_max_unswitch_insns);
+ break;
+ }
+
+ outer_loop = candidate;
+ }
hottest = NULL;
hottest_bb = NULL;
/* Find all unswitching candidates in the innermost loop. */
@@ -1678,4 +1715,3 @@ make_pass_tree_unswitch (gcc::context *ctxt)
return new pass_tree_unswitch (ctxt);
}
-
--
2.34.1