On Mon, May 18, 2026 at 2:39 PM liuhongt <[email protected]> wrote:
>
> >
> > I was thinking on only testing against param_max_unswitch_insns?
> > Given we're only unswitching on innermost predicates but we
> > try to accurately cost remaining stmts on each side we could
> > improve the estimate by considering the outer loop to be
> > duplicated but not the innermost one, so check
> >
> > candidate_size - innermost_loop_size > (unsigned)
> > param_max_unswitch_insns
> >
> > ? That said, your estimate_loop_insns is quadratic in loop depth
> > (but that's bound to param_max_unswitch_depth and with this new
> > size check, so possibly OK ...). In your patch you can at least
> > avoid one estimate_loop_insns () call, with my proposal you'd hoist
> > innermost_loop_size and keep one.
>
> Changed and yes, *candidate_size - innermost_loop_size* is more clear.
>
> > There is of course the possibility that the outer loops contain
> > the same predicates (and we'd properly cost that I think), not
> > sure how likely and how important that is to handle (in extreme
> > there'll be no actual code duplication but the loop control).
> > Thus instead of restricting outer loop here we could handle
> > this in tree_unswitch_single_loop and handle the case
> > when predicate == 0 by recursing to loop->inner (possibly
> > only when we blowed budged - we'll have to restore the
> > original budget then, of course).
> That's a larger restructuring which could be independent of this patch.
Agreed. There's an existing TODO there to sort candidates.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
OK.
Thanks,
Richard.
> When unswitching predicates from the innermost loop, hoisting the
> unswitch out to an outer loop duplicates only the outer-loop bodies;
> the innermost loop is shared. Estimate the cost as
> candidate_size - innermost_size and stop selecting an outer loop once
> that exceeds param_max_unswitch_insns. innermost_size is hoisted out
> of the walk so estimate_loop_insns is called once per level.
>
> gcc/ChangeLog:
>
> * tree-ssa-loop-unswitch.cc (estimate_loop_insns): New function.
> (init_loop_unswitch_info): Do not select an outer loop for
> unswitching when the duplicated outer-body size exceeds
> param_max_unswitch_insns.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/loop-unswitch-19.c: New test.
> ---
> gcc/testsuite/gcc.dg/loop-unswitch-19.c | 43 +++++++++++++++++++++++
> gcc/tree-ssa-loop-unswitch.cc | 46 ++++++++++++++++++++++---
> 2 files changed, 84 insertions(+), 5 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-19.c
>
> diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-19.c
> b/gcc/testsuite/gcc.dg/loop-unswitch-19.c
> new file mode 100644
> index 00000000000..4c84e226580
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/loop-unswitch-19.c
> @@ -0,0 +1,43 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-tree-unswitch-optimized" } */
> +
> +extern volatile int sink;
> +
> +void
> +foo (int flag, int n, int m)
> +{
> + for (int i = 0; i < n; ++i)
> + {
> + /* Pad the outer loop to make it noticeably larger than the inner. */
> + sink += i;
> + sink += 2;
> + sink += 3;
> + sink += 4;
> + sink += 5;
> + sink += 6;
> + sink += 7;
> + sink += 8;
> + sink += 9;
> + sink += 10;
> + sink += 11;
> + sink += 12;
> + sink += 13;
> + sink += 14;
> + sink += 15;
> + sink += 16;
> + sink += 17;
> + sink += 18;
> + sink += 19;
> + sink += 20;
> +
> + for (int j = 0; j < m; ++j)
> + if (flag)
> + sink += j;
> + }
> +}
> +
> +/* We should still unswitch, but stay on the inner loop because the
> + outer-only body (the part that would be duplicated by hoisting the
> + unswitching) exceeds param_max_unswitch_insns. */
> +/* { dg-final { scan-tree-dump "unswitching loop" "unswitch" } } */
> +/* { dg-final { scan-tree-dump-not "unswitching outer loop" "unswitch" } } */
> diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
> index 96680bc64ea..644753dd7a8 100644
> --- a/gcc/tree-ssa-loop-unswitch.cc
> +++ b/gcc/tree-ssa-loop-unswitch.cc
> @@ -250,6 +250,21 @@ set_predicates_for_bb (basic_block bb,
> vec<unswitch_predicate *> predicates)
> bb_predicates->safe_push (predicates);
> }
>
> +/* Estimate number of instructions in LOOP using eni_size_weights. */
> +
> +static unsigned
> +estimate_loop_insns (class loop *loop)
> +{
> + unsigned insns = 0;
> + basic_block *body = get_loop_body (loop);
> + for (unsigned i = 0; i < loop->num_nodes; i++)
> + for (gimple_stmt_iterator gsi = gsi_start_bb (body[i]);
> + !gsi_end_p (gsi); gsi_next (&gsi))
> + insns += estimate_num_insns (gsi_stmt (gsi), &eni_size_weights);
> + free (body);
> + return insns;
> +}
> +
> /* Initialize LOOP information reused during the unswitching pass.
> Return total number of instructions in the loop. Adjusts LOOP to
> the outermost loop all candidates are invariant in. */
> @@ -262,13 +277,35 @@ init_loop_unswitch_info (class loop *&loop,
> unswitch_predicate *&hottest,
>
> basic_block *bbs = get_loop_body (loop);
>
> - /* Unswitch only nests with no sibling loops. */
> + /* Unswitch only nests with no sibling loops. Since predicates come from
> + the innermost loop, only the outer-loop bodies get duplicated by
> hoisting
> + the unswitching out; the innermost loop is shared in the cost estimate.
> */
> class loop *outer_loop = loop;
> unsigned max_depth = param_max_unswitch_depth;
> + unsigned innermost_size = estimate_loop_insns (loop);
> while (loop_outer (outer_loop)->num != 0
> - && !loop_outer (outer_loop)->inner->next
> - && --max_depth != 0)
> - outer_loop = loop_outer (outer_loop);
> + && !loop_outer (outer_loop)->inner->next)
> + {
> + if (--max_depth == 0)
> + break;
> +
> + class loop *candidate = loop_outer (outer_loop);
> + unsigned candidate_size = estimate_loop_insns (candidate);
> +
> + if (candidate_size - innermost_size
> + > (unsigned) param_max_unswitch_insns)
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, find_loop_location (loop),
> + "Not unswitching outer loop, duplicated size %u "
> + "exceeds max-unswitch-insns %u\n",
> + candidate_size - innermost_size,
> + (unsigned) param_max_unswitch_insns);
> + break;
> + }
> +
> + outer_loop = candidate;
> + }
> hottest = NULL;
> hottest_bb = NULL;
> /* Find all unswitching candidates in the innermost loop. */
> @@ -1678,4 +1715,3 @@ make_pass_tree_unswitch (gcc::context *ctxt)
> return new pass_tree_unswitch (ctxt);
> }
>
> -
> --
> 2.34.1
>