Re: [PATCH v2] tree-optimization: Fix profile update in loop splitting (initial_true=false)

ywgrit Tue, 19 May 2026 19:07:32 -0700

Thank you for your review.

Richard Biener <[email protected]> 于2026年5月15日周五 16:04写道：


> On Tue, 12 May 2026, Xin Wang wrote:
>
> > When split_loop does iteration space splitting, split_at_bb_p may
> > swap the guard condition so that operand 0 is always the loop IV
> > and operand 1 is the invariant.  For example, "t < i" (LT_EXPR)
> > becomes "i > t" (GT_EXPR).  This can cause initial_true to be
> > false, meaning loop1 handles iterations where the guard is false
> > and loop2 handles iterations where the guard is true.
> >
> > The profile update code used true_edge->probability for loop1 and
> > its inverse for loop2.  That is correct only when loop1 keeps the
> > true branch.  When initial_true is false, loop1 keeps the false
> > branch and loop2 keeps the true branch, so all profile quantities
> > must follow those semantic edges instead of the raw true/false edge
> > names.
> >
> > Derive loop1_edge and loop2_edge once from initial_true and use them
> > consistently for loop_version's then probability, the split-loop BB
> > count scaling, and the iteration estimate scaling.  Also make
> > fix_loop_bb_probability take loop1/loop2 edges explicitly, rather
> > than assuming its arguments are true/false edges.
> >
> > The bug caused BB counts in the split loops to be swapped when
> > initial_true is false: the loop body whose guard is forced false
> > (loop1, executing fewer iterations) would get the higher profile
> > count, and vice versa.  It could also leave the precondition edge
> > probabilities and iteration estimates based on the wrong split edge.
>
> OK - much better to follow after the renaming.
>
> Thanks,
> Richard.
>
> > gcc/ChangeLog:
> >
> >       * tree-ssa-loop-split.cc (fix_loop_bb_probability): Rename
> >       parameters from true_edge/false_edge to loop1_edge/loop2_edge
> >       and scale both loops directly from their semantic edge
> >       probabilities.
> >       (split_loop): Derive loop1_edge and loop2_edge from initial_true.
> >       Use them for loop1_prob, fix_loop_bb_probability, and iteration
> >       estimate scaling.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.dg/tree-prof/loop-split-4.c: New test.
> >
> > Signed-off-by: Xin Wang <[email protected]>
> > ---
> > Changes in v2:
> > - loop1_prob had the same true/false edge mismatch when initial_true
> >   is false.
> > - Derive loop1_edge and loop2_edge once and use them consistently for
> >   loop1_prob, fix_loop_bb_probability and iteration estimate scaling.
> > - Rename fix_loop_bb_probability parameters to reflect loop1/loop2
> >   semantics rather than raw true/false edge semantics.
> > - Extend the testcase to check the initial_true=false path, the corrected
> >   loop counts, and the high-probability skip edge to loop2.
> >
> >  gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 38 +++++++++++++
> >  gcc/tree-ssa-loop-split.cc                    | 54 ++++++++++++-------
> >  2 files changed, 72 insertions(+), 20 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> > new file mode 100644
> > index 00000000000..1c67e30f4de
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> > @@ -0,0 +1,38 @@
> > +/* PR tree-optimization/XXXXX */
> > +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */
> > +
> > +volatile int sink;
> > +
> > +__attribute__((noinline)) int
> > +helper (int a, int b)
> > +{
> > +  return a + b;
> > +}
> > +
> > +int
> > +main (void)
> > +{
> > +  int n = 100, t = 3, total = 0;
> > +  /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4
> iterations,
> > +     empty else) and true for i=4..99 (96 iterations, calls helper).
> > +     split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving
> > +     initial_true = false.  Loop1 (cold, 4 iterations) handles the false
> > +     case, loop2 (hot, 96 iterations) handles the true case.
> > +     Without the fix loop1 was scaled by true_edge->probability (96%),
> > +     inverting the counts.  */
> > +  for (int i = 0; i < n; i++)
> > +    if (t < i)
> > +      total += helper (i, t);
> > +  sink = total;
> > +  return 0;
> > +}
> > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1
> "lsplit" } } */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0
> "lsplit" } } */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump "Split loop:
> initial_true false" "lsplit" } } */
> > +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96
> > +   iterations) count ~96.  Without the fix the counts are inverted.
> > +   Check loop1 is a single-digit and loop2 is 90+.  */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\],
> loop2 count 9\[0-9\]" "lsplit" } } */
> > +/* The skip edge from loop1 to loop2 should use loop1_prob.invert (),
> > +   where loop1_prob corresponds to the false edge when initial_true is
> false.  */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump "goto <bb \[0-9\]+>;
> .99\\." "lsplit" } } */
> > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> > index ba6cc45d7f0..477f09bf73d 100644
> > --- a/gcc/tree-ssa-loop-split.cc
> > +++ b/gcc/tree-ssa-loop-split.cc
> > @@ -524,37 +524,38 @@ compute_new_first_bound (gimple_seq *stmts, class
> tree_niter_desc *niter,
> >  }
> >
> >  /* Fix the two loop's bb count after split based on the split edge
> probability,
> > -   don't adjust the bbs dominated by true branches of that loop to avoid
> > +   don't adjust the bbs dominated by the branch kept in that loop to
> avoid
> >     dropping 1s down.  */
> >  static void
> > -fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge
> true_edge,
> > -                      edge false_edge)
> > +fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge
> loop1_edge,
> > +                      edge loop2_edge)
> >  {
> > -  /* Proportion first loop's bb counts except those dominated by true
> > -     branch to avoid drop 1s down.  */
> > +  /* Proportion first loop's bb counts except those dominated by the
> > +     branch that stays true/false in the first loop, to avoid dropping
> > +     1s down.  */
> >    basic_block *bbs1, *bbs2;
> >    bbs1 = get_loop_body (loop1);
> >    unsigned j;
> >    for (j = 0; j < loop1->num_nodes; j++)
> >      if (bbs1[j] == loop1->latch
> > -     /* Watch for case where the true conditional is empty.  */
> > -     || !single_pred_p (true_edge->dest)
> > -     || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest))
> > +     /* Watch for case where the kept conditional arm is empty.  */
> > +     || !single_pred_p (loop1_edge->dest)
> > +     || !dominated_by_p (CDI_DOMINATORS, bbs1[j], loop1_edge->dest))
> >        bbs1[j]->count
> > -     = bbs1[j]->count.apply_probability (true_edge->probability);
> > +     = bbs1[j]->count.apply_probability (loop1_edge->probability);
> >    free (bbs1);
> >
> > -  /* Proportion second loop's bb counts except those dominated by false
> > -     branch to avoid drop 1s down.  */
> > -  basic_block bbi_copy = get_bb_copy (false_edge->dest);
> > +  /* Proportion second loop's bb counts except those dominated by the
> > +     opposite branch, which is kept in the second loop.  */
> > +  basic_block bbi_copy = get_bb_copy (loop2_edge->dest);
> >    bbs2 = get_loop_body (loop2);
> >    for (j = 0; j < loop2->num_nodes; j++)
> >      if (bbs2[j] == loop2->latch
> > -     /* Watch for case where the flase conditional is empty.  */
> > +     /* Watch for case where the kept conditional arm is empty.  */
> >       || !single_pred_p (bbi_copy)
> >       || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy))
> >        bbs2[j]->count
> > -     = bbs2[j]->count.apply_probability (true_edge->probability.invert
> ());
> > +     = bbs2[j]->count.apply_probability (loop2_edge->probability);
> >    free (bbs2);
> >  }
> >
> > @@ -678,6 +679,8 @@ split_loop (class loop *loop1)
> >
> >       edge true_edge, false_edge;
> >       extract_true_false_edges_from_block (bbs[i], &true_edge,
> &false_edge);
> > +     edge loop1_edge = initial_true ? true_edge : false_edge;
> > +     edge loop2_edge = initial_true ? false_edge : true_edge;
> >
> >       /* Now version the loop, placing loop2 after loop1 connecting
> >          them, and fix up SSA form for that.  */
> > @@ -686,7 +689,7 @@ split_loop (class loop *loop1)
> >
> >       profile_probability loop1_prob
> >         = integer_onep (cond) ? profile_probability::always ()
> > -                             : true_edge->probability;
> > +                             : loop1_edge->probability;
> >       /* TODO: It is commonly a case that we know that both loops will be
> >          entered.  very_likely below is the probability that second loop
> will
> >          be entered given by connect_loops.  We should work out the
> common
> > @@ -712,7 +715,16 @@ split_loop (class loop *loop1)
> >               (loop_preheader_edge (loop2)->src)->probability
> >                       = loop1_prob.invert ();
> >
> > -     fix_loop_bb_probability (loop1, loop2, true_edge, false_edge);
> > +     fix_loop_bb_probability (loop1, loop2, loop1_edge, loop2_edge);
> > +
> > +     if (dump_file && (dump_flags & TDF_DETAILS))
> > +       fprintf (dump_file,
> > +                ";; Split loop: initial_true %s, "
> > +                "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n",
> > +                initial_true ? "true" : "false",
> > +                (int64_t) loop1->header->count.to_gcov_type (),
> > +                (int64_t) loop2->header->count.to_gcov_type ());
> > +
> >       /* If conditional we split on has reliable profilea nd both
> >          preconditionals of loop1 and loop2 are constant true, we can
> >          only redistribute the iteration counts to the split loops.
> > @@ -730,10 +742,12 @@ split_loop (class loop *loop1)
> >       if (loop1->any_estimate
> >           && wi::fits_shwi_p (loop1->nb_iterations_estimate))
> >         {
> > -         sreal scale = true_edge->probability.reliable_p ()
> > -                       ? true_edge->probability.to_sreal () : (sreal)1;
> > -         sreal scale2 = false_edge->probability.reliable_p ()
> > -                       ? false_edge->probability.to_sreal () : (sreal)1;
> > +         sreal scale
> > +           = loop1_edge->probability.reliable_p ()
> > +             ? loop1_edge->probability.to_sreal () : (sreal)1;
> > +         sreal scale2
> > +           = loop2_edge->probability.reliable_p ()
> > +             ? loop2_edge->probability.to_sreal () : (sreal)1;
> >           sreal div1 = loop1_prob.initialized_p ()
> >                        ? loop1_prob.to_sreal () : (sreal)1/(sreal)2;
> >           /* +1 to get header interations rather than latch iterations
> and then
> >
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
>

Re: [PATCH v2] tree-optimization: Fix profile update in loop splitting (initial_true=false)

Reply via email to