Yes, thanks for pointing that out. I checked the rest of the code. I'll
send out the second version right away.

Richard Biener <[email protected]> 于2026年5月12日周二 17:37写道:

> On Sat, 9 May 2026, Xin Wang wrote:
>
> > From: Xin Wang <[email protected]>
> >
> > When split_loop does iteration space splitting, split_at_bb_p may
> > swap the guard condition so that operand 0 is always the loop IV
> > and operand 1 is the invariant.  For example, "t < i" (LT_EXPR)
> > becomes "i > t" (GT_EXPR).  This can cause initial_true to be
> > false, meaning loop1 handles iterations where the guard is false
> > and loop2 handles iterations where the guard is true.
> >
> > The function fix_loop_bb_probability scales loop1's body by
> > true_edge->probability and loop2's body by its inverse.  But when
> > initial_true is false, loop1's body executes the false edge path.
> > Loop1 should be scaled by false_edge->probability instead.
> >
> > This inconsistency is visible in the guard patching code a few
> > lines below, which does swap force_true/force_false based on
> > initial_true.  The profile scaling should apply the same logic.
> >
> > The bug caused BB counts in the split loops to be swapped when
> > initial_true is false: the loop body whose guard is forced false
> > (loop1, executing fewer iterations) would get the higher profile
> > count, and vice versa.
>
> Does
>
>         profile_probability loop1_prob
>           = integer_onep (cond) ? profile_probability::always ()
>                                 : true_edge->probability;
>                                  ^^^^^^^^^^
>
> have the same problem?
>
> > gcc/ChangeLog:
> >
> >         * tree-ssa-loop-split.cc (split_loop): Pass edges to
> >         fix_loop_bb_probability with consideration of initial_true,
> >         so that loop1 is always scaled by the edge probability
> >         corresponding to the branch that actually executes in it.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-prof/loop-split-4.c: New test.
> >
> > Signed-off-by: Xin Wang <[email protected]>
> > ---
> >  gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 34 +++++++++++++++++++
> >  gcc/tree-ssa-loop-split.cc                    | 13 ++++++-
> >  2 files changed, 46 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> > new file mode 100644
> > index 00000000000..7e0aa883276
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> > @@ -0,0 +1,34 @@
> > +/* PR tree-optimization/XXXXX */
> > +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */
> > +
> > +volatile int sink;
> > +
> > +__attribute__((noinline)) int
> > +helper (int a, int b)
> > +{
> > +  return a + b;
> > +}
> > +
> > +int
> > +main (void)
> > +{
> > +  int n = 100, t = 3, total = 0;
> > +  /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4
> iterations,
> > +     empty else) and true for i=4..99 (96 iterations, calls helper).
> > +     split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving
> > +     initial_true = false.  Loop1 (cold, 4 iterations) handles the false
> > +     case, loop2 (hot, 96 iterations) handles the true case.
> > +     Without the fix loop1 was scaled by true_edge->probability (96%),
> > +     inverting the counts.  */
> > +  for (int i = 0; i < n; i++)
> > +    if (t < i)
> > +      total += helper (i, t);
> > +  sink = total;
> > +  return 0;
> > +}
> > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1
> "lsplit" } } */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0
> "lsplit" } } */
> > +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96
> > +   iterations) count ~96.  Without the fix the counts are inverted.
> > +   Check loop1 is a single-digit and loop2 is 90+.  */
> > +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\],
> loop2 count 9\[0-9\]" "lsplit" } } */
> > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> > index ba6cc45d7f0..8aa6275695b 100644
> > --- a/gcc/tree-ssa-loop-split.cc
> > +++ b/gcc/tree-ssa-loop-split.cc
> > @@ -712,7 +712,18 @@ split_loop (class loop *loop1)
> >               (loop_preheader_edge (loop2)->src)->probability
> >                       = loop1_prob.invert ();
> >
> > -     fix_loop_bb_probability (loop1, loop2, true_edge, false_edge);
> > +     fix_loop_bb_probability (loop1, loop2,
> > +                              initial_true ? true_edge : false_edge,
> > +                              initial_true ? false_edge : true_edge);
> > +
> > +     if (dump_file && (dump_flags & TDF_DETAILS))
> > +       fprintf (dump_file,
> > +                ";; Split loop: initial_true %s, "
> > +                "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n",
> > +                initial_true ? "true" : "false",
> > +                (int64_t) loop1->header->count.to_gcov_type (),
> > +                (int64_t) loop2->header->count.to_gcov_type ());
> > +
> >       /* If conditional we split on has reliable profilea nd both
> >          preconditionals of loop1 and loop2 are constant true, we can
> >          only redistribute the iteration counts to the split loops.
> >
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
>

Reply via email to