Yes, thanks for pointing that out. I checked the rest of the code. I'll send out the second version right away.
Richard Biener <[email protected]> 于2026年5月12日周二 17:37写道: > On Sat, 9 May 2026, Xin Wang wrote: > > > From: Xin Wang <[email protected]> > > > > When split_loop does iteration space splitting, split_at_bb_p may > > swap the guard condition so that operand 0 is always the loop IV > > and operand 1 is the invariant. For example, "t < i" (LT_EXPR) > > becomes "i > t" (GT_EXPR). This can cause initial_true to be > > false, meaning loop1 handles iterations where the guard is false > > and loop2 handles iterations where the guard is true. > > > > The function fix_loop_bb_probability scales loop1's body by > > true_edge->probability and loop2's body by its inverse. But when > > initial_true is false, loop1's body executes the false edge path. > > Loop1 should be scaled by false_edge->probability instead. > > > > This inconsistency is visible in the guard patching code a few > > lines below, which does swap force_true/force_false based on > > initial_true. The profile scaling should apply the same logic. > > > > The bug caused BB counts in the split loops to be swapped when > > initial_true is false: the loop body whose guard is forced false > > (loop1, executing fewer iterations) would get the higher profile > > count, and vice versa. > > Does > > profile_probability loop1_prob > = integer_onep (cond) ? profile_probability::always () > : true_edge->probability; > ^^^^^^^^^^ > > have the same problem? > > > gcc/ChangeLog: > > > > * tree-ssa-loop-split.cc (split_loop): Pass edges to > > fix_loop_bb_probability with consideration of initial_true, > > so that loop1 is always scaled by the edge probability > > corresponding to the branch that actually executes in it. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/tree-prof/loop-split-4.c: New test. > > > > Signed-off-by: Xin Wang <[email protected]> > > --- > > gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 34 +++++++++++++++++++ > > gcc/tree-ssa-loop-split.cc | 13 ++++++- > > 2 files changed, 46 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > > > diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > new file mode 100644 > > index 00000000000..7e0aa883276 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > @@ -0,0 +1,34 @@ > > +/* PR tree-optimization/XXXXX */ > > +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */ > > + > > +volatile int sink; > > + > > +__attribute__((noinline)) int > > +helper (int a, int b) > > +{ > > + return a + b; > > +} > > + > > +int > > +main (void) > > +{ > > + int n = 100, t = 3, total = 0; > > + /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4 > iterations, > > + empty else) and true for i=4..99 (96 iterations, calls helper). > > + split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving > > + initial_true = false. Loop1 (cold, 4 iterations) handles the false > > + case, loop2 (hot, 96 iterations) handles the true case. > > + Without the fix loop1 was scaled by true_edge->probability (96%), > > + inverting the counts. */ > > + for (int i = 0; i < n; i++) > > + if (t < i) > > + total += helper (i, t); > > + sink = total; > > + return 0; > > +} > > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1 > "lsplit" } } */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0 > "lsplit" } } */ > > +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96 > > + iterations) count ~96. Without the fix the counts are inverted. > > + Check loop1 is a single-digit and loop2 is 90+. */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\], > loop2 count 9\[0-9\]" "lsplit" } } */ > > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc > > index ba6cc45d7f0..8aa6275695b 100644 > > --- a/gcc/tree-ssa-loop-split.cc > > +++ b/gcc/tree-ssa-loop-split.cc > > @@ -712,7 +712,18 @@ split_loop (class loop *loop1) > > (loop_preheader_edge (loop2)->src)->probability > > = loop1_prob.invert (); > > > > - fix_loop_bb_probability (loop1, loop2, true_edge, false_edge); > > + fix_loop_bb_probability (loop1, loop2, > > + initial_true ? true_edge : false_edge, > > + initial_true ? false_edge : true_edge); > > + > > + if (dump_file && (dump_flags & TDF_DETAILS)) > > + fprintf (dump_file, > > + ";; Split loop: initial_true %s, " > > + "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n", > > + initial_true ? "true" : "false", > > + (int64_t) loop1->header->count.to_gcov_type (), > > + (int64_t) loop2->header->count.to_gcov_type ()); > > + > > /* If conditional we split on has reliable profilea nd both > > preconditionals of loop1 and loop2 are constant true, we can > > only redistribute the iteration counts to the split loops. > > > > -- > Richard Biener <[email protected]> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg) >
