On Tue, 12 May 2026, Xin Wang wrote: > When split_loop does iteration space splitting, split_at_bb_p may > swap the guard condition so that operand 0 is always the loop IV > and operand 1 is the invariant. For example, "t < i" (LT_EXPR) > becomes "i > t" (GT_EXPR). This can cause initial_true to be > false, meaning loop1 handles iterations where the guard is false > and loop2 handles iterations where the guard is true. > > The profile update code used true_edge->probability for loop1 and > its inverse for loop2. That is correct only when loop1 keeps the > true branch. When initial_true is false, loop1 keeps the false > branch and loop2 keeps the true branch, so all profile quantities > must follow those semantic edges instead of the raw true/false edge > names. > > Derive loop1_edge and loop2_edge once from initial_true and use them > consistently for loop_version's then probability, the split-loop BB > count scaling, and the iteration estimate scaling. Also make > fix_loop_bb_probability take loop1/loop2 edges explicitly, rather > than assuming its arguments are true/false edges. > > The bug caused BB counts in the split loops to be swapped when > initial_true is false: the loop body whose guard is forced false > (loop1, executing fewer iterations) would get the higher profile > count, and vice versa. It could also leave the precondition edge > probabilities and iteration estimates based on the wrong split edge.
OK - much better to follow after the renaming. Thanks, Richard. > gcc/ChangeLog: > > * tree-ssa-loop-split.cc (fix_loop_bb_probability): Rename > parameters from true_edge/false_edge to loop1_edge/loop2_edge > and scale both loops directly from their semantic edge > probabilities. > (split_loop): Derive loop1_edge and loop2_edge from initial_true. > Use them for loop1_prob, fix_loop_bb_probability, and iteration > estimate scaling. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-prof/loop-split-4.c: New test. > > Signed-off-by: Xin Wang <[email protected]> > --- > Changes in v2: > - loop1_prob had the same true/false edge mismatch when initial_true > is false. > - Derive loop1_edge and loop2_edge once and use them consistently for > loop1_prob, fix_loop_bb_probability and iteration estimate scaling. > - Rename fix_loop_bb_probability parameters to reflect loop1/loop2 > semantics rather than raw true/false edge semantics. > - Extend the testcase to check the initial_true=false path, the corrected > loop counts, and the high-probability skip edge to loop2. > > gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 38 +++++++++++++ > gcc/tree-ssa-loop-split.cc | 54 ++++++++++++------- > 2 files changed, 72 insertions(+), 20 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > new file mode 100644 > index 00000000000..1c67e30f4de > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > @@ -0,0 +1,38 @@ > +/* PR tree-optimization/XXXXX */ > +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */ > + > +volatile int sink; > + > +__attribute__((noinline)) int > +helper (int a, int b) > +{ > + return a + b; > +} > + > +int > +main (void) > +{ > + int n = 100, t = 3, total = 0; > + /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4 iterations, > + empty else) and true for i=4..99 (96 iterations, calls helper). > + split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving > + initial_true = false. Loop1 (cold, 4 iterations) handles the false > + case, loop2 (hot, 96 iterations) handles the true case. > + Without the fix loop1 was scaled by true_edge->probability (96%), > + inverting the counts. */ > + for (int i = 0; i < n; i++) > + if (t < i) > + total += helper (i, t); > + sink = total; > + return 0; > +} > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1 "lsplit" > } } */ > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0 > "lsplit" } } */ > +/* { dg-final-use-not-autofdo { scan-tree-dump "Split loop: initial_true > false" "lsplit" } } */ > +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96 > + iterations) count ~96. Without the fix the counts are inverted. > + Check loop1 is a single-digit and loop2 is 90+. */ > +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\], loop2 > count 9\[0-9\]" "lsplit" } } */ > +/* The skip edge from loop1 to loop2 should use loop1_prob.invert (), > + where loop1_prob corresponds to the false edge when initial_true is > false. */ > +/* { dg-final-use-not-autofdo { scan-tree-dump "goto <bb \[0-9\]+>; .99\\." > "lsplit" } } */ > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc > index ba6cc45d7f0..477f09bf73d 100644 > --- a/gcc/tree-ssa-loop-split.cc > +++ b/gcc/tree-ssa-loop-split.cc > @@ -524,37 +524,38 @@ compute_new_first_bound (gimple_seq *stmts, class > tree_niter_desc *niter, > } > > /* Fix the two loop's bb count after split based on the split edge > probability, > - don't adjust the bbs dominated by true branches of that loop to avoid > + don't adjust the bbs dominated by the branch kept in that loop to avoid > dropping 1s down. */ > static void > -fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge > true_edge, > - edge false_edge) > +fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge > loop1_edge, > + edge loop2_edge) > { > - /* Proportion first loop's bb counts except those dominated by true > - branch to avoid drop 1s down. */ > + /* Proportion first loop's bb counts except those dominated by the > + branch that stays true/false in the first loop, to avoid dropping > + 1s down. */ > basic_block *bbs1, *bbs2; > bbs1 = get_loop_body (loop1); > unsigned j; > for (j = 0; j < loop1->num_nodes; j++) > if (bbs1[j] == loop1->latch > - /* Watch for case where the true conditional is empty. */ > - || !single_pred_p (true_edge->dest) > - || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) > + /* Watch for case where the kept conditional arm is empty. */ > + || !single_pred_p (loop1_edge->dest) > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], loop1_edge->dest)) > bbs1[j]->count > - = bbs1[j]->count.apply_probability (true_edge->probability); > + = bbs1[j]->count.apply_probability (loop1_edge->probability); > free (bbs1); > > - /* Proportion second loop's bb counts except those dominated by false > - branch to avoid drop 1s down. */ > - basic_block bbi_copy = get_bb_copy (false_edge->dest); > + /* Proportion second loop's bb counts except those dominated by the > + opposite branch, which is kept in the second loop. */ > + basic_block bbi_copy = get_bb_copy (loop2_edge->dest); > bbs2 = get_loop_body (loop2); > for (j = 0; j < loop2->num_nodes; j++) > if (bbs2[j] == loop2->latch > - /* Watch for case where the flase conditional is empty. */ > + /* Watch for case where the kept conditional arm is empty. */ > || !single_pred_p (bbi_copy) > || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > bbs2[j]->count > - = bbs2[j]->count.apply_probability (true_edge->probability.invert ()); > + = bbs2[j]->count.apply_probability (loop2_edge->probability); > free (bbs2); > } > > @@ -678,6 +679,8 @@ split_loop (class loop *loop1) > > edge true_edge, false_edge; > extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge); > + edge loop1_edge = initial_true ? true_edge : false_edge; > + edge loop2_edge = initial_true ? false_edge : true_edge; > > /* Now version the loop, placing loop2 after loop1 connecting > them, and fix up SSA form for that. */ > @@ -686,7 +689,7 @@ split_loop (class loop *loop1) > > profile_probability loop1_prob > = integer_onep (cond) ? profile_probability::always () > - : true_edge->probability; > + : loop1_edge->probability; > /* TODO: It is commonly a case that we know that both loops will be > entered. very_likely below is the probability that second loop will > be entered given by connect_loops. We should work out the common > @@ -712,7 +715,16 @@ split_loop (class loop *loop1) > (loop_preheader_edge (loop2)->src)->probability > = loop1_prob.invert (); > > - fix_loop_bb_probability (loop1, loop2, true_edge, false_edge); > + fix_loop_bb_probability (loop1, loop2, loop1_edge, loop2_edge); > + > + if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, > + ";; Split loop: initial_true %s, " > + "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n", > + initial_true ? "true" : "false", > + (int64_t) loop1->header->count.to_gcov_type (), > + (int64_t) loop2->header->count.to_gcov_type ()); > + > /* If conditional we split on has reliable profilea nd both > preconditionals of loop1 and loop2 are constant true, we can > only redistribute the iteration counts to the split loops. > @@ -730,10 +742,12 @@ split_loop (class loop *loop1) > if (loop1->any_estimate > && wi::fits_shwi_p (loop1->nb_iterations_estimate)) > { > - sreal scale = true_edge->probability.reliable_p () > - ? true_edge->probability.to_sreal () : (sreal)1; > - sreal scale2 = false_edge->probability.reliable_p () > - ? false_edge->probability.to_sreal () : (sreal)1; > + sreal scale > + = loop1_edge->probability.reliable_p () > + ? loop1_edge->probability.to_sreal () : (sreal)1; > + sreal scale2 > + = loop2_edge->probability.reliable_p () > + ? loop2_edge->probability.to_sreal () : (sreal)1; > sreal div1 = loop1_prob.initialized_p () > ? loop1_prob.to_sreal () : (sreal)1/(sreal)2; > /* +1 to get header interations rather than latch iterations and > then > -- Richard Biener <[email protected]> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
