Thank you for your review. Richard Biener <[email protected]> 于2026年5月15日周五 16:04写道:
> On Tue, 12 May 2026, Xin Wang wrote: > > > When split_loop does iteration space splitting, split_at_bb_p may > > swap the guard condition so that operand 0 is always the loop IV > > and operand 1 is the invariant. For example, "t < i" (LT_EXPR) > > becomes "i > t" (GT_EXPR). This can cause initial_true to be > > false, meaning loop1 handles iterations where the guard is false > > and loop2 handles iterations where the guard is true. > > > > The profile update code used true_edge->probability for loop1 and > > its inverse for loop2. That is correct only when loop1 keeps the > > true branch. When initial_true is false, loop1 keeps the false > > branch and loop2 keeps the true branch, so all profile quantities > > must follow those semantic edges instead of the raw true/false edge > > names. > > > > Derive loop1_edge and loop2_edge once from initial_true and use them > > consistently for loop_version's then probability, the split-loop BB > > count scaling, and the iteration estimate scaling. Also make > > fix_loop_bb_probability take loop1/loop2 edges explicitly, rather > > than assuming its arguments are true/false edges. > > > > The bug caused BB counts in the split loops to be swapped when > > initial_true is false: the loop body whose guard is forced false > > (loop1, executing fewer iterations) would get the higher profile > > count, and vice versa. It could also leave the precondition edge > > probabilities and iteration estimates based on the wrong split edge. > > OK - much better to follow after the renaming. > > Thanks, > Richard. > > > gcc/ChangeLog: > > > > * tree-ssa-loop-split.cc (fix_loop_bb_probability): Rename > > parameters from true_edge/false_edge to loop1_edge/loop2_edge > > and scale both loops directly from their semantic edge > > probabilities. > > (split_loop): Derive loop1_edge and loop2_edge from initial_true. > > Use them for loop1_prob, fix_loop_bb_probability, and iteration > > estimate scaling. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/tree-prof/loop-split-4.c: New test. > > > > Signed-off-by: Xin Wang <[email protected]> > > --- > > Changes in v2: > > - loop1_prob had the same true/false edge mismatch when initial_true > > is false. > > - Derive loop1_edge and loop2_edge once and use them consistently for > > loop1_prob, fix_loop_bb_probability and iteration estimate scaling. > > - Rename fix_loop_bb_probability parameters to reflect loop1/loop2 > > semantics rather than raw true/false edge semantics. > > - Extend the testcase to check the initial_true=false path, the corrected > > loop counts, and the high-probability skip edge to loop2. > > > > gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 38 +++++++++++++ > > gcc/tree-ssa-loop-split.cc | 54 ++++++++++++------- > > 2 files changed, 72 insertions(+), 20 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > > > diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > new file mode 100644 > > index 00000000000..1c67e30f4de > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c > > @@ -0,0 +1,38 @@ > > +/* PR tree-optimization/XXXXX */ > > +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */ > > + > > +volatile int sink; > > + > > +__attribute__((noinline)) int > > +helper (int a, int b) > > +{ > > + return a + b; > > +} > > + > > +int > > +main (void) > > +{ > > + int n = 100, t = 3, total = 0; > > + /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4 > iterations, > > + empty else) and true for i=4..99 (96 iterations, calls helper). > > + split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving > > + initial_true = false. Loop1 (cold, 4 iterations) handles the false > > + case, loop2 (hot, 96 iterations) handles the true case. > > + Without the fix loop1 was scaled by true_edge->probability (96%), > > + inverting the counts. */ > > + for (int i = 0; i < n; i++) > > + if (t < i) > > + total += helper (i, t); > > + sink = total; > > + return 0; > > +} > > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1 > "lsplit" } } */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0 > "lsplit" } } */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump "Split loop: > initial_true false" "lsplit" } } */ > > +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96 > > + iterations) count ~96. Without the fix the counts are inverted. > > + Check loop1 is a single-digit and loop2 is 90+. */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\], > loop2 count 9\[0-9\]" "lsplit" } } */ > > +/* The skip edge from loop1 to loop2 should use loop1_prob.invert (), > > + where loop1_prob corresponds to the false edge when initial_true is > false. */ > > +/* { dg-final-use-not-autofdo { scan-tree-dump "goto <bb \[0-9\]+>; > .99\\." "lsplit" } } */ > > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc > > index ba6cc45d7f0..477f09bf73d 100644 > > --- a/gcc/tree-ssa-loop-split.cc > > +++ b/gcc/tree-ssa-loop-split.cc > > @@ -524,37 +524,38 @@ compute_new_first_bound (gimple_seq *stmts, class > tree_niter_desc *niter, > > } > > > > /* Fix the two loop's bb count after split based on the split edge > probability, > > - don't adjust the bbs dominated by true branches of that loop to avoid > > + don't adjust the bbs dominated by the branch kept in that loop to > avoid > > dropping 1s down. */ > > static void > > -fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge > true_edge, > > - edge false_edge) > > +fix_loop_bb_probability (class loop *loop1, class loop *loop2, edge > loop1_edge, > > + edge loop2_edge) > > { > > - /* Proportion first loop's bb counts except those dominated by true > > - branch to avoid drop 1s down. */ > > + /* Proportion first loop's bb counts except those dominated by the > > + branch that stays true/false in the first loop, to avoid dropping > > + 1s down. */ > > basic_block *bbs1, *bbs2; > > bbs1 = get_loop_body (loop1); > > unsigned j; > > for (j = 0; j < loop1->num_nodes; j++) > > if (bbs1[j] == loop1->latch > > - /* Watch for case where the true conditional is empty. */ > > - || !single_pred_p (true_edge->dest) > > - || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) > > + /* Watch for case where the kept conditional arm is empty. */ > > + || !single_pred_p (loop1_edge->dest) > > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], loop1_edge->dest)) > > bbs1[j]->count > > - = bbs1[j]->count.apply_probability (true_edge->probability); > > + = bbs1[j]->count.apply_probability (loop1_edge->probability); > > free (bbs1); > > > > - /* Proportion second loop's bb counts except those dominated by false > > - branch to avoid drop 1s down. */ > > - basic_block bbi_copy = get_bb_copy (false_edge->dest); > > + /* Proportion second loop's bb counts except those dominated by the > > + opposite branch, which is kept in the second loop. */ > > + basic_block bbi_copy = get_bb_copy (loop2_edge->dest); > > bbs2 = get_loop_body (loop2); > > for (j = 0; j < loop2->num_nodes; j++) > > if (bbs2[j] == loop2->latch > > - /* Watch for case where the flase conditional is empty. */ > > + /* Watch for case where the kept conditional arm is empty. */ > > || !single_pred_p (bbi_copy) > > || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > > bbs2[j]->count > > - = bbs2[j]->count.apply_probability (true_edge->probability.invert > ()); > > + = bbs2[j]->count.apply_probability (loop2_edge->probability); > > free (bbs2); > > } > > > > @@ -678,6 +679,8 @@ split_loop (class loop *loop1) > > > > edge true_edge, false_edge; > > extract_true_false_edges_from_block (bbs[i], &true_edge, > &false_edge); > > + edge loop1_edge = initial_true ? true_edge : false_edge; > > + edge loop2_edge = initial_true ? false_edge : true_edge; > > > > /* Now version the loop, placing loop2 after loop1 connecting > > them, and fix up SSA form for that. */ > > @@ -686,7 +689,7 @@ split_loop (class loop *loop1) > > > > profile_probability loop1_prob > > = integer_onep (cond) ? profile_probability::always () > > - : true_edge->probability; > > + : loop1_edge->probability; > > /* TODO: It is commonly a case that we know that both loops will be > > entered. very_likely below is the probability that second loop > will > > be entered given by connect_loops. We should work out the > common > > @@ -712,7 +715,16 @@ split_loop (class loop *loop1) > > (loop_preheader_edge (loop2)->src)->probability > > = loop1_prob.invert (); > > > > - fix_loop_bb_probability (loop1, loop2, true_edge, false_edge); > > + fix_loop_bb_probability (loop1, loop2, loop1_edge, loop2_edge); > > + > > + if (dump_file && (dump_flags & TDF_DETAILS)) > > + fprintf (dump_file, > > + ";; Split loop: initial_true %s, " > > + "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n", > > + initial_true ? "true" : "false", > > + (int64_t) loop1->header->count.to_gcov_type (), > > + (int64_t) loop2->header->count.to_gcov_type ()); > > + > > /* If conditional we split on has reliable profilea nd both > > preconditionals of loop1 and loop2 are constant true, we can > > only redistribute the iteration counts to the split loops. > > @@ -730,10 +742,12 @@ split_loop (class loop *loop1) > > if (loop1->any_estimate > > && wi::fits_shwi_p (loop1->nb_iterations_estimate)) > > { > > - sreal scale = true_edge->probability.reliable_p () > > - ? true_edge->probability.to_sreal () : (sreal)1; > > - sreal scale2 = false_edge->probability.reliable_p () > > - ? false_edge->probability.to_sreal () : (sreal)1; > > + sreal scale > > + = loop1_edge->probability.reliable_p () > > + ? loop1_edge->probability.to_sreal () : (sreal)1; > > + sreal scale2 > > + = loop2_edge->probability.reliable_p () > > + ? loop2_edge->probability.to_sreal () : (sreal)1; > > sreal div1 = loop1_prob.initialized_p () > > ? loop1_prob.to_sreal () : (sreal)1/(sreal)2; > > /* +1 to get header interations rather than latch iterations > and then > > > > -- > Richard Biener <[email protected]> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg) >
