On Sat, 9 May 2026, Xin Wang wrote:
> From: Xin Wang <[email protected]>
>
> When split_loop does iteration space splitting, split_at_bb_p may
> swap the guard condition so that operand 0 is always the loop IV
> and operand 1 is the invariant. For example, "t < i" (LT_EXPR)
> becomes "i > t" (GT_EXPR). This can cause initial_true to be
> false, meaning loop1 handles iterations where the guard is false
> and loop2 handles iterations where the guard is true.
>
> The function fix_loop_bb_probability scales loop1's body by
> true_edge->probability and loop2's body by its inverse. But when
> initial_true is false, loop1's body executes the false edge path.
> Loop1 should be scaled by false_edge->probability instead.
>
> This inconsistency is visible in the guard patching code a few
> lines below, which does swap force_true/force_false based on
> initial_true. The profile scaling should apply the same logic.
>
> The bug caused BB counts in the split loops to be swapped when
> initial_true is false: the loop body whose guard is forced false
> (loop1, executing fewer iterations) would get the higher profile
> count, and vice versa.

Does

  profile_probability loop1_prob
    = integer_onep (cond) ? profile_probability::always ()
                          : true_edge->probability;
                            ^^^^^^^^^^
have the same problem?

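To make the question concrete, here is a rough standalone model of the edge selection, using the numbers from the testcase (n = 100, t = 3). This is not GCC code: `guard_probs`, `measure_guard` and `loop1_pct` are made-up names, and plain integer percentages stand in for profile_probability. It only illustrates which edge loop1 should follow in each case.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two outgoing edge probabilities of the
   guard, in percent.  Not GCC's profile_probability.  */
struct guard_probs
{
  int true_pct;   /* share of iterations where "t < i" holds */
  int false_pct;  /* share of iterations where it does not */
};

/* Count how often the guard "t < i" holds for i in [0, n).  */
static struct guard_probs
measure_guard (int n, int t)
{
  int taken = 0;
  for (int i = 0; i < n; i++)
    if (t < i)
      taken++;

  struct guard_probs p;
  p.true_pct = n > 0 ? taken * 100 / n : 0;
  p.false_pct = n > 0 ? 100 - p.true_pct : 0;
  return p;
}

/* Probability that belongs to loop1: the true edge when the guard
   starts out true, otherwise the false edge.  */
static int
loop1_pct (struct guard_probs p, bool initial_true)
{
  return initial_true ? p.true_pct : p.false_pct;
}
```

With the testcase values, `measure_guard (100, 3)` gives 96% for the true edge and 4% for the false edge, so `loop1_pct (p, false)` is 4. If loop1_prob always reads true_edge->probability, it would see the 96% side when initial_true is false, but I have not verified that against the code.
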
> gcc/ChangeLog:
>
> * tree-ssa-loop-split.cc (split_loop): Pass edges to
> fix_loop_bb_probability with consideration of initial_true,
> so that loop1 is always scaled by the edge probability
> corresponding to the branch that actually executes in it.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-prof/loop-split-4.c: New test.
>
> Signed-off-by: Xin Wang <[email protected]>
> ---
> gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 34 +++++++++++++++++++
> gcc/tree-ssa-loop-split.cc | 13 ++++++-
> 2 files changed, 46 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> new file mode 100644
> index 00000000000..7e0aa883276
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> @@ -0,0 +1,34 @@
> +/* PR tree-optimization/XXXXX */
> +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */
> +
> +volatile int sink;
> +
> +__attribute__((noinline)) int
> +helper (int a, int b)
> +{
> + return a + b;
> +}
> +
> +int
> +main (void)
> +{
> + int n = 100, t = 3, total = 0;
> + /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4 iterations,
> + empty else) and true for i=4..99 (96 iterations, calls helper).
> + split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving
> + initial_true = false. Loop1 (cold, 4 iterations) handles the false
> + case, loop2 (hot, 96 iterations) handles the true case.
> + Without the fix loop1 was scaled by true_edge->probability (96%),
> + inverting the counts. */
> + for (int i = 0; i < n; i++)
> + if (t < i)
> + total += helper (i, t);
> + sink = total;
> + return 0;
> +}
> +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1 "lsplit" } } */
> +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0 "lsplit" } } */
> +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96
> + iterations) count ~96. Without the fix the counts are inverted.
> + Check loop1 is a single digit and loop2 is 90+. */
> +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\], loop2 count 9\[0-9\]" "lsplit" } } */
> diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> index ba6cc45d7f0..8aa6275695b 100644
> --- a/gcc/tree-ssa-loop-split.cc
> +++ b/gcc/tree-ssa-loop-split.cc
> @@ -712,7 +712,18 @@ split_loop (class loop *loop1)
> (loop_preheader_edge (loop2)->src)->probability
> = loop1_prob.invert ();
>
> - fix_loop_bb_probability (loop1, loop2, true_edge, false_edge);
> + fix_loop_bb_probability (loop1, loop2,
> + initial_true ? true_edge : false_edge,
> + initial_true ? false_edge : true_edge);
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file,
> + ";; Split loop: initial_true %s, "
> + "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n",
> + initial_true ? "true" : "false",
> + (int64_t) loop1->header->count.to_gcov_type (),
> + (int64_t) loop2->header->count.to_gcov_type ());
> +
> /* If conditional we split on has reliable profilea nd both
> preconditionals of loop1 and loop2 are constant true, we can
> only redistribute the iteration counts to the split loops.
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)