On Sat, 9 May 2026, Xin Wang wrote:

> From: Xin Wang <[email protected]>
> 
> When split_loop does iteration space splitting, split_at_bb_p may
> swap the guard condition so that operand 0 is always the loop IV
> and operand 1 is the invariant.  For example, "t < i" (LT_EXPR)
> becomes "i > t" (GT_EXPR).  This can cause initial_true to be
> false, meaning loop1 handles iterations where the guard is false
> and loop2 handles iterations where the guard is true.
> 
> The function fix_loop_bb_probability scales loop1's body by
> true_edge->probability and loop2's body by its inverse.  But when
> initial_true is false, loop1's body executes the false edge path.
> Loop1 should be scaled by false_edge->probability instead.
> 
> This inconsistency is visible in the guard patching code a few
> lines below, which does swap force_true/force_false based on
> initial_true.  The profile scaling should apply the same logic.
> 
> The bug caused BB counts in the split loops to be swapped when
> initial_true is false: the loop body whose guard is forced false
> (loop1, executing fewer iterations) would get the higher profile
> count, and vice versa.

Does

        profile_probability loop1_prob
          = integer_onep (cond) ? profile_probability::always ()
                                : true_edge->probability;
                                 ^^^^^^^^^^

have the same problem?

> gcc/ChangeLog:
> 
>         * tree-ssa-loop-split.cc (split_loop): Pass edges to
>         fix_loop_bb_probability with consideration of initial_true,
>         so that loop1 is always scaled by the edge probability
>         corresponding to the branch that actually executes in it.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.dg/tree-prof/loop-split-4.c: New test.
> 
> Signed-off-by: Xin Wang <[email protected]>
> ---
>  gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c | 34 +++++++++++++++++++
>  gcc/tree-ssa-loop-split.cc                    | 13 ++++++-
>  2 files changed, 46 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c 
> b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> new file mode 100644
> index 00000000000..7e0aa883276
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/loop-split-4.c
> @@ -0,0 +1,34 @@
> +/* PR tree-optimization/XXXXX */
> +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */
> +
> +volatile int sink;
> +
> +__attribute__((noinline)) int
> +helper (int a, int b)
> +{
> +  return a + b;
> +}
> +
> +int
> +main (void)
> +{
> +  int n = 100, t = 3, total = 0;
> +  /* With t=3, n=100 the guard "t < i" is false for i=0..3 (4 iterations,
> +     empty else) and true for i=4..99 (96 iterations, calls helper).
> +     split_at_bb_p swaps "t < i" to "i > t" (GT_EXPR), giving
> +     initial_true = false.  Loop1 (cold, 4 iterations) handles the false
> +     case, loop2 (hot, 96 iterations) handles the true case.
> +     Without the fix loop1 was scaled by true_edge->probability (96%),
> +     inverting the counts.  */
> +  for (int i = 0; i < n; i++)
> +    if (t < i)
> +      total += helper (i, t);
> +  sink = total;
> +  return 0;
> +}
> +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Loop split" 1 "lsplit" 
> } } */
> +/* { dg-final-use-not-autofdo { scan-tree-dump-times "Invalid sum" 0 
> "lsplit" } } */
> +/* With the fix loop1 (cold, 4 iterations) count ~4, loop2 (hot, 96
> +   iterations) count ~96.  Without the fix the counts are inverted.
> +   Check loop1 is a single-digit and loop2 is 90+.  */
> +/* { dg-final-use-not-autofdo { scan-tree-dump "loop1 count \[0-9\], loop2 
> count 9\[0-9\]" "lsplit" } } */
> diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> index ba6cc45d7f0..8aa6275695b 100644
> --- a/gcc/tree-ssa-loop-split.cc
> +++ b/gcc/tree-ssa-loop-split.cc
> @@ -712,7 +712,18 @@ split_loop (class loop *loop1)
>               (loop_preheader_edge (loop2)->src)->probability
>                       = loop1_prob.invert ();
>  
> -     fix_loop_bb_probability (loop1, loop2, true_edge, false_edge);
> +     fix_loop_bb_probability (loop1, loop2,
> +                              initial_true ? true_edge : false_edge,
> +                              initial_true ? false_edge : true_edge);
> +
> +     if (dump_file && (dump_flags & TDF_DETAILS))
> +       fprintf (dump_file,
> +                ";; Split loop: initial_true %s, "
> +                "loop1 count %" PRId64 ", loop2 count %" PRId64 "\n",
> +                initial_true ? "true" : "false",
> +                (int64_t) loop1->header->count.to_gcov_type (),
> +                (int64_t) loop2->header->count.to_gcov_type ());
> +
>       /* If conditional we split on has reliable profilea nd both
>          preconditionals of loop1 and loop2 are constant true, we can
>          only redistribute the iteration counts to the split loops.
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to