Hi all, thanks for the feedback! We've sent a new version https://gcc.gnu.org/pipermail/gcc-patches/2026-March/710794.html.
We went with Richard's suggestion, using `tree_ssa_ifcombine_bb`. The compile-time differences between this and re-running ifcombine are negligible. However, this approach has a higher coverage. Ifcombine (re-run before 2nd sink) manages to optimize only the "noccmp0" function from "pr102793-1.c" in our testcases. Thanks, Konstantinos On Mon, Dec 8, 2025 at 12:37 PM Richard Biener <[email protected]> wrote: > On Fri, 5 Dec 2025, Konstantinos Eleftheriou wrote: > > > If we consider code like: > > > > if (bar1 == x) > > return foo(); > > if (bar2 != y) > > return foo(); > > return 0; > > > > We would like the ifcombine pass to convert this to: > > > > if (bar1 == x || bar2 != y) > > return foo(); > > return 0; > > > > The ifcombine pass can handle this transformation but it is ran very > early and > > it misses the opportunity because there are two seperate blocks for > foo(). > > > > This patch updates the tail-merging pass so that, after the block merge, > it > > attempts to combine conditions in the predecessors of the merged block > in its > > immediate dominator. The optimization is restricted to targets with > support > > for conditional comparisons. > > > > An XFAIL in gcc.dg/guality/pr54693-2.c now triggers only when `-flto` w/o > > `-fno-use-linker-plugin` is used. The generated code and debug info seems > > correct, so we have adjusted the clause. > > I addition to what Andrew said, please collect candidate blocks > in apply_clusters instead of doing the combining there. Like keep > a bitmap of BB indices. Then, in tail_merge_optimize perform > the if-combining in the if (nr_bbs_removed_total > 0) block, > also doing the undef-marking there on-demand. > > That said, I'd have expected you to call tree_ssa_ifcombine_bb, > simply by trying that on the immediate dominator of the merged > tail. Why does this simple approach not work? > > Thanks, > Richard. > > > PR tree-optimization/102793 > > > > gcc/ChangeLog: > > > > * tree-ssa-tail-merge.cc (INCLUDE_MEMORY): Added definition. > > (struct cond_combine_info): New struct. > > (phi_args_def_p): New function. > > (maybe_combine_conditions): New function. > > (apply_clusters): Added call to `maybe_combine_conditions`. > > (tail_merge_optimize): Added call to `mark_ssa_maybe_undefs`. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/guality/pr54693-2.c: Update 'x-fail' clause. > > * gcc.dg/tree-ssa/pr102793-1.c: New test. > > * gcc.dg/tree-ssa/pr102793-2.c: New test. > > > > Signed-off-by: Konstantinos Eleftheriou < > [email protected]> > > --- > > > > gcc/testsuite/gcc.dg/guality/pr54693-2.c | 2 +- > > gcc/testsuite/gcc.dg/tree-ssa/pr102793-1.c | 50 ++++ > > gcc/testsuite/gcc.dg/tree-ssa/pr102793-2.c | 51 ++++ > > gcc/tree-ssa-tail-merge.cc | 258 +++++++++++++++++++++ > > 4 files changed, 360 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102793-1.c > > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102793-2.c > > > > diff --git a/gcc/testsuite/gcc.dg/guality/pr54693-2.c > b/gcc/testsuite/gcc.dg/guality/pr54693-2.c > > index 229ef0efbea0..9d0080782410 100644 > > --- a/gcc/testsuite/gcc.dg/guality/pr54693-2.c > > +++ b/gcc/testsuite/gcc.dg/guality/pr54693-2.c > > @@ -18,7 +18,7 @@ foo (int x, int y, int z) > > while (x > 3 && y > 3 && z > 3) > > { /* { dg-final { gdb-test .+2 "i" "v + 1" } } */ > > /* { dg-final { gdb-test .+1 "x" "10 - i" { xfail { > aarch64*-*-* && { any-opts "-fno-fat-lto-objects" } } } } } */ > > - bar (i); /* { dg-final { gdb-test . "y" "20 - 2 * i" { > xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" "-Os" } } } } } */ > > + bar (i); /* { dg-final { gdb-test . "y" "20 - 2 * i" { > xfail { aarch64*-*-* && { { any-opts "-flto" } && { no-opts > "-fno-use-linker-plugin" } } } } } } */ > > /* { dg-final { gdb-test .-1 "z" "30 - 3 * i" { xfail { > aarch64*-*-* && { any-opts "-fno-fat-lto-objects" "-Os" } } } } } */ > > i++, x--, y -= 2, z -= 3; > > } > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102793-1.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr102793-1.c > > new file mode 100644 > > index 000000000000..fc02c92059ba > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102793-1.c > > @@ -0,0 +1,50 @@ > > +/* { dg-do compile } */ > > +/* { dg-skip-if "requires ccmp support" { ! { aarch64*-*-* || apxf } } > } */ > > +/* { dg-options "-O3 -fdump-tree-pre" } */ > > + > > +typedef unsigned long uint64_t; > > + > > +int foo(void); > > + > > +int ccmp(uint64_t* s1, uint64_t* s2) > > +{ > > + uint64_t d1, d2, bar; > > + d1 = *s1++; > > + d2 = *s2++; > > + bar = (d1 ^ d2) & 0xabcd; > > + if (bar == 0 || d1 != d2) > > + return foo(); > > + return 0; > > +} > > + > > +int noccmp0(uint64_t* s1, uint64_t* s2) > > +{ > > + uint64_t d1, d2, bar; > > + > > + d1 = *s1++; > > + d2 = *s2++; > > + bar = (d1 ^ d2) & 0xabcd; > > + if (bar == 0) > > + return foo(); > > + if (d1 != d2) > > + return foo(); > > + return 0; > > +} > > + > > +int noccmp1(uint64_t* s1, uint64_t* s2) > > +{ > > + uint64_t d1, d2, d3, d4, bar; > > + d1 = *s1++; > > + d2 = *s2++; > > + d3 = *s1++; > > + d4 = *s2++; > > + bar = (d1 ^ d2) & 0xabcd; > > + if (bar == 0) > > + return foo(); > > + if (d3 != d4) > > + return foo(); > > + return 0; > > +} > > + > > +/* Check for condition assignments for noccmp0 and noccmp1. */ > > +/* { dg-final { scan-tree-dump-times {_\d+ = bar_\d+ == 0;\n _\d+ = > d\d+_\d+ != d\d+_\d+;} 2 "pre" } } */ > > \ No newline at end of file > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102793-2.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr102793-2.c > > new file mode 100644 > > index 000000000000..6e8674781563 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102793-2.c > > @@ -0,0 +1,51 @@ > > +/* { dg-do compile } */ > > +/* { dg-skip-if "requires ccmp support" { ! { aarch64*-*-* || apxf } } > } */ > > +/* { dg-options "-O3 -fdump-tree-pre" } */ > > + > > +typedef unsigned long uint64_t; > > + > > +int foo(void); > > + > > +uint64_t noccmp0(uint64_t* s1, uint64_t* s2) > > +{ > > + uint64_t d1, d2, d3, d4, bar; > > + d1 = *s1++; > > + d2 = *s2++; > > + d3 = *s1++; > > + d4 = *s2++; > > + bar = (d1 ^ d2) & 0xabcd; > > + if (bar == 0) > > + return foo(); > > + if (d3 != d4) > > + d3++; > > + else > > + return foo(); > > + return d3; > > +} > > + > > +uint64_t noccmp1(uint64_t* s1, uint64_t* s2) > > +{ > > + uint64_t d1, d2, d3, d4, bar; > > + d1 = *s1++; > > + d2 = *s2++; > > + d3 = *s1++; > > + d4 = *s2++; > > + bar = (d1 ^ d2) & 0xabcd; > > + if (bar == 0) > > + d3++; > > + else > > + return foo(); > > + if (d3 > d4) > > + d3++; > > + else if (d1 != d2) > > + return foo (); > > + d3 = d3 + d4 + 1; > > + return d3; > > +} > > + > > +/* Check for condition assignments in the case that the transformation > > + is applied. > > + The transformation should not be applied on noccmp1, where the foo > call is > > + on the false branch of the first condition. */ > > +/* { dg-final { scan-tree-dump-times {_\d+ = bar_\d+ == 0;\n _\d+ = > d\d+_\d+ == d\d+_\d+;} 1 "pre" } } */ > > +/* { dg-final { scan-tree-dump-times {if \(bar_\d+ == 0\)} 1 "pre" } } > */ > > \ No newline at end of file > > diff --git a/gcc/tree-ssa-tail-merge.cc b/gcc/tree-ssa-tail-merge.cc > > index 857e91c206b3..f1f81646a3a4 100644 > > --- a/gcc/tree-ssa-tail-merge.cc > > +++ b/gcc/tree-ssa-tail-merge.cc > > @@ -189,6 +189,7 @@ along with GCC; see the file COPYING3. If not see > > #include "system.h" > > #include "coretypes.h" > > #include "backend.h" > > +#include "target.h" > > #include "tree.h" > > #include "gimple.h" > > #include "cfghooks.h" > > @@ -202,10 +203,17 @@ along with GCC; see the file COPYING3. If not see > > #include "tree-cfg.h" > > #include "tree-into-ssa.h" > > #include "tree-ssa-sccvn.h" > > +#include "tree-ssa-ifcombine.h" > > #include "cfgloop.h" > > #include "tree-eh.h" > > #include "tree-cfgcleanup.h" > > > > +#define INCLUDE_MEMORY > > +#include <memory> > > +#include "gimple-pretty-print.h" > > +#include "tree-ssa.h" > > +#include "algorithm" > > + > > const int ignore_edge_flags = EDGE_DFS_BACK | EDGE_EXECUTABLE; > > > > /* Describes a group of bbs with the same successors. The successor > bbs are > > @@ -277,6 +285,18 @@ struct aux_bb_info > > basic_block dep_bb; > > }; > > > > +/* Info for maybe_combine_conditions. */ > > + > > +struct cond_combine_info > > +{ > > + /* Conditional expression that leads to the merged block. */ > > + gcond *bb_cond_stmt; > > + /* Block that contains the conditional expression. */ > > + basic_block cond_bb; > > + /* True if the merged block resides in the condition's else branch. > */ > > + bool in_else_branch; > > +}; > > + > > /* Macros to access the fields of struct aux_bb_info. */ > > > > #define BB_SIZE(bb) (((struct aux_bb_info *)bb->aux)->size) > > @@ -1707,6 +1727,238 @@ replace_block_by (basic_block bb1, basic_block > bb2) > > > > static bitmap update_bbs; > > > > +static bool phi_args_def_p (basic_block bb, sbitmap visited) > > +{ > > + if (bitmap_bit_p (visited, bb->index)) > > + return true; > > + > > + bitmap_set_bit (visited, bb->index); > > + > > + edge e; > > + edge_iterator ei; > > + FOR_EACH_EDGE (e, ei, bb->succs) > > + { > > + basic_block succ = e->dest; > > + gphi_iterator gpi; > > + > > + for (gpi = gsi_start_phis (succ); !gsi_end_p (gpi); gsi_next > (&gpi)) > > + { > > + gphi *phi = gpi.phi (); > > + use_operand_p use; > > + ssa_op_iter iter; > > + > > + FOR_EACH_PHI_ARG (use, phi, iter, SSA_OP_USE) > > + { > > + tree var = USE_FROM_PTR (use); > > + > > + if (TREE_CODE (var) != SSA_NAME) > > + continue; > > + > > + if (ssa_name_maybe_undef_p (var)) > > + return false; > > + } > > + } > > + > > + if (!phi_args_def_p (succ, visited)) > > + return false; > > + } > > + > > + return true; > > +} > > + > > +/* Try to combine conditions with edges that lead to BB in its immediate > > + dominator. Nothing is done for targets that don't support > conditional > > + comparisons. */ > > + > > +static void > > +maybe_combine_conditions (basic_block bb) > > +{ > > + if (!targetm.have_ccmp ()) > > + return; > > + > > + basic_block imm_dominator = get_immediate_dominator (CDI_DOMINATORS, > bb); > > + > > + edge e; > > + edge_iterator ei; > > + bool bb_in_else_branch = false; > > + auto_vec<struct cond_combine_info> comb_conds (2); > > + > > + /* Check that the immediate dominator is found and has two successors > and > > + the merged block has two predecessors. */ > > + if (!imm_dominator || imm_dominator->succs->length () != 2 > > + || bb->preds->length () != 2) > > + return; > > + > > + /* Find conditions in if-statements that lead to bb. */ > > + int index = 0; > > + int imm_dom_index = -1; > > + unsigned int non_imm_dom_index; > > + FOR_EACH_EDGE (e, ei, bb->preds) > > + { > > + basic_block then_tmp = NULL; > > + basic_block else_tmp = NULL; > > + bb_in_else_branch = recognize_if_then_else (e->src, &then_tmp, > &bb); > > + if (recognize_if_then_else (e->src, &bb, &else_tmp) > > + || bb_in_else_branch) > > + { > > + gcond *cond = safe_dyn_cast <gcond *> (*gsi_last_bb (e->src)); > > + if (TREE_CODE_CLASS (gimple_cond_code (cond)) != tcc_comparison > > + || index == 2) > > + return; > > + > > + struct cond_combine_info cci; > > + cci.bb_cond_stmt = cond; > > + cci.cond_bb = e->src; > > + cci.in_else_branch = bb_in_else_branch; > > + comb_conds.quick_push (cci); > > + > > + if (cci.cond_bb == imm_dominator) > > + imm_dom_index = index; > > + else > > + non_imm_dom_index = index; > > + index++; > > + } > > + } > > + > > + /* Abort if the immediate dominator is not found in the blocks > containing > > + the conditions or if only one block with a condition is found. */ > > + if (imm_dom_index == -1 || comb_conds.length () == 1) > > + return; > > + > > + /* Check if the non-immediate dominator block contains other non-debug > > + statements than the condition and abort in that case. */ > > + unsigned int stmt_count = 0; > > + gimple_stmt_iterator gsi > > + = gsi_start_bb (comb_conds[non_imm_dom_index].cond_bb); > > + for (; !gsi_end_p (gsi) && stmt_count < 2; gsi_next (&gsi)) > > + { > > + if (!is_gimple_debug (gsi_stmt (gsi))) > > + stmt_count++; > > + } > > + > > + bool other_bb_has_only_cond = stmt_count < 2; > > + if (!other_bb_has_only_cond) > > + return; > > + > > + basic_block imm_dom_block = comb_conds[imm_dom_index].cond_bb; > > + basic_block non_imm_dom_block = comb_conds[non_imm_dom_index].cond_bb; > > + > > + /* Ensure that the other block is a successor of the immediate > dominator, > > + that phi arguments from both condition blocks to bb are > > + the same and that phi arguments found starting from the other block > > + are defined on all paths. */ > > + auto_sbitmap visited (last_basic_block_for_fn (cfun)); > > + bitmap_clear (visited); > > + if ((!find_edge (imm_dom_block, non_imm_dom_block)) > > + || !same_phi_args_p (imm_dom_block, non_imm_dom_block, bb) > > + || !phi_args_def_p (non_imm_dom_block, visited)) > > + return; > > + > > + /* Convert condition statements to tree nodes. */ > > + gcond *imm_dom_cond = comb_conds[imm_dom_index].bb_cond_stmt; > > + gcond *non_imm_dom_cond = comb_conds[non_imm_dom_index].bb_cond_stmt; > > + tree bb_cond1_lhs = gimple_cond_lhs (imm_dom_cond); > > + tree bb_cond1_rhs = gimple_cond_rhs (imm_dom_cond); > > + tree bb_cond2_lhs = gimple_cond_lhs (non_imm_dom_cond); > > + tree bb_cond2_rhs = gimple_cond_rhs (non_imm_dom_cond); > > + > > + tree cond1_expr = build2 (gimple_cond_code (imm_dom_cond), > > + TREE_TYPE (bb_cond1_lhs), > > + bb_cond1_lhs, bb_cond1_rhs); > > + > > + tree cond2_expr = build2 (gimple_cond_code (non_imm_dom_cond), > > + TREE_TYPE (bb_cond2_lhs), > > + bb_cond2_lhs, bb_cond2_rhs); > > + > > + /* Fix the conditional expressions, so that the combined expression is > > + correct when the remaining block is reached through the else > branch. */ > > + tree_code bool_op = BIT_IOR_EXPR; > > + bool non_imm_dom_in_else_branch > > + = comb_conds[non_imm_dom_index].in_else_branch; > > + if (comb_conds[imm_dom_index].in_else_branch) > > + bool_op = BIT_AND_EXPR; > > + > > + if ((!non_imm_dom_in_else_branch && bool_op == BIT_AND_EXPR) > > + || (non_imm_dom_in_else_branch && bool_op == BIT_IOR_EXPR)) > > + { > > + cond2_expr = invert_truthvalue (cond2_expr); > > + if (FLOAT_TYPE_P (TREE_TYPE (cond2_expr))) > > + return; > > + } > > + > > + gsi = gsi_last_bb (imm_dominator); > > + gimple *last_stmt = *gsi; > > + > > + /* Create combined condition. */ > > + gimple_seq seq = NULL; > > + tree first_cond_lhs = make_ssa_name (boolean_type_node); > > + gimple_seq_add_stmt (&seq, gimple_build_assign (first_cond_lhs, > cond1_expr)); > > + tree sec_cond_lhs = make_ssa_name (boolean_type_node); > > + gassign *sec_cond_assign = gimple_build_assign (sec_cond_lhs, > cond2_expr); > > + gimple_seq_add_stmt (&seq, sec_cond_assign); > > + tree comb_cond_lhs = make_ssa_name (boolean_type_node); > > + gimple_seq_add_stmt (&seq, gimple_build_assign (comb_cond_lhs, > > + bool_op, > > + first_cond_lhs, > > + sec_cond_lhs)); > > + > > + /* Update immediate dominator's condition with the combined one. */ > > + gimple_set_location (sec_cond_assign, gimple_location > (non_imm_dom_cond)); > > + gsi_insert_seq_before (&gsi, seq, GSI_NEW_STMT); > > + gcond *last_stmt_cond = safe_dyn_cast <gcond *>(last_stmt); > > + > > + gimple_cond_set_condition (last_stmt_cond, EQ_EXPR, comb_cond_lhs, > > + build_int_cst (TREE_TYPE (comb_cond_lhs), 1)); > > + update_stmt (last_stmt_cond); > > + > > + /* Get the edges from the immediate dominator. */ > > + edge e_succ_non_imm = nullptr; > > + edge e_succ_other = nullptr; > > + FOR_EACH_EDGE (e, ei, imm_dominator->succs) > > + { > > + if (e->dest == non_imm_dom_block) > > + e_succ_non_imm = e; > > + else > > + e_succ_other = e; > > + } > > + > > + /* We're going to remove the block that contained the condition that > is > > + now merged in the immediate dominator, so redirect the edge that > was > > + pointing to this block. */ > > + edge e_red = nullptr; > > + FOR_EACH_EDGE (e, ei, e_succ_non_imm->dest->succs) > > + { > > + if (e->dest == bb) > > + continue; > > + e_red = redirect_edge_and_branch (e_succ_non_imm, e->dest); > > + } > > + > > + /* Update edge probabilities. */ > > + profile_probability new_prob; > > + profile_count src_count = e_succ_other->src->count; > > + profile_count dest_count = e_succ_other->dest->count; > > + if (src_count.nonzero_p () > > + && dest_count.nonzero_p ()) > > + new_prob = dest_count.probability_in (src_count); > > + else > > + new_prob = profile_probability::always ().apply_scale (1, 2); > > + > > + e_succ_other->probability = new_prob; > > + e_red->probability = new_prob.invert (); > > + > > + /* Remove the block. */ > > + mark_basic_block_deleted (non_imm_dom_block); > > + same_succ_flush_bb (non_imm_dom_block); > > + delete_basic_block (non_imm_dom_block); > > + > > + if (dump_file) > > + fprintf (dump_file, "Conditional statements %s and %s combined in > BB %d\n", > > + print_generic_expr_to_str (cond1_expr), > > + print_generic_expr_to_str (cond2_expr), > > + imm_dominator->index); > > + > > +} > > + > > /* For each cluster in all_clusters, merge all cluster->bbs. Returns > > number of bbs removed. */ > > > > @@ -1735,6 +1987,9 @@ apply_clusters (void) > > bitmap_clear_bit (update_bbs, bb1->index); > > > > replace_block_by (bb1, bb2); > > + > > + maybe_combine_conditions (bb2); > > + > > nr_bbs_removed++; > > } > > } > > @@ -1826,6 +2081,9 @@ tail_merge_optimize (bool need_crit_edge_split) > > } > > init_worklist (); > > > > + /* This is needed by maybe_combine_conditions. */ > > + mark_ssa_maybe_undefs (); > > + > > while (!worklist.is_empty ()) > > { > > if (!loop_entered) > > > > -- > Richard Biener <[email protected]> > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG > Nuernberg) >
