Artemiy Volkov <[email protected]> writes:
> On Mon, Feb 23, 2026 at 09:18:01PM +0000, Richard Sandiford wrote:
>> Artemiy Volkov <[email protected]> writes:
>> > On Wed, Feb 04, 2026 at 12:32:11AM +0000, Richard Sandiford wrote:
>> >> Artemiy Volkov <[email protected]> writes:
>> >> > [...]
>> >> > To address this, during initial traversal of a new EBB, we create an
>> >> > artificial live-out use at the bottom of the current BB when (a) a
>> >> > variable is live-in at the entry to the EBB, (b) it hasn't yet been
>> >> > redefined in the current EBB, and (c) it is redefined in the next BB.
>> >> > This live-out use serves as a barrier that restricts movement of
>> >> > new defs
>> >> > from later BBs having a new value to earlier ones still having the old
>> >> > live-in value.  In the diagram, this live-out use will be created
>> >> > for r110
>> >> > at the end of bb10.
>> >> >
>> >> > Since the def that this newly created live-out use sees has to be
>> >> > dominating (otherwise whenever the last use occurs after the
>> >> > current EBB,
>> >> > the movement will be restricted to an empty range), we create a
>> >> > new def at
>> >> > the beginning of the EBB.  For this, we use the create_reg_use ()
>> >> > function
>> >> > that will also create a degenerate PHI for this purpose when needed.
>> >> >
>> >> > One problem that we run into with this approach is that during RTL-SSA
>> >> > construction single-input PHI nodes are sometimes eliminated as
>> >> > unnecessary, which causes the last dominating def to be reverted to an
>> >> > earlier EBB and the move range restriction failing as before.  To remedy
>> >> > this, we mark PHIs that we create as 'persistent' and do not eliminate
>> >> > them where previously they would be replaced by their single input.  (To
>> >> > remain aligned with the design goal of minimizing memory overhead that
>> >> > RTL-SSA incurs, the m_has_been_superceded flag has been repurposed to
>> >> > track this in the phi_info class.)
>> >> >
>> >> > Bootstrapped and regtested on aarch64-linux-gnu; no performance
>> >> > regressions across SPEC2017.  New testcase added as provided by Alex
>> >> > Coplan in the BZ, many thanks for that.
>> >>
>> >> Thanks for looking at this and for the nice explanation.
>> >> I agree that the underlying problem is the lack of a live-out use.
>> >> However, rather than add a separate mechanism for this case, I think we
>> >> should instead force a phi to be created from the outset.  That will
>> >> make sure that EBBs with a single dominating live-in definition
>> >> (as here) are handled similarly to EBBs with potentially multiple
>> >> incoming live-in definitions.
>> >> 
>> >> Also, rather than add a flag to the phi, I think we can test whether
>> >> the next definition is in the same EBB.  This should also ensure that we
>> >> handle EBBs that start out with potentially multiple incoming live-in
>> >> definitions but eventually simplify to a single value.
>> >
>> > Hi Richard, apologies for the slow reply.
>> >
>> > I obviously agree that your approach is the right one to take.  I can
>> > confirm that your patch fixes the original testcase; what is still missing
>> > for it to pass bootstrap/regtest is handling of chains of degenerate PHIs
>> > which didn't occur before.  Where before your patch phi->input_value (0)
>> > would always point to a "real" definition (either an insn or a
>> > non-degenerate PHI), now a CFG can be conceived where you'd have to
>> > traverse a chain of (non-simplified) degenerate PHIs to get to that
>> > definition.
>> 
>> Argh.  I should have predicted that, sorry.
>> 
>> > I've made two main fixes on top of your patch to cater to this new
>> > possibility:  (1) in function_info::live_out_value (), when one degenerate
>> > PHI is feeding into another, we need to use the former as the input for
>> > the latter, as otherwise we'd be setting the latter's def to NULL; (2) in
>> > look_through_degenerate_phi (), we need to walk the PHI chain as explained
>> > before (this also necessitates a small assertion change in
>> > function_info::make_use_available ()).  There will be a performance
>> > penalty associated with this walk in the worst case, but I believe that
>> > this should not be a problem on real-life inputs.
>> 
>> I think look_through_degenerate_phi really does need to be O(1),
>> otherwise we risk quadratic complexity.
>
> Right, perhaps this needed to be a separate get_ultimate_def () function
> just for the needs of that one checking assert in
> function_info::make_use_available () ... 
>
>> 
>> So how about this compromise?  It goes back to your original approach
>> of detecting the problematic case while building the EBB.  But rather
>> than add the phis on-the-fly in record_block_live_out, the patch adds
>> them during add_phi_nodes.
>> 
>> This preserves the property that I was trying to get from the proposal
>> above, namely that we would add the same live-out uses as we would for
>> a normal phi.
>> 
>> For example, it means that if a register is live out from two blocks
>> in the EBB, we'd add live-out uses for both blocks.  It also means that
>> if:
>> 
>> - an EBB contains three blocks
>> - the register is live out of the first block and dies at that point
>> - the register is not referenced in the second block
>> - the register is defined in the third block
>> 
>> the first block would get the live-out use, rather than the second.
>> It would still be possible to move the definition up to the second
>> block, if that proves to be useful.
>
> I think the key part here that tripped me up were calls to
> replace_phi () during the dominator walk.  With those out of the way, it's
> certainly much easier to reason about the order in which things
> happen, and both approaches (add_phi_nodes ()- vs. place_phis ()-time)
> can probably be made to work.
>
>> 
>> Bootstrapped & regression-tested on x86_64-linux-gnu and
>> powerpc64le-linux-gnu.  Could you give it a spin on aarch64 too?
>
> No issues on aarch64-linux-gnu either, so AFAIC this should just go in
> as-is.

OK, great!  Thanks for the testing.

OK to install?  I've reattached the patch just in case.

Richard


>From 8fce3764039073296af11cd0dce9b482d6c844e4 Mon Sep 17 00:00:00 2001
From: Richard Sandiford <[email protected]>
Date: Mon, 23 Feb 2026 16:14:10 +0000
Subject: [PATCH] rtl-ssa: Ensure live-out uses before redefinitions [PR123786]

This patch fixes cases in which:

(1) a register is live in to an EBB;
(2) the register is live out of at least one BB in the EBB; and
(3) the register is redefined by a later BB in the same EBB.

We were supposed to create live-out uses for (2), so that the redefinition
in (3) cannot be moved up into the live range of (1).

The patch does this by collecting all definitions in second and
subsequence BBs of an EBB.  It then creates degenerate phis for those
registers that do not naturally need phis.  For speed and simplicity,
the patch does not check for (2).  If a register is live in to the EBB,
then it must be used somewhere, either in the EBB itself or in a
successor outside of the EBB.  A degenerate phi would eventually
be needed in either case.

This requires moving append_bb earlier, so that add_phi_nodes can
iterate over the BBs in an EBB.

live_out_value contained an on-the-fly optimisation to remove redundant
phis.  That was a mistake.  live_out_value can be called multiple times
for the same quantity.  Replacing a phi on-the-fly messes up bookkeeping
for second and subsequent calls.

The live_out_value optimisation was mostly geared towards memory.
As an experiment, I added an assert for when the optimisation applied
to registers.  It only fired once in an x86_64-linux-gnu bootstrap &
regression test, in gcc.dg/tree-prof/split-1.c.  That's a very poor
(but unsurprising) return.  And the optimisation will still be done
eventually anyway, during the phi simplification phase.  Doing it on
the fly was just supposed to allow the phi's memory to be reused.

The patch therefore moves the optimisation into add_phi_nodes and
restricts it to memory (for which it does make a difference).

gcc/
	PR rtl-optimization/123786
	* rtl-ssa/functions.h (function_info::live_out_value): Delete.
	(function_info::create_degenerate_phi): New overload.
	* rtl-ssa/blocks.cc (all_uses_are_live_out_uses): Delete.
	(function_info::live_out_value): Likewise.
	(function_info::replace_phi): Keep live-out uses if they are followed
	by a definition in the same EBB.
	(function_info::create_degenerate_phi): New overload, extracted
	from create_reg_use.
	(function_info::add_phi_nodes): Ensure that there is a phi for
	every live input that is redefined by a second or subsequent
	block in the EBB.  Record that such phis need live-out uses.
	(function_info::record_block_live_out): Use look_through_degenerate_phi
	rather than live_out_value when setting phi inputs.  Remove use of
	live_out_value for live-out uses.  Inline the old handling of
	bb_mem_live_out.
	(function_info::start_block): Move append_bb call to...
	(function_info::create_ebbs): ...here.
	* rtl-ssa/insns.cc (function_info::create_reg_use): Use the new
	create_degenerate_phi overload.

gcc/testsuite/
	PR rtl-optimization/123786
	* gcc.target/aarch64/pr123786.c: New test.

Co-authored-by: Artemiy Volkov <[email protected]>
---
 gcc/rtl-ssa/blocks.cc                       | 99 ++++++++++++---------
 gcc/rtl-ssa/functions.h                     |  2 +-
 gcc/rtl-ssa/insns.cc                        | 14 ++-
 gcc/testsuite/gcc.target/aarch64/pr123786.c | 38 ++++++++
 4 files changed, 101 insertions(+), 52 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr123786.c

diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc
index d445fcc2cf2..21ea9aa0452 100644
--- a/gcc/rtl-ssa/blocks.cc
+++ b/gcc/rtl-ssa/blocks.cc
@@ -327,38 +327,6 @@ function_info::add_live_out_use (bb_info *bb, set_info *def)
   add_use (use);
 }
 
-// Return true if all nondebug uses of DEF are live-out uses.
-static bool
-all_uses_are_live_out_uses (set_info *def)
-{
-  for (use_info *use : def->all_uses ())
-    if (!use->is_in_debug_insn () && !use->is_live_out_use ())
-      return false;
-  return true;
-}
-
-// SET, if nonnull, is a definition of something that is live out from BB.
-// Return the live-out value itself.
-set_info *
-function_info::live_out_value (bb_info *bb, set_info *set)
-{
-  // Degenerate phis only exist to provide a definition for uses in the
-  // same EBB.  The live-out value is the same as the live-in value.
-  if (auto *phi = safe_dyn_cast<phi_info *> (set))
-    if (phi->is_degenerate ())
-      {
-	set = phi->input_value (0);
-
-	// Remove the phi if it turned out to be useless.  This is
-	// mainly useful for memory, because we don't know ahead of time
-	// whether a block will use memory or not.
-	if (bb == bb->ebb ()->last_bb () && all_uses_are_live_out_uses (phi))
-	  replace_phi (phi, set);
-      }
-
-  return set;
-}
-
 // Make USE's definition available at USE, if it isn't already.  Assume that
 // the caller has properly used make_use_available to check that this is
 // possible.
@@ -454,7 +422,11 @@ function_info::replace_phi (phi_info *phi, set_info *new_value)
 
   if (new_value)
     for (use_info *use : phi->nondebug_insn_uses ())
-      if (!use->is_live_out_use ())
+      // Live-out uses act as a movement barrier for later definitions
+      // in the same EBB.
+      if (!use->is_live_out_use ()
+	  || (phi->next_def ()
+	      && phi->next_def ()->ebb () == phi->ebb ()))
 	{
 	  // We need to keep the phi around for its local uses.
 	  // Turn it into a degenerate phi, if it isn't already.
@@ -543,6 +515,17 @@ function_info::create_phi (ebb_info *ebb, resource_info resource,
   return phi;
 }
 
+// Called while building an extended basic block's SSA form using BI.
+// Create and return a degenerate phi whose single input is VALUE.
+phi_info *
+function_info::create_degenerate_phi (build_info &bi, set_info *value)
+{
+  access_info *inputs[] = { look_through_degenerate_phi (value) };
+  auto *phi = create_phi (bi.current_ebb, value->resource (), inputs, 1);
+  bi.record_reg_def (phi);
+  return phi;
+}
+
 // Create and return a degenerate phi for EBB whose input comes from DEF.
 // This is used in cases where DEF is known to be available on entry to
 // EBB but was not previously used within it.  If DEF is for a register,
@@ -822,6 +805,7 @@ void
 function_info::add_phi_nodes (build_info &bi)
 {
   ebb_info *ebb = bi.current_ebb;
+  bb_info *bb = ebb->first_bb ();
   basic_block cfg_bb = ebb->first_bb ()->cfg_bb ();
 
   // Create the register phis for this EBB.
@@ -846,6 +830,26 @@ function_info::add_phi_nodes (build_info &bi)
 
   bitmap_copy (bi.ebb_def_regs, &phis.regs);
 
+  if (bb->next_bb ())
+    {
+      // If a live input is redefined by later blocks in the same EBB, we need
+      // to add live-out uses to prevent the later definition from being moved
+      // into earlier blocks.  Create degenerate phis that can be used for
+      // that purpose, if phis are not naturally needed.
+      auto_bitmap extra_regs;
+      for (auto *bb2 : ebb->bbs ())
+	if (bb2 != bb)
+	  bitmap_ior_into (extra_regs, &DF_LR_BB_INFO (bb2->cfg_bb ())->def);
+      bitmap_and_compl_into (extra_regs, &phis.regs);
+      EXECUTE_IF_AND_IN_BITMAP (extra_regs, DF_LR_IN (cfg_bb), 0, regno, in_bi)
+	if (set_info *value = bi.current_reg_value (regno))
+	  {
+	    create_degenerate_phi (bi, value);
+	    // Record that live-out uses are needed for this phi.
+	    bitmap_set_bit (bi.ebb_def_regs, regno);
+	  }
+    }
+
   // Collect the live-in memory definitions and record whether they're
   // all the same.
   m_temp_defs.reserve (num_preds);
@@ -1021,8 +1025,12 @@ function_info::record_block_live_out (build_info &bi)
       bitmap_iterator out_bi;
       EXECUTE_IF_SET_IN_BITMAP (&phis.regs, 0, regno, out_bi)
 	{
-	  phis.inputs[input_i]
-	    = live_out_value (bb, bi.current_reg_value (regno));
+	  set_info *value = bi.current_reg_value (regno);
+	  // Degenerate phis only exist to provide a definition for uses in the
+	  // same EBB.  The live-out value is the same as the live-in value.
+	  if (value)
+	    value = look_through_degenerate_phi (value);
+	  phis.inputs[input_i] = value;
 	  input_i += 1;
 	}
     }
@@ -1039,7 +1047,7 @@ function_info::record_block_live_out (build_info &bi)
       bitmap_iterator out_bi;
       EXECUTE_IF_AND_IN_BITMAP (bi.ebb_def_regs, live_out, 0, regno, out_bi)
 	{
-	  set_info *value = live_out_value (bb, bi.current_reg_value (regno));
+	  set_info *value = bi.current_reg_value (regno);
 	  if (value && value->ebb () == ebb)
 	    add_live_out_use (bb, value);
 	}
@@ -1059,8 +1067,15 @@ function_info::record_block_live_out (build_info &bi)
       }
 
   // Record the live-out memory value.
-  bi.bb_mem_live_out[cfg_bb->index]
-    = live_out_value (bb, bi.current_mem_value ());
+  set_info *mem_value = bi.current_mem_value ();
+  if (auto *phi = safe_dyn_cast <phi_info *> (mem_value))
+    if (phi->is_degenerate ())
+      {
+	mem_value = phi->input_value (0);
+	if (bb == ebb->last_bb () && !phi->has_nondebug_uses ())
+	  replace_phi (phi, mem_value);
+      }
+  bi.bb_mem_live_out[cfg_bb->index] = mem_value;
 }
 
 // Add BB and its contents to the SSA information.
@@ -1079,9 +1094,6 @@ function_info::start_block (build_info &bi, bb_info *bb)
   // Record the start of this block's definitions in the definitions stack.
   bi.old_def_stack_limit.safe_push (bi.def_stack.length ());
 
-  // Add the block itself.
-  append_bb (bb);
-
   // If the block starts an EBB, create the phi insn.  This insn should exist
   // for all EBBs, even if they don't (yet) need phis.
   if (bb == ebb->first_bb ())
@@ -1299,7 +1311,10 @@ function_info::create_ebbs (build_info &bi)
 	// Create the EBB itself.
 	auto *ebb = allocate<ebb_info> (bbs[0], bbs.last ());
 	for (bb_info *bb : bbs)
-	  bb->set_ebb (ebb);
+	  {
+	    bb->set_ebb (ebb);
+	    append_bb (bb);
+	  }
 	bbs.truncate (0);
       }
 
diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h
index 63901a660da..9417046230e 100644
--- a/gcc/rtl-ssa/functions.h
+++ b/gcc/rtl-ssa/functions.h
@@ -308,7 +308,6 @@ private:
   void add_reg_unused_notes (insn_info *);
 
   void add_live_out_use (bb_info *, set_info *);
-  set_info *live_out_value (bb_info *, set_info *);
   void commit_make_use_available (use_info *);
 
   void append_phi (ebb_info *, phi_info *);
@@ -336,6 +335,7 @@ private:
   void create_ebbs (build_info &);
   void add_entry_block_defs (build_info &);
   void calculate_ebb_live_in_for_debug (build_info &);
+  phi_info *create_degenerate_phi (build_info &, set_info *);
   void add_phi_nodes (build_info &);
   void add_artificial_accesses (build_info &, df_ref_flags);
   void add_block_contents (build_info &);
diff --git a/gcc/rtl-ssa/insns.cc b/gcc/rtl-ssa/insns.cc
index 317a5809b18..b42acb9704b 100644
--- a/gcc/rtl-ssa/insns.cc
+++ b/gcc/rtl-ssa/insns.cc
@@ -482,15 +482,11 @@ function_info::create_reg_use (build_info &bi, insn_info *insn,
       if (insn->is_debug_insn ())
 	value = look_through_degenerate_phi (value);
       else if (bitmap_bit_p (bi.potential_phi_regs, resource.regno))
-	{
-	  // VALUE is defined by a previous EBB and RESOURCE has multiple
-	  // definitions.  Create a degenerate phi in the current EBB
-	  // so that all definitions and uses follow a linear RPO view;
-	  // see rtl.texi for details.
-	  access_info *inputs[] = { look_through_degenerate_phi (value) };
-	  value = create_phi (bi.current_ebb, value->resource (), inputs, 1);
-	  bi.record_reg_def (value);
-	}
+	// VALUE is defined by a previous EBB and RESOURCE has multiple
+	// definitions.  Create a degenerate phi in the current EBB
+	// so that all definitions and uses follow a linear RPO view;
+	// see rtl.texi for details.
+	value = create_degenerate_phi (bi, value);
     }
   auto *use = allocate<use_info> (insn, resource, value);
   add_use (use);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr123786.c b/gcc/testsuite/gcc.target/aarch64/pr123786.c
new file mode 100644
index 00000000000..1b59f35949c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr123786.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mtune=generic-armv9-a" } */
+
+#include <arm_neon.h>
+
+int g_36[3];
+short g_s;
+signed char g_82, g_179, g_x, g_101;
+
+__attribute__((noipa))
+void check(void)
+{
+  if (g_36[0] != 1)
+    __builtin_abort ();
+}
+
+int main(void)
+{
+BS_LABEL_3:
+  short l_65 = g_s;
+  if (g_x) goto BS_LABEL_6;
+  for (; g_101; g_101 = 5)
+  {
+  BS_LABEL_6:
+  }
+  for (; g_82 < 1; g_82++)
+  {
+    switch (vqadds_u32 (0, 0))
+      {
+        case 4: goto BS_LABEL_3;
+        case 9: goto BS_LABEL_3;
+      }
+    short si1 = (l_65 &= 1) || (g_179 &= 0) != 6;
+    g_36[0] = g_36[2] == 0 ? 1 : si1 / g_36[2];
+  }
+
+  check ();
+}
-- 
2.53.0

Reply via email to