121345 - Improve PHI analyzer performance

Andrew MacLeod Thu, 20 Nov 2025 12:46:12 -0800

This is the delayed final patch for this PR that I mentioned would becoming. This reduces the performance loss of that patch set to somethingmore acceptable.


The general format ifor the single PHI "groups" is


    ssa_1 = ssa_2 OP const.
    ssa_2 = PHI <initial, ssa_1>

with some sort of backedge leading to the definition of ssa_1, and arange restriction imposed on ssa_2 on that edge.

The iterative analysis utilizes that ssa_2 edge range restriction toensure we don't generate ranges for ssa_1 that cannot feed into ssa_2.

I generated a lot of information about when we found and initial valueand when we didn't as well as the input values. Over a bootstrap, itgenerated 144,000 individual cases. I fed this into an AI and asked itquestions and had a discussion.

The first thing I did was check to see if we have a relation BEFOREdoing the iteration.. that gives us an instant answer cheaper.Previously we did all the iteration work, and only checked relationsafter iterations failed.

The end result is that 98% of all modifying statements feeding a PHI areeither PLUS or MINUS. not surprising. Also not surprising, itoften didn't help. Thios is , probably because the back edge often partof a loop and loop analsysis has already made some conclusions.

What did pop out is that if the edge range feeding back into thedefinition of ssa_1 has mutliple subranges, there is a dramaticallyimproved chance something good can happen. This situation means itsnot a straighforward loop backedge, and comprised only about 3% of theplus and minus cases. These 3% covered over 50% of all useful cases,so its a pretty cheap way to catch the majority.

And with all the other operations (shift, bitwise, etc) comprising only2% fo cases over all, it is worthwhile letting them them all through.

With the improved speedup, I also adjusted the iterative count from thearbitrary 10 that is was before, to be the number of bits in the word +1. This can help shifyts and multiplies especially, but seemedreasonable overall.

This take the 4.5% VRP slowdown pof the previous patch , and reduces itdown to 1.4%. If I left the iteration count at 10, that number dropsto 0.8%, but I feel that is reasonable. Easy to change if anyonedisagrees.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

From 862a5b00cfb8c3ef43358bc8f3d4a603496aa3e9 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <[email protected]>
Date: Thu, 20 Nov 2025 12:10:25 -0500
Subject: [PATCH] Improve PHI analyzer performance

Only do iterative analysis if it might be worthwhile.

	PR tree-optimization/121345
	gcc/
	* gimple-range-phi.cc (phi_group::calculate_using_modifier): Restore
	performance loss by being more selective when iterating.
---
 gcc/gimple-range-phi.cc | 48 +++++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/gcc/gimple-range-phi.cc b/gcc/gimple-range-phi.cc
index 85d0f279592..656d749d48d 100644
--- a/gcc/gimple-range-phi.cc
+++ b/gcc/gimple-range-phi.cc
@@ -127,29 +127,59 @@ phi_group::calculate_using_modifier (range_query *q)
   else
     return false;
 
-  // Examine modifier and run 10 iterations to see if it convergences.
+  // If we can resolve the range using relations, use that range.
+  if (refine_using_relation (k))
+    return true;
+
+  // Examine modifier and run iteration evaluations  to see if it convergences.
   // The constructor initilaized m_vr to the initial value already.
-  const unsigned num_iter = 10;
   int_range_max nv;
   int_range_max iter_value = m_vr;
   int_range_max iter_reach;
-  // Determie the maximum range of ITER reaching here.  Often a loop
-  // bound restricts the range reaching here.  Default to VARYING.
+
+  // Determine the maximum range of the var reaching here.  The back edge
+  // usually has a guard restircting the range. Default to VARYING.
   if (!m_modifier_name
       || !q->range_of_expr (iter_reach, m_modifier_name, m_modifier))
     iter_reach.set_varying (m_vr.type ());
+
+  // Iterative evaluation can be expensive, so heuristically :
+  //  - PLUS and MINUS are 98% of the cases, and usually handled well enough
+  //    by loop analysis, The exception is if the reaching range has multiple
+  //    subranges, those are often worth examining.
+  //  - All other operations (2%) are worth checking.
+  bool do_iterative = false;
+  if (gimple_code (m_modifier) == GIMPLE_ASSIGN)
+    switch (gimple_assign_rhs_code (m_modifier))
+      {
+	case PLUS_EXPR:
+	case MINUS_EXPR:
+	  if (iter_reach.num_pairs () > 1)
+	    do_iterative = true;
+	  break;
+	default:
+	  do_iterative = true;
+      }
+
+  // Limit iterations to 1 more than the number of bits.
+  unsigned num_iter;
+  if (do_iterative)
+    num_iter = TYPE_PRECISION (m_vr.type ()) + 1;
+  else
+    num_iter = 0;
+
   for (unsigned x = 0; x < num_iter; x++)
     {
       if (!fold_range (nv, m_modifier, iter_value, q))
 	break;
-      // Ensure nothing calculate is outside outside the reaching range.
+      // Ensure nothing calculated is outside outside the reaching range.
       nv.intersect (iter_reach);
       // If union does nothing, then we have convergence.
       if (!iter_value.union_ (nv))
 	{
 	  if (iter_value.varying_p ())
 	    break;
-	  // The last iteration Will reach the PHI node.
+	  // The last iteration Will also reach the PHI node, add it in.
 	  fold_range (nv, m_modifier, iter_value, q);
 	  iter_value.union_ (nv);
 	  m_vr = iter_value;
@@ -157,17 +187,13 @@ phi_group::calculate_using_modifier (range_query *q)
 	}
     }
 
-  // If we can resolve the range using relations, use that range.
-  if (refine_using_relation (k))
-    return true;
-
   // Never converged, so bail for now. we could examine the pattern
   // from m_initial to m_vr as an extension  Especially if we had a way
   // to project the actual number of iterations (SCEV?)
   //
   //  We can also try to identify "parallel" phis to get loop counts and
   //  determine the number of iterations of these parallel PHIs.
-  //
+
   return false;
 }
 
-- 
2.45.0

[COMMITTED 5/4] PR tree-optimization/121345 - Improve PHI analyzer performance

Reply via email to