On Tue, 2 Apr 2024, Richard Biener wrote:

> The following reduces peak memory use for the PR114480 testcase at -O1
> which is almost exclusively spent by the ehcleanup pass in allocating
> PHI nodes.  The free_phinodes cache we maintain isn't very effective
> since it has effectively two slots, one for 4 and one for 9 argument
> PHIs and it is only ever used for allocations up to 9 arguments but
> we put all larger PHIs in the 9 argument bucket.  This proves
> uneffective resulting in much garbage to be kept when incrementally
> growing PHI nodes by edge redirection.
> 
> The mitigation is to rely on the GC freelist for larger sizes and
> thus immediately return all larger bucket sized PHIs to it via ggc_free.
> 
> This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
> from 359s to 168s.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK for trunk?  I'll leave more surgery for stage1.

Testing revealed on other use-after-free.  Revised patch as follows.

Richard.

>From 3507c14d05994eba5396492f08a919847b9e54ab Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Tue, 2 Apr 2024 12:31:04 +0200
Subject: [PATCH] tree-optimization/114557 - reduce ehcleanup peak memory use
To: gcc-patches@gcc.gnu.org

The following reduces peak memory use for the PR114480 testcase at -O1
which is almost exclusively spent by the ehcleanup pass in allocating
PHI nodes.  The free_phinodes cache we maintain isn't very effective
since it has effectively two slots, one for 4 and one for 9 argument
PHIs and it is only ever used for allocations up to 9 arguments but
we put all larger PHIs in the 9 argument bucket.  This proves
uneffective resulting in much garbage to be kept when incrementally
growing PHI nodes by edge redirection.

The mitigation is to rely on the GC freelist for larger sizes and
thus immediately return all larger bucket sized PHIs to it via ggc_free.

This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
from 359s to 168s.

        PR tree-optimization/114557
        PR tree-optimization/114480
        * tree-phinodes.cc (release_phi_node): Return PHIs from
        allocation buckets not covered by free_phinodes to GC.
        (remove_phi_node): Release the PHI LHS before freeing the
        PHI node.
        * tree-vect-loop.cc (vectorizable_live_operation): Get PHI lhs
        before releasing it.
---
 gcc/ggc-page.cc       |  6 ++++++
 gcc/tree-phinodes.cc  | 10 +++++++++-
 gcc/tree-vect-loop.cc |  2 +-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-phinodes.cc b/gcc/tree-phinodes.cc
index ddd731323e1..5a7e4a94e57 100644
--- a/gcc/tree-phinodes.cc
+++ b/gcc/tree-phinodes.cc
@@ -223,6 +223,14 @@ release_phi_node (gimple *phi)
       delink_imm_use (imm);
     }
 
+  /* Immediately return the memory to the allocator when we would
+     only ever re-use it for a smaller size allocation.  */
+  if (len - 2 >= NUM_BUCKETS - 2)
+    {
+      ggc_free (phi);
+      return;
+    }
+
   bucket = len > NUM_BUCKETS - 1 ? NUM_BUCKETS - 1 : len;
   bucket -= 2;
   vec_safe_push (free_phinodes[bucket], phi);
@@ -445,9 +453,9 @@ remove_phi_node (gimple_stmt_iterator *gsi, bool 
release_lhs_p)
 
   /* If we are deleting the PHI node, then we should release the
      SSA_NAME node so that it can be reused.  */
-  release_phi_node (phi);
   if (release_lhs_p)
     release_ssa_name (gimple_phi_result (phi));
+  release_phi_node (phi);
 }
 
 /* Remove all the phi nodes from BB.  */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index f33629e9b04..984636edbc5 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10962,8 +10962,8 @@ vectorizable_live_operation (vec_info *vinfo, 
stmt_vec_info stmt_info,
                                                 lhs_type, &exit_gsi);
 
              auto gsi = gsi_for_stmt (use_stmt);
-             remove_phi_node (&gsi, false);
              tree lhs_phi = gimple_phi_result (use_stmt);
+             remove_phi_node (&gsi, false);
              gimple *copy = gimple_build_assign (lhs_phi, new_tree);
              gsi_insert_before (&exit_gsi, copy, GSI_SAME_STMT);
              break;
-- 
2.35.3

Reply via email to