This prototype patch addresses PR 70594, where the constexpr machinery causes
differences in -fcompare-debug checking. I'm looking for initial comments on
the approach this patch takes.
As described in the comments of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70594, the difference is caused
by differing GC points with and without debug info generation: we end up
generating different patterns of DECL_UIDs.
With Patrick's patch to address 70542, constexpr_call_table is now gc-deletable,
and he introduced a cache for constexpr fn bodies, which is also gc-deletable.
This patch changes things so that both constexpr_call_table and
fundef_copies_table are no longer gc-deletable. Instead we simply count the
size of the constexpr_call_table and manually delete it if it gets 'too big'.
The bulk of this patch is adding gc machinery to the fundef_copies_table_t type,
as we now need to walk it during GC.
I modified maybe_initialize_constexpr_call_table to delete the table if it is
too big and we're at the outermost level of the constexpr call evaluation
stack. I picked a hard-coded value for reasons I discuss below. It would be
nice to delete the table only if we discover we're going to do call evaluation
(rather than hit an already-cached value), but that was a little awkward, so I
punted.
I've added a new hook to delete these two tables at the end of parsing. This is
called from c_parse_final_cleanup. There's no point having them hang around
during the rest of compilation.
In addition to bootstrapping and testing, I repeated Patrick's experiment of
building dwarf2out.c with -fcompare-debug. No changes were observed.
Observations:
1) I think this approach is not globally best. We're inventing another
memory-usage-related heuristic, rather than relying on the already available GC
machinery. That's why I've gone with a hard-coded value for now, rather than
add a new user-frobable param. The right solution is to stop generating new
DECL_UIDs when copying fns for constexpr evaluation. IIUC the UID is being used
to map decls to values. I don't see why we can't just use the (copied) DECL's
address as a key to find its associated value (but I've not investigated that
approach). It's not like we particularly care about the stability of that
mapping between different compilations on different machines -- it doesn't
affect code generation. Reproducibility of a class of compiler bugs would be
reduced, but probably only marginally so.
2) The constexpr_call_table and fundef_copies_table are closely related data
structures. I think it would be cleaner to combine these into a single data
structure. Possibly along with the call_stack and associated variables. Such a
cleanup would be good even if we don't go with this approach to the defect, but
could wait until 7.0. If we go with this approach, IMHO it would be good to do
some of this merging now.
3) In experimenting I looked at the 70542 testcase. Allowing the caches to grow
unboundedly, we end up with about 500K constexpr_call_table slots and 20
fundef_copies slots used, together taking up about 55MB (out of about 65MB) of
GC-able memory. In the 70594 testcase we use 127 constexpr_call_table slots and
about 1MB of GC-able memory.
Picking a limit of 20000 constexpr_call_table slots caused several reallocations
and, interestingly, actually led to faster parsing for 70542: 6.8 seconds rather
than 7.0 seconds, presumably because the reachable working set stays smaller.
4) We could definitely do better with the when-to-delete heuristic. For
instance, not deleting if there's been no GC since the previous deletion event.
That would reduce the deletions I observed in the 70542 testcase, as there
were many more of them than GCs.
Comments?
nathan
Index: cp/constexpr.c
===================================================================
--- cp/constexpr.c (revision 234878)
+++ cp/constexpr.c (working copy)
@@ -915,7 +915,7 @@ struct constexpr_ctx {
/* A table of all constexpr calls that have been evaluated by the
compiler in this translation unit. */
-static GTY ((deletable)) hash_table<constexpr_call_hasher> *constexpr_call_table;
+static GTY (()) hash_table<constexpr_call_hasher> *constexpr_call_table;
static tree cxx_eval_constant_expression (const constexpr_ctx *, tree,
bool, bool *, bool *, tree * = NULL);
@@ -956,19 +956,10 @@ constexpr_call_hasher::equal (constexpr_
return lhs_bindings == rhs_bindings;
}
-/* Initialize the constexpr call table, if needed. */
-
-static void
-maybe_initialize_constexpr_call_table (void)
-{
- if (constexpr_call_table == NULL)
- constexpr_call_table = hash_table<constexpr_call_hasher>::create_ggc (101);
-}
-
/* The representation of a single node in the per-function freelist maintained
by FUNDEF_COPIES_TABLE. */
-struct fundef_copy
+struct GTY((for_user)) fundef_copy
{
tree body;
tree parms;
@@ -986,20 +977,63 @@ struct fundef_copy
that's keyed off of the original FUNCTION_DECL and whose value is the chain
of this function's unused copies awaiting reuse. */
-struct fundef_copies_table_t
+struct GTY((user)) fundef_copies_table_t
{
- hash_map<tree, fundef_copy *> *map;
+ hash_map<tree, fundef_copy *> GTY((skip)) map;
+
+ fundef_copies_table_t (size_t size, bool ggc)
+ : map (size, ggc)
+ {
+ }
+
+ static fundef_copies_table_t *create_ggc (size_t size)
+ {
+ fundef_copies_table_t *table = ggc_alloc<fundef_copies_table_t> ();
+
+ new (table) fundef_copies_table_t (size, true);
+
+ return table;
+ }
};
-static GTY((deletable)) fundef_copies_table_t fundef_copies_table;
+static void gt_ggc_mx (fundef_copies_table_t *p);
+static void gt_pch_nx (fundef_copies_table_t *p);
+static void gt_pch_nx (fundef_copies_table_t *p, gt_pointer_operator, void *);
+
+static GTY(()) fundef_copies_table_t *fundef_copies_table;
+
+/* Initialize the constexpr call table, if needed. */
+
+static void
+maybe_initialize_constexpr_call_table (bool clearable)
+{
+ /* If the table has got too big, delete it. We have to do this
+ separately from GC because the act of evaluating functions can
+ create new DECL_UIDs and thus would perturb the behaviour of the
+ rest of compilation if it was GC-sensitive.
+
+ FIXME: Find a way of not generating new UIDS, so we can remove
+ this special case. That would be much preferable to exposing
+ another GC-like knob to the user. Hence the hard coded limit,
+ for now. */
+ if (constexpr_call_table && clearable
+ && constexpr_call_table->size () > 20000)
+ {
+ constexpr_call_table = NULL;
+ fundef_copies_table = NULL;
+ }
+
+ if (constexpr_call_table == NULL)
+ constexpr_call_table = hash_table<constexpr_call_hasher>::create_ggc (101);
+}
/* Initialize FUNDEF_COPIES_TABLE if it's not initialized. */
static void
maybe_initialize_fundef_copies_table ()
{
- if (fundef_copies_table.map == NULL)
- fundef_copies_table.map = hash_map<tree, fundef_copy *>::create_ggc (101);
+ if (fundef_copies_table == NULL)
+ fundef_copies_table = fundef_copies_table_t::create_ggc (101);
}
/* Reuse a copy or create a new unshared copy of the function FUN.
@@ -1011,7 +1045,7 @@ get_fundef_copy (tree fun)
maybe_initialize_fundef_copies_table ();
fundef_copy *copy;
- fundef_copy **slot = &fundef_copies_table.map->get_or_insert (fun, NULL);
+ fundef_copy **slot = &fundef_copies_table->map.get_or_insert (fun, NULL);
if (*slot == NULL)
{
copy = ggc_alloc<fundef_copy> ();
@@ -1032,7 +1066,7 @@ get_fundef_copy (tree fun)
static void
save_fundef_copy (tree fun, fundef_copy *copy)
{
- fundef_copy **slot = &fundef_copies_table.map->get_or_insert (fun, NULL);
+ fundef_copy **slot = &fundef_copies_table->map.get_or_insert (fun, NULL);
copy->prev = *slot;
*slot = copy;
}
@@ -1423,7 +1457,7 @@ cxx_eval_call_expression (const constexp
(new_call.bindings, constexpr_fundef_hasher::hash (new_call.fundef));
/* If we have seen this call before, we are done. */
- maybe_initialize_constexpr_call_table ();
+ maybe_initialize_constexpr_call_table (call_stack.length () == 1);
constexpr_call **slot
= constexpr_call_table->find_slot (&new_call, INSERT);
entry = *slot;
@@ -1434,7 +1468,7 @@ cxx_eval_call_expression (const constexp
*slot = entry = ggc_alloc<constexpr_call> ();
*entry = new_call;
}
- /* Calls which are in progress have their result set to NULL
+ /* Calls that are in progress have their result set to NULL,
so that we can detect circular dependencies. */
else if (entry->result == NULL)
{
@@ -2394,10 +2428,10 @@ cxx_eval_bare_aggregate (const constexpr
tree type = TREE_TYPE (t);
constexpr_ctx new_ctx;
- if (TYPE_PTRMEMFUNC_P (type))
+ if (TYPE_PTRMEMFUNC_P (type) || VECTOR_TYPE_P (type))
{
- /* We don't really need the ctx->ctor business for a PMF, but it's
- simpler to use the same code. */
+ /* We don't really need the ctx->ctor business for a PMF or
+ vector, but it's simpler to use the same code. */
new_ctx = *ctx;
new_ctx.ctor = build_constructor (type, NULL);
new_ctx.object = NULL_TREE;
@@ -5203,4 +5237,29 @@ require_potential_rvalue_constant_expres
return potential_constant_expression_1 (t, true, true, tf_warning_or_error);
}
+/* Finalize constexpr processing after parsing. */
+
+void
+fini_constexpr (void)
+{
+ /* The constexpr and fundef tables are no longer needed. */
+ constexpr_call_table = NULL;
+ fundef_copies_table = NULL;
+}
+
#include "gt-cp-constexpr.h"
+
+void gt_ggc_mx (fundef_copies_table_t *p)
+{
+ gt_ggc_mx (&p->map);
+}
+
+void gt_pch_nx (fundef_copies_table_t *p)
+{
+ gt_pch_nx (&p->map);
+}
+
+void gt_pch_nx (fundef_copies_table_t *p, gt_pointer_operator op, void *cookie)
+{
+ gt_pch_nx (&p->map, op, cookie);
+}
Index: cp/decl2.c
===================================================================
--- cp/decl2.c (revision 234878)
+++ cp/decl2.c (working copy)
@@ -4904,6 +4904,8 @@ c_parse_final_cleanups (void)
finish_repo ();
+ fini_constexpr ();
+
/* The entire file is now complete. If requested, dump everything
to a file. */
dump_tu ();
Index: cp/cp-tree.h
===================================================================
--- cp/cp-tree.h (revision 234878)
+++ cp/cp-tree.h (working copy)
@@ -6879,6 +6879,7 @@ bool cilkplus_an_triplet_types_ok_p
tree);
/* In constexpr.c */
+extern void fini_constexpr (void);
extern bool literal_type_p (tree);
extern tree register_constexpr_fundef (tree, tree);
extern bool check_constexpr_ctor_body (tree, tree, bool);