Hi,
I tried to rebuild Firefox without visibilities, which leads to about a 10-fold
increase in LTO time, mostly in ltrans. What happens is that, by the old rule
that DECL_ONE_ONLY is duplicated into each partition that needs it, we end up
duplicating things everywhere. Moreover, we mark them all as
used_from_other_partition, which prevents us from optimizing them out of any
of the partitions even if they are dead.
I recall my analysis when I was adding this condition: I assumed that every
unit that uses a COMDAT must also define it. This is wrong for keyed classes (as
was later corrected, but then not updated for forced_by_abi), and in fact when
we know that the COMDAT is going to be output to the final file, i.e. the linker
plugin tells us there is no one else defining the symbol, we can decompose the
COMDAT group early.
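
The group check described above can be sketched in isolation. This is only a
simplified model, not the GCC code: struct sym, the RES_* values and
maybe_dissolve_comdat_group are hypothetical stand-ins for symtab_node, the
linker plugin resolutions and the logic added to
function_and_variable_visibility in the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the linker plugin resolution enum.  */
enum resolution
{
  RES_UNKNOWN,
  RES_PREVAILING_DEF,            /* Prevails, but non-LTO code sees it.  */
  RES_PREVAILING_DEF_IRONLY,     /* Prevails, referenced from LTO IR only.  */
  RES_PREVAILING_DEF_IRONLY_EXP  /* As above, but exported from the DSO.  */
};

/* Hypothetical stand-in for symtab_node; only the fields used here.  */
struct sym
{
  enum resolution resolution;
  bool externally_visible;
  const char *comdat_group;       /* NULL when not in a COMDAT group.  */
  struct sym *same_comdat_group;  /* Circular list; NULL when alone.  */
};

/* True when the linker told us the symbol is controlled by LTO.  */
static bool
ir_only_p (const struct sym *s)
{
  return (s->resolution == RES_PREVAILING_DEF_IRONLY
          || s->resolution == RES_PREVAILING_DEF_IRONLY_EXP);
}

/* Dissolve NODE's COMDAT group if no externally visible member is
   controlled by anything other than LTO.  Return true if dissolved.  */
static bool
maybe_dissolve_comdat_group (struct sym *node)
{
  assert (node != NULL);
  if (!ir_only_p (node))
    return false;

  /* Walk the circular list; stop early at the first offending member.  */
  struct sym *next = node;
  if (node->same_comdat_group)
    for (next = node->same_comdat_group; next != node;
         next = next->same_comdat_group)
      if (next->externally_visible && !ir_only_p (next))
        break;
  if (next != node)
    return false;

  /* Every member passed: clear the group name and unlink the list.  */
  struct sym *s = node;
  do
    {
      struct sym *n = s->same_comdat_group;
      s->comdat_group = NULL;
      s->same_comdat_group = NULL;
      s = n;
    }
  while (s != NULL && s != node);
  return true;
}
```

In this model a group whose members are all IRONLY or IRONLY_EXP is dissolved,
while a single externally visible member with any other resolution keeps the
whole group intact.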
I also noticed a bug in varasm where we ignore the fact that a symbol is known
to be output in another partition; with this fix and hidden visibilities
re-instantiated, that causes us to use GOT/PLT references.
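
As a rough illustration of the varasm change, the decision can be modelled as
below. The types and the sketch_* names are hypothetical, and the resolution
predicate is deliberately reduced to a single case; the real
resolution_local_p in varasm.c accepts more resolutions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced resolution set for the sketch.  */
enum resolution
{
  RES_UNKNOWN,
  RES_PREVAILING_DEF_IRONLY,
  RES_RESOLVED_DYN
};

/* Hypothetical stand-in for a cgraph or varpool node.  */
struct sym
{
  enum resolution resolution;
  bool in_other_partition;  /* Defined in another ltrans partition.  */
};

/* Simplified stand-in for resolution_local_p: one case only.  */
static bool
sketch_resolution_local_p (enum resolution r)
{
  return r == RES_PREVAILING_DEF_IRONLY;
}

/* A symbol output by another partition of the same LTO link ends up in
   the same final object, so it is treated as resolved locally even when
   the linker resolution alone does not say so.  */
static bool
sketch_resolved_locally_p (const struct sym *s)
{
  assert (s != NULL);
  return sketch_resolution_local_p (s->resolution) || s->in_other_partition;
}
```

With only the resolution test, a symbol with an unknown resolution that lives
in another partition would not count as local; the extra in_other_partition
test is what avoids the GOT/PLT reference in that case.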
I believe this is the problem causing random jumps in Martin Liska's systemtap
graphs.
Martin, is there any chance to generate updated graphs for Firefox or another
C++ monster, ideally both with -freorder-blocks-and-partition and with
-fno-reorder-blocks-and-partition?
Bootstrapped/regtested x86_64-linux, tested on Firefox and committed.
Honza
* lto/lto-partition.c (get_symbol_class): Only unforced DECL_ONE_ONLY
needs duplicating, not generic COMDAT.
* ipa.c (function_and_variable_visibility): Decompose DECL_ONE_ONLY
groups when we know they are controlled by LTO.
* varasm.c (default_binds_local_p_1): If object is in other partition,
it will be resolved locally.
Index: lto/lto-partition.c
===================================================================
--- lto/lto-partition.c (revision 207479)
+++ lto/lto-partition.c (working copy)
@@ -94,10 +94,12 @@ get_symbol_class (symtab_node *node)
else if (!cgraph (node)->definition)
return SYMBOL_EXTERNAL;
- /* Comdats are duplicated to every use unless they are keyed.
- Those do not need duplication. */
- if (DECL_COMDAT (node->decl)
+ /* Linker discardable symbols are duplicated to every use unless they are
+ keyed.
+ Keyed symbols or those. */
+ if (DECL_ONE_ONLY (node->decl)
&& !node->force_output
+ && !node->forced_by_abi
&& !symtab_used_from_object_file_p (node))
return SYMBOL_DUPLICATE;
Index: ipa.c
===================================================================
--- ipa.c (revision 207451)
+++ ipa.c (working copy)
@@ -1002,6 +1002,36 @@ function_and_variable_visibility (bool w
if (DECL_EXTERNAL (decl_node->decl))
DECL_EXTERNAL (node->decl) = 1;
}
+
+ /* If whole comdat group is used only within LTO code, we can dissolve it,
+ we handle the unification ourselves.
+ We keep COMDAT and weak so visibility out of DSO does not change.
+ Later we may bring the symbols static if they are not exported. */
+ if (DECL_ONE_ONLY (node->decl)
+ && (node->resolution == LDPR_PREVAILING_DEF_IRONLY
+ || node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP))
+ {
+ symtab_node *next = node;
+
+ if (node->same_comdat_group)
+ for (next = node->same_comdat_group;
+ next != node;
+ next = next->same_comdat_group)
+ if (next->externally_visible
+ && (next->resolution != LDPR_PREVAILING_DEF_IRONLY
+ && next->resolution != LDPR_PREVAILING_DEF_IRONLY_EXP))
+ break;
+ if (node == next)
+ {
+ if (node->same_comdat_group)
+ for (next = node->same_comdat_group;
+ next != node;
+ next = next->same_comdat_group)
+ DECL_COMDAT_GROUP (next->decl) = NULL;
+ DECL_COMDAT_GROUP (node->decl) = NULL;
+ symtab_dissolve_same_comdat_group_list (node);
+ }
+ }
}
FOR_EACH_DEFINED_FUNCTION (node)
{
Index: varasm.c
===================================================================
--- varasm.c (revision 207451)
+++ varasm.c (working copy)
@@ -6739,7 +6739,7 @@ default_binds_local_p_1 (const_tree exp,
&& (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
{
varpool_node *vnode = varpool_get_node (exp);
- if (vnode && resolution_local_p (vnode->resolution))
+ if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition))
resolved_locally = true;
if (vnode
&& resolution_to_local_definition_p (vnode->resolution))
@@ -6749,7 +6749,7 @@ default_binds_local_p_1 (const_tree exp,
{
struct cgraph_node *node = cgraph_get_node (exp);
if (node
- && resolution_local_p (node->resolution))
+ && (resolution_local_p (node->resolution) || node->in_other_partition))
resolved_locally = true;
if (node
&& resolution_to_local_definition_p (node->resolution))