Hi,
this patch fixes remaining IPA profile updating issues I am aware of. With
this change there is 1 profile mismatch after inlining for tramp3d -Ofast
and 112 of them in .optimized dump which is about 10 times less than before.
I did not inspect them all but they seems mostly justified and not very
important.
First patch adds new profile quality stage when global profile is 0 but not
known precisely. This is useful for bb partitioning where we do not want to
move adjusted 0 to cold sections.
Second the patch adds little infrastructure for turning IPA info to local
info (profile_count::combine_with_ipa_count) and commonizes all places that
do this kind of operation.
Finally it fixes profile updating bug in ipa-split where entry and return
blocks got wrong profile.
Bootstrapped/regtested x86_64-linux. I am profilebootstrapping overnight
and plan to commit tomorrow.
* cgraphclones.c (cgraph_edge::clone): Rename gcov_count to prof_count.
(cgraph_edge::clone): Cleanup updating of profile.
* ipa-cp.c (update_profiling_info): Likewise.
* ipa-inline-transform.c (inline_transform): Likewise.
* ipa-inline.c (inline_small_functions): Add missing space to dump.
* ipa-split.c (execute_split_functions): Do not split when function
is cold.
* predict.c (estimate_bb_frequencies): Cleanup updating of profile.
* profile-count.c (profile_count::dump): Add global0.
(profile_count::to_cgraph_frequency): Do not ICE when entry is
undefined.
(profile_count::to_sreal_scale): Likewise.
(profile_count::adjust_for_ipa_scaling): Fix typo in comment.
(profile_count::combine_with_ipa_count): New function.
* profile-count.h (profile_guessed_global0adjusted): New.
(profile_count::adjusted_zero): New.
(profile_count::global0adjusted): New.
(profile_count::combine_with_ipa_count): New.
* tree-inline.c (copy_edges_for_bb): Add NUM/DEN arugment;
correct profile of return block of split functions.
(copy_cfg_body): Remove unused profile_count.
(copy_body): Likewise.
(expand_call_inline): Update.
(tree_function_versioning): Update.
Index: cgraphclones.c
===
--- cgraphclones.c (revision 254888)
+++ cgraphclones.c (working copy)
@@ -91,7 +91,7 @@ cgraph_edge::clone (cgraph_node *n, gcal
{
cgraph_edge *new_edge;
profile_count::adjust_for_ipa_scaling (, );
- profile_count gcov_count = count.apply_scale (num, den);
+ profile_count prof_count = count.apply_scale (num, den);
if (indirect_unknown_callee)
{
@@ -104,19 +104,19 @@ cgraph_edge::clone (cgraph_node *n, gcal
{
cgraph_node *callee = cgraph_node::get (decl);
gcc_checking_assert (callee);
- new_edge = n->create_edge (callee, call_stmt, gcov_count);
+ new_edge = n->create_edge (callee, call_stmt, prof_count);
}
else
{
new_edge = n->create_indirect_edge (call_stmt,
indirect_info->ecf_flags,
- gcov_count, false);
+ prof_count, false);
*new_edge->indirect_info = *indirect_info;
}
}
else
{
- new_edge = n->create_edge (callee, call_stmt, gcov_count);
+ new_edge = n->create_edge (callee, call_stmt, prof_count);
if (indirect_info)
{
new_edge->indirect_info
@@ -135,12 +135,9 @@ cgraph_edge::clone (cgraph_node *n, gcal
new_edge->in_polymorphic_cdtor = in_polymorphic_cdtor;
/* Update IPA profile. Local profiles need no updating in original. */
- if (update_original
- && count.ipa () == count && new_edge->count.ipa () == new_edge->count)
-count -= new_edge->count;
- else if (caller->count.global0 () == caller->count
- && !(count == profile_count::zero ()))
-count = count.global0 ();
+ if (update_original)
+count = count.combine_with_ipa_count (count.ipa ()
+ - new_edge->count.ipa ());
symtab->call_edge_duplication_hooks (this, new_edge);
return new_edge;
}
@@ -431,22 +428,12 @@ cgraph_node::create_clone (tree new_decl
if (new_inlined_to)
dump_callgraph_transformation (this, new_inlined_to, "inlining to");
- if (prof_count == profile_count::zero ()
- && !(count == profile_count::zero ()))
-prof_count = count.global0 ();
-
+ prof_count = count.combine_with_ipa_count (prof_count);
new_node->count = prof_count;
/* Update IPA profile. Local profiles need no updating in original. */
- if (update_original && !(count == profile_count::zero ())
- && count.ipa () == count && prof_count.ipa () == prof_count)
-{
- if (count.nonzero_p ()
- && !(count - prof_count).nonzero_p ())
- count = count.global0 ();
-