I have been looking at -fdump-ipa-profile dump with an intention to sanitize bits of information so that one may use it to judge the "quality of a profile" in FDO.
The overall question I want to address is - are there ways to know which functions were not run in the training run, i.e. have ZERO profile ? (This patch corrects some dumped info; in a subsequent patch I would like to add some more explicit information/diagnostics.) Towards that end, I noticed that there are a couple of misleading bits of information (so I think) in the symbol table dump listing all functions in the compilation unit : --- "globally 0" appears even when profile data has not been fed by feedback profile (not the intent as the documentation of profile_guessed_global0 in profile-count.h suggests). --- "unlikely_executed" should appear only when there is profile feedback or a function attribute is specified (as per documentation of node_frequency in coretypes.h). "unlikely_executed" in case of STALE or NO profile is misleading in my opinion. Summary of changes : 1. This patch makes some adjustments around how x_profile_status of a function is set - x_profile_status should be set to PROFILE_READ only when there is a profile for a function read from the .gcda file. So, instead of relying on profile_info (set whenever the gcda feedback file is present, even if the function does not have a profile available in the file), use exec_counts (non null when function has a profile (the latter may or may not be zero)). In essence, x_profile_status and profile_count::m_quality are set consistent to the stated intent (in code comments.) 2. A minor change in coverage.c is for more precise location of the message Following -fdump-ipa-profile dump excerpts show the effect : ------------------------------------------------ -O1, -O2, -O3 ------------------------------------------------ 0. APPLICABLE PROFILE Trunk : Function flags: count:224114269 body hot After Patch : Function flags: count:224114269 (precise) body hot 1. STALE PROFILE (i.e., those cases covered by Wcoverage-mismatch; when control flow changes between profile-generate and profile-use) Trunk : Function flags: count:224114269 body hot After Patch : Function flags: count:224114269 (precise) body hot 2. NO PROFILE (i.e., those cases covered by Wmissing-profile; when function has no profile available in the .gcda file) Trunk (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body Trunk (missing function) : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed After Patch (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body After Patch (missing function) : Function flags: count:1073741824 (estimated locally) body 3. ZERO PROFILE (functions not run in training run) Trunk : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed After Patch (remains the same) : count: 1073741824 (estimated locally, globally 0) body unlikely_executed -------------------------------------------------- O0 -------------------------------------------------- In O0, flag_guess_branch_prob is not set. This makes the profile_quality set to (precise) for most of the above cases. 0. APPLICABLE PROFILE Trunk : Function flags: count:224114269 body hot After Patch : Function flags: count:224114269 (precise) body hot 1. STALE PROFILE (i.e., those cases covered by Wcoverage-mismatch; when control flow changes between profile-generate and profile-use) Trunk : Function flags: count:224114269 body hot After Patch : Function flags: count:224114269 (precise) body hot 2. NO PROFILE (i.e., those cases covered by Wmissing-profile; when function has no profile available in the .gcda file) Trunk (missing file) : Function flags: body Trunk (missing function) : Function flags: count:0 body unlikely_executed After Patch (missing file) : Function flags: body *** After Patch (missing function) : Function flags: count:0 (precise) body (*** This remains misleading, and I do not have a solution for this; as use of heuristics to guess branch probability is not allowed in O0) 3. ZERO PROFILE (functions not run in training run) Trunk : Function flags: count:0 body unlikely_executed After Patch : Function flags: count:0 (precise) body -------------------------------------------------- make check-gcc on x86_64 shows no new failures. (A related PR was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957 where we added diagnostics for the NO PROFILE case.)
diff --git a/gcc/coverage.c b/gcc/coverage.c index 599a3bb..7595e6c 100644 --- a/gcc/coverage.c +++ b/gcc/coverage.c @@ -358,7 +358,7 @@ get_coverage_counts (unsigned counter, unsigned cfg_checksum, if (warning_printed && dump_enabled_p ()) { dump_user_location_t loc - = dump_user_location_t::from_location_t (input_location); + = dump_user_location_t::from_function_decl (current_function_decl); dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc, "use -Wno-error=coverage-mismatch to tolerate " "the mismatch but performance may drop if the " diff --git a/gcc/profile-count.c b/gcc/profile-count.c index f4ab244..90f4feb 100644 --- a/gcc/profile-count.c +++ b/gcc/profile-count.c @@ -83,6 +83,8 @@ profile_count::dump (FILE *f) const fprintf (f, " (auto FDO)"); else if (m_quality == profile_guessed) fprintf (f, " (guessed)"); + else if (m_quality == profile_precise) + fprintf (f, " (precise)"); } } diff --git a/gcc/profile.c b/gcc/profile.c index 2130319..57e3f3c 100644 --- a/gcc/profile.c +++ b/gcc/profile.c @@ -698,6 +698,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum) } } + if (exec_counts) + { + profile_status_for_fn (cfun) = PROFILE_READ; + } + /* If we have real data, use them! */ if (bb_gcov_count (ENTRY_BLOCK_PTR_FOR_FN (cfun)) || !flag_guess_branch_prob) @@ -705,7 +710,7 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum) bb->count = profile_count::from_gcov_type (bb_gcov_count (bb)); /* If function was not trained, preserve local estimates including statically determined zero counts. */ - else + else if (profile_status_for_fn (cfun) == PROFILE_READ) FOR_ALL_BB_FN (bb, cfun) if (!(bb->count == profile_count::zero ())) bb->count = bb->count.global0 (); @@ -718,6 +723,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum) if (dump_file) { + fprintf (dump_file, " Profile feedback for function"); + fprintf (dump_file, ((profile_status_for_fn (cfun) == PROFILE_READ) + ? " is available \n" + : " is not available \n")); + fprintf (dump_file, "%d branches\n", num_branches); if (num_branches) for (i = 0; i < 10; i++) @@ -1317,12 +1327,12 @@ branch_prob (void) values.release (); free_edge_list (el); coverage_end_function (lineno_checksum, cfg_checksum); - if (flag_branch_probabilities && profile_info) + if (flag_branch_probabilities + && (profile_status_for_fn (cfun) == PROFILE_READ)) { struct loop *loop; if (dump_file && (dump_flags & TDF_DETAILS)) report_predictor_hitrates (); - profile_status_for_fn (cfun) = PROFILE_READ; /* At this moment we have precise loop iteration count estimates. Record them to loop structure before the profile gets out of date. */