On Thu, Apr 14, 2011 at 12:20 AM, Jan Hubicka <[email protected]> wrote:
> Hi,
> this patch moves inline_summary from a field in cgraph_node into its own on-side
> data structure. This replaces an arcane decision of mine to split all IPA data
> into global/local data stored in a common data structure with the scheme we
> developed for new IPA passes some time ago.
>
> The advantage is that the code is more contained and less spread across the
> compiler. We also make cgraph_node smaller and dumps more compact, which never
> hurts.
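
For readers unfamiliar with the on-side scheme, here is a minimal sketch of the idea in plain C: the per-node summary lives in a separate growable table indexed by the node's uid rather than inside the node itself. The names `struct node`, `struct summary`, `summary_alloc` and `get_summary` are hypothetical stand-ins for illustration only; the patch itself uses GCC's VEC macros and `inline_summary (node)`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for cgraph_node and inline_summary.  */
struct node { int uid; };
struct summary { int self_size; int self_time; };

/* Side table holding one summary per node, indexed by uid.  */
static struct summary *summary_vec;
static size_t summary_len;

/* Grow the table so that index MAX_UID is valid.  New slots are zeroed.
   Returns 0 on success and -1 if the allocation fails.  */
static int
summary_alloc (int max_uid)
{
  size_t want = (size_t) max_uid + 1;
  struct summary *p;

  if (want <= summary_len)
    return 0;
  p = realloc (summary_vec, want * sizeof *p);
  if (!p)
    return -1;
  memset (p + summary_len, 0, (want - summary_len) * sizeof *p);
  summary_vec = p;
  summary_len = want;
  return 0;
}

/* Accessor: look the summary up by uid instead of reading a field
   embedded in the node.  */
static struct summary *
get_summary (const struct node *n)
{
  assert (n->uid >= 0 && (size_t) n->uid < summary_len);
  return &summary_vec[n->uid];
}
```

The node type no longer needs to know about the summary at all, which is what keeps the analysis code contained in one place.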
>
> While working on it I noticed that Richi's patch to introduce cgraph_edge
> times/sizes is a bit iffy in computing data when they are missing in the
> data structure. It also computes incoming edge costs instead of outgoing ones,
> which means that not all edges get their info computed for the IPA inliner
> (think of newly discovered direct calls or IPA merging).
Ah, that was the reason ... I didn't dig deep enough ... ;)
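
To make the incoming/outgoing distinction concrete, here is a rough sketch on a toy call graph. It assumes a simplified edge list per caller; `struct fn`, `struct edge`, `add_call`, `compute_outgoing_costs` and `all_costs_set` are hypothetical names, and the fixed `size` argument stands in for a real per-statement estimate. Walking the caller's own callee list covers every call in the body being analyzed, including edges added after the callee was analyzed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified call graph: each function keeps a list of
   its outgoing call edges.  */
struct fn;

struct edge
{
  struct fn *callee;
  int call_stmt_size;        /* 0 means "not computed yet".  */
  struct edge *next_callee;
};

struct fn
{
  struct edge *callees;
};

/* Link edge E from CALLER to CALLEE with no cost recorded yet.  */
static void
add_call (struct fn *caller, struct fn *callee, struct edge *e)
{
  e->callee = callee;
  e->call_stmt_size = 0;
  e->next_callee = caller->callees;
  caller->callees = e;
}

/* Record a cost on every call leaving CALLER.  SIZE is a placeholder
   for a real per-statement estimate.  */
static void
compute_outgoing_costs (struct fn *caller, int size)
{
  for (struct edge *e = caller->callees; e; e = e->next_callee)
    e->call_stmt_size = size;
}

/* Return nonzero if every call leaving CALLER has a cost recorded.  */
static int
all_costs_set (const struct fn *caller)
{
  for (const struct edge *e = caller->callees; e; e = e->next_callee)
    if (!e->call_stmt_size)
      return 0;
  return 1;
}
```

With this shape, a check like `all_costs_set` plays the role of the sanity assert on initialized fields.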
>
> I fixed this on the way and added a sanity check that the fields are initialized.
> This exposed a problem with early inliner iteration, fixed thusly, and the fact
> that the early inliner attempts to compute overall growth at a time when the
> inline parameters are not yet computed for functions not visited by early
> optimizations. We previously agreed that the early inliner should not try to do
> that (as this leads to the early inliner inlining functions called once, which
> should be deferred for later consideration). I just hope it won't cause
> benchmarks to regress too much ;)
Yeah, we agreed to that. And I forgot about it as it wasn't part of the
early inliner reorg (which was supposed to be a 1:1 transform).
>
> Now that there is a place to pile inline analysis info in, there is more to
> clean up. The cgraph_local/cgraph_global fields probably should go, and the
> stuff from global info should go into the inline_summary data structure, too
> (the lifetimes are essentially the same, so there is no need for the split).
> I will handle this incrementally.
>
> Bootstrapped/regtested x86_64-linux with a slightly modified version of the
> patch. Re-testing with the final version; I intend to commit the patch tomorrow.
I looked over the patch and it looks ok to me.
Thanks,
Richard.
> Honza
>
> * cgraph.c (dump_cgraph_node): Do not dump inline summaries.
> * cgraph.h (struct inline_summary): Move to ipa-inline.h.
> (cgraph_local_info): Remove inline_summary.
> * ipa-cp.c: Include ipa-inline.h.
> (ipcp_cloning_candidate_p, ipcp_estimate_growth,
> ipcp_estimate_cloning_cost, ipcp_insert_stage): Use inline_summary
> accessor.
> * lto-cgraph.c (lto_output_node): Do not stream inline summary.
> (input_overwrite_node): Do not set inline summary.
> (input_node): Do not stream inline summary.
> * ipa-inline.c (cgraph_decide_inlining): Dump inline summaries.
> (cgraph_decide_inlining_incrementally): Do not try to estimate overall
> growth; we do not have inline parameters computed for that anyway.
> (cgraph_early_inlining): After inlining compute call_stmt_sizes.
> * ipa-inline.h (struct inline_summary): Move here from cgraph.h.
> (inline_summary_t): New type and VECtor.
> (debug_inline_summary, dump_inline_summaries): Declare.
> (inline_summary): Use VECtor.
> (estimate_edge_growth): Kill hack computing call stmt size directly.
> * lto-section-in.c (lto_section_name): Add inline section.
> * ipa-inline-analysis.c: Include lto-streamer.h.
> (node_removal_hook_holder, node_duplication_hook_holder): New holders.
> (inline_node_removal_hook, inline_node_duplication_hook): New
> functions.
> (inline_summary_vec): Define.
> (inline_summary_alloc, dump_inline_summary, debug_inline_summary,
> dump_inline_summaries): New functions.
> (estimate_function_body_sizes): Properly compute size/time of outgoing
> calls.
> (compute_inline_parameters): Alloc inline_summary; do not compute size/time
> of incoming calls.
> (estimate_edge_time): Avoid missing time summary hack.
> (inline_read_summary): Read inline summary info.
> (inline_write_summary): Write inline summary info.
> (inline_free_summary): Free all hooks and inline summary vector.
> * lto-streamer.h: Add LTO_section_inline_summary section.
> * Makefile.in (ipa-cp.o, ipa-inline-analysis.o): Update dependencies.
> * ipa.c (cgraph_remove_unreachable_nodes): Fix dump file formatting.
>
> * lto.c: Include ipa-inline.h.
> (add_cgraph_node_to_partition, undo_partition): Use inline_summary
> accessor.
> (ipa_node_duplication_hook): Fix declaration.
> * Make-lang.in (lto.o): Update dependencies.
> Index: cgraph.c
> ===================================================================
> --- cgraph.c (revision 172396)
> +++ cgraph.c (working copy)
> @@ -1876,22 +1876,6 @@ dump_cgraph_node (FILE *f, struct cgraph
> if (node->count)
> fprintf (f, " executed "HOST_WIDEST_INT_PRINT_DEC"x",
> (HOST_WIDEST_INT)node->count);
> - if (node->local.inline_summary.self_time)
> - fprintf (f, " %i time, %i benefit", node->local.inline_summary.self_time,
> - node->local.inline_summary.time_inlining_benefit);
> - if (node->global.time && node->global.time
> - != node->local.inline_summary.self_time)
> - fprintf (f, " (%i after inlining)", node->global.time);
> - if (node->local.inline_summary.self_size)
> - fprintf (f, " %i size, %i benefit", node->local.inline_summary.self_size,
> - node->local.inline_summary.size_inlining_benefit);
> - if (node->global.size && node->global.size
> - != node->local.inline_summary.self_size)
> - fprintf (f, " (%i after inlining)", node->global.size);
> - if (node->local.inline_summary.estimated_self_stack_size)
> - fprintf (f, " %i bytes stack usage", (int)node->local.inline_summary.estimated_self_stack_size);
> - if (node->global.estimated_stack_size != node->local.inline_summary.estimated_self_stack_size)
> - fprintf (f, " %i bytes after inlining", (int)node->global.estimated_stack_size);
> if (node->origin)
> fprintf (f, " nested in: %s", cgraph_node_name (node->origin));
> if (node->needed)
> Index: cgraph.h
> ===================================================================
> --- cgraph.h (revision 172396)
> +++ cgraph.h (working copy)
> @@ -58,23 +58,6 @@ struct lto_file_decl_data;
> extern const char * const cgraph_availability_names[];
> extern const char * const ld_plugin_symbol_resolution_names[];
>
> -/* Function inlining information. */
> -
> -struct GTY(()) inline_summary
> -{
> - /* Estimated stack frame consumption by the function. */
> - HOST_WIDE_INT estimated_self_stack_size;
> -
> - /* Size of the function body. */
> - int self_size;
> - /* How many instructions are likely going to disappear after inlining. */
> - int size_inlining_benefit;
> - /* Estimated time spent executing the function body. */
> - int self_time;
> - /* How much time is going to be saved by inlining. */
> - int time_inlining_benefit;
> -};
> -
> /* Information about thunk, used only for same body aliases. */
>
> struct GTY(()) cgraph_thunk_info {
> @@ -95,8 +78,6 @@ struct GTY(()) cgraph_local_info {
> /* File stream where this node is being written to. */
> struct lto_file_decl_data * lto_file_data;
>
> - struct inline_summary inline_summary;
> -
> /* Set when function function is visible in current compilation unit only
> and its address is never taken. */
> unsigned local : 1;
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c (revision 172396)
> +++ ipa-cp.c (working copy)
> @@ -148,6 +148,7 @@ along with GCC; see the file COPYING3.
> #include "tree-inline.h"
> #include "fibheap.h"
> #include "params.h"
> +#include "ipa-inline.h"
>
> /* Number of functions identified as candidates for cloning. When not cloning
> we can simplify iterate stage not forcing it to go through the decision
> @@ -495,7 +496,7 @@ ipcp_cloning_candidate_p (struct cgraph_
> cgraph_node_name (node));
> return false;
> }
> - if (node->local.inline_summary.self_size < n_calls)
> + if (inline_summary (node)->self_size < n_calls)
> {
> if (dump_file)
> fprintf (dump_file, "Considering %s for cloning; code would shrink.\n",
> @@ -1189,7 +1190,7 @@ ipcp_estimate_growth (struct cgraph_node
> call site. Precise cost is difficult to get, as our size metric counts
> constants and moves as free. Generally we are looking for cases that
> small function is called very many times. */
> - growth = node->local.inline_summary.self_size
> + growth = inline_summary (node)->self_size
> - removable_args * redirectable_node_callers;
> if (growth < 0)
> return 0;
> @@ -1229,7 +1230,7 @@ ipcp_estimate_cloning_cost (struct cgrap
> cost /= freq_sum * 1000 / REG_BR_PROB_BASE + 1;
> if (dump_file)
> fprintf (dump_file, "Cost of versioning %s is %i, (size: %i, freq: %i)\n",
> - cgraph_node_name (node), cost, node->local.inline_summary.self_size,
> + cgraph_node_name (node), cost, inline_summary (node)->self_size,
> freq_sum);
> return cost + 1;
> }
> @@ -1364,7 +1365,7 @@ ipcp_insert_stage (void)
> {
> if (node->count > max_count)
> max_count = node->count;
> - overall_size += node->local.inline_summary.self_size;
> + overall_size += inline_summary (node)->self_size;
> }
>
> max_new_size = overall_size;
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c (revision 172396)
> +++ lto-cgraph.c (working copy)
> @@ -465,16 +465,6 @@ lto_output_node (struct lto_simple_outpu
>
> if (tag == LTO_cgraph_analyzed_node)
> {
> - lto_output_sleb128_stream (ob->main_stream,
> - node->local.inline_summary.estimated_self_stack_size);
> - lto_output_sleb128_stream (ob->main_stream,
> - node->local.inline_summary.self_size);
> - lto_output_sleb128_stream (ob->main_stream,
> - node->local.inline_summary.size_inlining_benefit);
> - lto_output_sleb128_stream (ob->main_stream,
> - node->local.inline_summary.self_time);
> - lto_output_sleb128_stream (ob->main_stream,
> - node->local.inline_summary.time_inlining_benefit);
> if (node->global.inlined_to)
> {
> ref = lto_cgraph_encoder_lookup (encoder, node->global.inlined_to);
> @@ -930,23 +920,9 @@ input_overwrite_node (struct lto_file_de
> struct cgraph_node *node,
> enum LTO_cgraph_tags tag,
> struct bitpack_d *bp,
> - unsigned int stack_size,
> - unsigned int self_time,
> - unsigned int time_inlining_benefit,
> - unsigned int self_size,
> - unsigned int size_inlining_benefit,
> enum ld_plugin_symbol_resolution resolution)
> {
> node->aux = (void *) tag;
> - node->local.inline_summary.estimated_self_stack_size = stack_size;
> - node->local.inline_summary.self_time = self_time;
> - node->local.inline_summary.time_inlining_benefit = time_inlining_benefit;
> - node->local.inline_summary.self_size = self_size;
> - node->local.inline_summary.size_inlining_benefit = size_inlining_benefit;
> - node->global.time = self_time;
> - node->global.size = self_size;
> - node->global.estimated_stack_size = stack_size;
> - node->global.estimated_growth = INT_MIN;
> node->local.lto_file_data = file_data;
>
> node->local.local = bp_unpack_value (bp, 1);
> @@ -1023,13 +999,8 @@ input_node (struct lto_file_decl_data *f
> tree fn_decl;
> struct cgraph_node *node;
> struct bitpack_d bp;
> - int stack_size = 0;
> unsigned decl_index;
> int ref = LCC_NOT_FOUND, ref2 = LCC_NOT_FOUND;
> - int self_time = 0;
> - int self_size = 0;
> - int time_inlining_benefit = 0;
> - int size_inlining_benefit = 0;
> unsigned long same_body_count = 0;
> int clone_ref;
> enum ld_plugin_symbol_resolution resolution;
> @@ -1051,15 +1022,7 @@ input_node (struct lto_file_decl_data *f
> node->count_materialization_scale = lto_input_sleb128 (ib);
>
> if (tag == LTO_cgraph_analyzed_node)
> - {
> - stack_size = lto_input_sleb128 (ib);
> - self_size = lto_input_sleb128 (ib);
> - size_inlining_benefit = lto_input_sleb128 (ib);
> - self_time = lto_input_sleb128 (ib);
> - time_inlining_benefit = lto_input_sleb128 (ib);
> -
> - ref = lto_input_sleb128 (ib);
> - }
> + ref = lto_input_sleb128 (ib);
>
> ref2 = lto_input_sleb128 (ib);
>
> @@ -1073,9 +1036,7 @@ input_node (struct lto_file_decl_data *f
>
> bp = lto_input_bitpack (ib);
> resolution = (enum ld_plugin_symbol_resolution)lto_input_uleb128 (ib);
> - input_overwrite_node (file_data, node, tag, &bp, stack_size, self_time,
> - time_inlining_benefit, self_size,
> - size_inlining_benefit, resolution);
> + input_overwrite_node (file_data, node, tag, &bp, resolution);
>
> /* Store a reference for now, and fix up later to be a pointer. */
> node->global.inlined_to = (cgraph_node_ptr) (intptr_t) ref;
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c (revision 172396)
> +++ ipa-inline.c (working copy)
> @@ -1301,6 +1301,9 @@ cgraph_decide_inlining (void)
> max_benefit = benefit;
> }
> }
> +
> + if (dump_file)
> + dump_inline_summaries (dump_file);
> gcc_assert (in_lto_p
> || !max_count
> || (profile_info && flag_branch_probabilities));
> @@ -1558,8 +1561,7 @@ cgraph_decide_inlining_incrementally (st
> /* When the function body would grow and inlining the function
> won't eliminate the need for offline copy of the function,
> don't inline. */
> - if (estimate_edge_growth (e) > allowed_growth
> - && estimate_growth (e->callee) > allowed_growth)
> + if (estimate_edge_growth (e) > allowed_growth)
> {
> if (dump_file)
> fprintf (dump_file,
> @@ -1601,6 +1603,7 @@ static unsigned int
> cgraph_early_inlining (void)
> {
> struct cgraph_node *node = cgraph_get_node (current_function_decl);
> + struct cgraph_edge *edge;
> unsigned int todo = 0;
> int iterations = 0;
> bool inlined = false;
> @@ -1652,6 +1655,19 @@ cgraph_early_inlining (void)
> {
> timevar_push (TV_INTEGRATION);
> todo |= optimize_inline_calls (current_function_decl);
> +
> + /* Technically we ought to recompute inline parameters so the new iteration of
> + early inliner works as expected. We however have values approximately right
> + and thus we only need to update edge info that might be cleared out for
> + newly discovered edges. */
> + for (edge = node->callees; edge; edge = edge->next_callee)
> + {
> + edge->call_stmt_size
> + = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> + edge->call_stmt_time
> + = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> + }
> +
> timevar_pop (TV_INTEGRATION);
> }
>
> Index: ipa-inline.h
> ===================================================================
> --- ipa-inline.h (revision 172396)
> +++ ipa-inline.h (working copy)
> @@ -19,6 +19,30 @@ You should have received a copy of the G
> along with GCC; see the file COPYING3. If not see
> <http://www.gnu.org/licenses/>. */
>
> +/* Function inlining information. */
> +
> +struct inline_summary
> +{
> + /* Estimated stack frame consumption by the function. */
> + HOST_WIDE_INT estimated_self_stack_size;
> +
> + /* Size of the function body. */
> + int self_size;
> + /* How many instructions are likely going to disappear after inlining. */
> + int size_inlining_benefit;
> + /* Estimated time spent executing the function body. */
> + int self_time;
> + /* How much time is going to be saved by inlining. */
> + int time_inlining_benefit;
> +};
> +
> +typedef struct inline_summary inline_summary_t;
> +DEF_VEC_O(inline_summary_t);
> +DEF_VEC_ALLOC_O(inline_summary_t,heap);
> +extern VEC(inline_summary_t,heap) *inline_summary_vec;
> +
> +void debug_inline_summary (struct cgraph_node *);
> +void dump_inline_summaries (FILE *f);
> void inline_generate_summary (void);
> void inline_read_summary (void);
> void inline_write_summary (cgraph_node_set, varpool_node_set);
> @@ -30,7 +54,7 @@ int estimate_growth (struct cgraph_node
> static inline struct inline_summary *
> inline_summary (struct cgraph_node *node)
> {
> - return &node->local.inline_summary;
> + return VEC_index (inline_summary_t, inline_summary_vec, node->uid);
> }
>
> /* Estimate the growth of the caller when inlining EDGE. */
> @@ -39,12 +63,8 @@ static inline int
> estimate_edge_growth (struct cgraph_edge *edge)
> {
> int call_stmt_size;
> - /* ??? We throw away cgraph edges all the time so the information
> - we store in edges doesn't persist for early inlining. Ugh. */
> - if (!edge->call_stmt)
> - call_stmt_size = edge->call_stmt_size;
> - else
> - call_stmt_size = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> + call_stmt_size = edge->call_stmt_size;
> + gcc_checking_assert (call_stmt_size);
> return (edge->callee->global.size
> - inline_summary (edge->callee)->size_inlining_benefit
> - call_stmt_size);
> Index: lto-section-in.c
> ===================================================================
> --- lto-section-in.c (revision 172396)
> +++ lto-section-in.c (working copy)
> @@ -58,7 +58,8 @@ const char *lto_section_name[LTO_N_SECTI
> "reference",
> "symtab",
> "opts",
> - "cgraphopt"
> + "cgraphopt",
> + "inline"
> };
>
> unsigned char
> Index: ipa.c
> ===================================================================
> --- ipa.c (revision 172396)
> +++ ipa.c (working copy)
> @@ -517,6 +517,8 @@ cgraph_remove_unreachable_nodes (bool be
> }
> }
> }
> + if (file)
> + fprintf (file, "\n");
>
> #ifdef ENABLE_CHECKING
> verify_cgraph ();
> Index: ipa-inline-analysis.c
> ===================================================================
> --- ipa-inline-analysis.c (revision 172396)
> +++ ipa-inline-analysis.c (working copy)
> @@ -23,13 +23,13 @@ along with GCC; see the file COPYING3.
>
> We estimate for each function
> - function body size
> - - function runtime
> + - average function execution time
> - inlining size benefit (that is how much of function body size
> and its call sequence is expected to disappear by inlining)
> - inlining time benefit
> - function frame size
> For each call
> - - call sequence size
> + - call statement size and time
>
> inlinie_summary datastructures store above information locally (i.e.
> parameters of the function itself) and globally (i.e. parameters of
> @@ -61,12 +61,100 @@ along with GCC; see the file COPYING3.
> #include "ggc.h"
> #include "tree-flow.h"
> #include "ipa-prop.h"
> +#include "lto-streamer.h"
> #include "ipa-inline.h"
>
> #define MAX_TIME 1000000000
>
> /* Holders of ipa cgraph hooks: */
> static struct cgraph_node_hook_list *function_insertion_hook_holder;
> +static struct cgraph_node_hook_list *node_removal_hook_holder;
> +static struct cgraph_2node_hook_list *node_duplication_hook_holder;
> +static void inline_node_removal_hook (struct cgraph_node *, void *);
> +static void inline_node_duplication_hook (struct cgraph_node *,
> + struct cgraph_node *, void *);
> +
> +/* VECtor holding inline summaries. */
> +VEC(inline_summary_t,heap) *inline_summary_vec;
> +
> +/* Allocate the inline summary vector or resize it to cover all cgraph nodes. */
> +
> +static void
> +inline_summary_alloc (void)
> +{
> + if (!node_removal_hook_holder)
> + node_removal_hook_holder =
> + cgraph_add_node_removal_hook (&inline_node_removal_hook, NULL);
> + if (!node_duplication_hook_holder)
> + node_duplication_hook_holder =
> + cgraph_add_node_duplication_hook (&inline_node_duplication_hook, NULL);
> +
> + if (VEC_length (inline_summary_t, inline_summary_vec)
> + <= (unsigned) cgraph_max_uid)
> + VEC_safe_grow_cleared (inline_summary_t, heap,
> + inline_summary_vec, cgraph_max_uid + 1);
> +}
> +
> +/* Hook that is called by cgraph.c when a node is removed. */
> +
> +static void
> +inline_node_removal_hook (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
> +{
> + /* During IPA-CP updating we can be called on not-yet-analyzed clones. */
> + if (VEC_length (inline_summary_t, inline_summary_vec)
> + <= (unsigned)node->uid)
> + return;
> + memset (inline_summary (node),
> + 0, sizeof (inline_summary_t));
> +}
> +
> +/* Hook that is called by cgraph.c when a node is duplicated. */
> +
> +static void
> +inline_node_duplication_hook (struct cgraph_node *src, struct cgraph_node *dst,
> + ATTRIBUTE_UNUSED void *data)
> +{
> + inline_summary_alloc ();
> + memcpy (inline_summary (dst), inline_summary (src),
> + sizeof (struct inline_summary));
> +}
> +
> +static void
> +dump_inline_summary (FILE *f, struct cgraph_node *node)
> +{
> + if (node->analyzed)
> + {
> + struct inline_summary *s = inline_summary (node);
> + fprintf (f, "Inline summary for %s/%i\n", cgraph_node_name (node),
> + node->uid);
> + fprintf (f, " self time: %i, benefit: %i\n",
> + s->self_time, s->time_inlining_benefit);
> + fprintf (f, " global time: %i\n", node->global.time);
> + fprintf (f, " self size: %i, benefit: %i\n",
> + s->self_size, s->size_inlining_benefit);
> + fprintf (f, " global size: %i\n", node->global.size);
> + fprintf (f, " self stack: %i\n",
> + (int)s->estimated_self_stack_size);
> + fprintf (f, " global stack: %i\n",
> + (int)node->global.estimated_stack_size);
> + }
> +}
> +
> +void
> +debug_inline_summary (struct cgraph_node *node)
> +{
> + dump_inline_summary (stderr, node);
> +}
> +
> +void
> +dump_inline_summaries (FILE *f)
> +{
> + struct cgraph_node *node;
> +
> + for (node = cgraph_nodes; node; node = node->next)
> + if (node->analyzed)
> + dump_inline_summary (f, node);
> +}
>
> /* See if statement might disappear after inlining.
> 0 - means not eliminated
> @@ -179,16 +267,27 @@ estimate_function_body_sizes (struct cgr
> freq, this_size, this_time);
> print_gimple_stmt (dump_file, stmt, 0, 0);
> }
> +
> + if (is_gimple_call (stmt))
> + {
> + struct cgraph_edge *edge = cgraph_edge (node, stmt);
> + edge->call_stmt_size = this_size;
> + edge->call_stmt_time = this_time;
> + }
> +
> this_time *= freq;
> time += this_time;
> size += this_size;
> +
> prob = eliminated_by_inlining_prob (stmt);
> if (prob == 1 && dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, " 50%% will be eliminated by inlining\n");
> if (prob == 2 && dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, " will eliminated by inlining\n");
> +
> size_inlining_benefit += this_size * prob;
> time_inlining_benefit += this_time * prob;
> +
> gcc_assert (time >= 0);
> gcc_assert (size >= 0);
> }
> @@ -222,6 +321,8 @@ compute_inline_parameters (struct cgraph
>
> gcc_assert (!node->global.inlined_to);
>
> + inline_summary_alloc ();
> +
> /* Estimate the stack size for the function if we're optimizing. */
> self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
> inline_summary (node)->estimated_self_stack_size = self_stack_size;
> @@ -247,17 +348,7 @@ compute_inline_parameters (struct cgraph
> node->local.can_change_signature = !e;
> }
> estimate_function_body_sizes (node);
> - /* Compute size of call statements. We have to do this for callers here,
> - those sizes need to be present for edges _to_ us as early as
> - we are finished with early opts. */
> - for (e = node->callers; e; e = e->next_caller)
> - if (e->call_stmt)
> - {
> - e->call_stmt_size
> - = estimate_num_insns (e->call_stmt, &eni_size_weights);
> - e->call_stmt_time
> - = estimate_num_insns (e->call_stmt, &eni_time_weights);
> - }
> +
> /* Inlining characteristics are maintained by the cgraph_mark_inline. */
> node->global.time = inline_summary (node)->self_time;
> node->global.size = inline_summary (node)->self_size;
> @@ -300,12 +391,8 @@ static inline int
> estimate_edge_time (struct cgraph_edge *edge)
> {
> int call_stmt_time;
> - /* ??? We throw away cgraph edges all the time so the information
> - we store in edges doesn't persist for early inlining. Ugh. */
> - if (!edge->call_stmt)
> - call_stmt_time = edge->call_stmt_time;
> - else
> - call_stmt_time = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> + call_stmt_time = edge->call_stmt_time;
> + gcc_checking_assert (call_stmt_time);
> return (((gcov_type)edge->callee->global.time
> - inline_summary (edge->callee)->time_inlining_benefit
> - call_stmt_time) * edge->frequency
> @@ -379,8 +466,10 @@ estimate_growth (struct cgraph_node *nod
> return growth;
> }
>
> +
> /* This function performs intraprocedural analysis in NODE that is required to
> inline indirect calls. */
> +
> static void
> inline_indirect_intraprocedural_analysis (struct cgraph_node *node)
> {
> @@ -437,8 +526,6 @@ inline_generate_summary (void)
> for (node = cgraph_nodes; node; node = node->next)
> if (node->analyzed)
> inline_analyze_function (node);
> -
> - return;
> }
>
>
> @@ -449,6 +536,57 @@ inline_generate_summary (void)
> void
> inline_read_summary (void)
> {
> + struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
> + struct lto_file_decl_data *file_data;
> + unsigned int j = 0;
> +
> + inline_summary_alloc ();
> +
> + while ((file_data = file_data_vec[j++]))
> + {
> + size_t len;
> + const char *data = lto_get_section_data (file_data, LTO_section_inline_summary, NULL, &len);
> +
> + struct lto_input_block *ib
> + = lto_create_simple_input_block (file_data,
> + LTO_section_inline_summary,
> + &data, &len);
> + if (ib)
> + {
> + unsigned int i;
> + unsigned int f_count = lto_input_uleb128 (ib);
> +
> + for (i = 0; i < f_count; i++)
> + {
> + unsigned int index;
> + struct cgraph_node *node;
> + struct inline_summary *info;
> + lto_cgraph_encoder_t encoder;
> +
> + index = lto_input_uleb128 (ib);
> + encoder = file_data->cgraph_node_encoder;
> + node = lto_cgraph_encoder_deref (encoder, index);
> + info = inline_summary (node);
> +
> + node->global.estimated_stack_size
> + = info->estimated_self_stack_size = lto_input_uleb128 (ib);
> + node->global.time = info->self_time = lto_input_uleb128 (ib);
> + info->time_inlining_benefit = lto_input_uleb128 (ib);
> + node->global.size = info->self_size = lto_input_uleb128 (ib);
> + info->size_inlining_benefit = lto_input_uleb128 (ib);
> + node->global.estimated_growth = INT_MIN;
> + }
> +
> + lto_destroy_simple_input_block (file_data,
> + LTO_section_inline_summary,
> + ib, data, len);
> + }
> + else
> + /* Fatal error here. We do not want to support compiling ltrans units with
> + different version of compiler or different flags than the WPA unit, so
> + this should never happen. */
> + fatal_error ("inline summary is missing in ltrans unit");
> + }
> if (flag_indirect_inlining)
> {
> ipa_register_cgraph_hooks ();
> @@ -468,14 +606,57 @@ void
> inline_write_summary (cgraph_node_set set,
> varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> + struct cgraph_node *node;
> + struct lto_simple_output_block *ob
> + = lto_create_simple_output_block (LTO_section_inline_summary);
> + lto_cgraph_encoder_t encoder = ob->decl_state->cgraph_node_encoder;
> + unsigned int count = 0;
> + int i;
> +
> + for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
> + if (lto_cgraph_encoder_deref (encoder, i)->analyzed)
> + count++;
> + lto_output_uleb128_stream (ob->main_stream, count);
> +
> + for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
> + {
> + node = lto_cgraph_encoder_deref (encoder, i);
> + if (node->analyzed)
> + {
> + struct inline_summary *info = inline_summary (node);
> + lto_output_uleb128_stream (ob->main_stream,
> + lto_cgraph_encoder_encode (encoder, node));
> + lto_output_sleb128_stream (ob->main_stream,
> + info->estimated_self_stack_size);
> + lto_output_sleb128_stream (ob->main_stream,
> + info->self_size);
> + lto_output_sleb128_stream (ob->main_stream,
> + info->size_inlining_benefit);
> + lto_output_sleb128_stream (ob->main_stream,
> + info->self_time);
> + lto_output_sleb128_stream (ob->main_stream,
> + info->time_inlining_benefit);
> + }
> + }
> +
> if (flag_indirect_inlining && !flag_ipa_cp)
> ipa_prop_write_jump_functions (set);
> }
>
> +
> /* Release inline summary. */
>
> void
> inline_free_summary (void)
> {
> - cgraph_remove_function_insertion_hook (function_insertion_hook_holder);
> + if (function_insertion_hook_holder)
> + cgraph_remove_function_insertion_hook (function_insertion_hook_holder);
> + function_insertion_hook_holder = NULL;
> + if (node_removal_hook_holder)
> + cgraph_remove_node_removal_hook (node_removal_hook_holder);
> + node_removal_hook_holder = NULL;
> + if (node_duplication_hook_holder)
> + cgraph_remove_node_duplication_hook (node_duplication_hook_holder);
> + node_duplication_hook_holder = NULL;
> + VEC_free (inline_summary_t, heap, inline_summary_vec);
> }
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c (revision 172396)
> +++ lto/lto.c (working copy)
> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.
> #include "lto-streamer.h"
> #include "splay-tree.h"
> #include "params.h"
> +#include "ipa-inline.h"
>
> static GTY(()) tree first_personality_decl;
>
> @@ -750,7 +751,7 @@ add_cgraph_node_to_partition (ltrans_par
> {
> struct cgraph_edge *e;
>
> - part->insns += node->local.inline_summary.self_size;
> + part->insns += inline_summary (node)->self_size;
>
> if (node->aux)
> {
> @@ -811,7 +812,7 @@ undo_partition (ltrans_partition partiti
> struct cgraph_node *node = VEC_index (cgraph_node_ptr,
> partition->cgraph_set->nodes,
> n_cgraph_nodes);
> - partition->insns -= node->local.inline_summary.self_size;
> + partition->insns -= inline_summary (node)->self_size;
> cgraph_node_set_remove (partition->cgraph_set, node);
> node->aux = (void *)((size_t)node->aux - 1);
> }
> Index: lto/Make-lang.in
> ===================================================================
> --- lto/Make-lang.in (revision 172396)
> +++ lto/Make-lang.in (working copy)
> @@ -85,7 +85,8 @@ lto/lto.o: lto/lto.c $(CONFIG_H) $(SYSTE
> $(CGRAPH_H) $(GGC_H) tree-ssa-operands.h $(TREE_PASS_H) \
> langhooks.h $(VEC_H) $(BITMAP_H) pointer-set.h $(IPA_PROP_H) \
> $(COMMON_H) debug.h $(TIMEVAR_H) $(GIMPLE_H) $(LTO_H) $(LTO_TREE_H) \
> - $(LTO_TAGS_H) $(LTO_STREAMER_H) $(SPLAY_TREE_H) gt-lto-lto.h $(PARAMS_H)
> + $(LTO_TAGS_H) $(LTO_STREAMER_H) $(SPLAY_TREE_H) gt-lto-lto.h $(PARAMS_H) \
> + ipa-inline.h
> lto/lto-object.o: lto/lto-object.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(DIAGNOSTIC_CORE_H) $(LTO_H) $(TM_H) $(LTO_STREAMER_H) \
> ../include/simple-object.h
> Index: ipa-prop.c
> ===================================================================
> --- ipa-prop.c (revision 172396)
> +++ ipa-prop.c (working copy)
> @@ -1998,7 +1998,7 @@ ipa_edge_duplication_hook (struct cgraph
>
> static void
> ipa_node_duplication_hook (struct cgraph_node *src, struct cgraph_node *dst,
> - __attribute__((unused)) void *data)
> + ATTRIBUTE_UNUSED void *data)
> {
> struct ipa_node_params *old_info, *new_info;
> int param_count, i;
> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 172396)
> +++ Makefile.in (working copy)
> @@ -3011,7 +3011,7 @@ ipa-ref.o : ipa-ref.c $(CONFIG_H) $(SYST
> ipa-cp.o : ipa-cp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(TREE_H) $(TARGET_H) $(GIMPLE_H) $(CGRAPH_H) $(IPA_PROP_H) $(TREE_FLOW_H) \
> $(TREE_PASS_H) $(FLAGS_H) $(TIMEVAR_H) $(DIAGNOSTIC_H) $(TREE_DUMP_H) \
> - $(TREE_INLINE_H) $(FIBHEAP_H) $(PARAMS_H) tree-pretty-print.h
> + $(TREE_INLINE_H) $(FIBHEAP_H) $(PARAMS_H) tree-pretty-print.h ipa-inline.h
> ipa-split.o : ipa-split.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(TREE_H) $(TARGET_H) $(CGRAPH_H) $(IPA_PROP_H) $(TREE_FLOW_H) \
> $(TREE_PASS_H) $(FLAGS_H) $(TIMEVAR_H) $(DIAGNOSTIC_H) $(TREE_DUMP_H) \
> @@ -3032,7 +3032,7 @@ ipa-inline-analysis.o : ipa-inline-analy
> $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
> $(DIAGNOSTIC_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
> $(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(IPA_PROP_H) \
> - gimple-pretty-print.h ipa-inline.h
> + gimple-pretty-print.h ipa-inline.h $(LTO_STREAMER_H)
> ipa-utils.o : ipa-utils.c $(IPA_UTILS_H) $(CONFIG_H) $(SYSTEM_H) \
> coretypes.h $(TM_H) $(TREE_H) $(TREE_FLOW_H) $(TREE_INLINE_H) langhooks.h \
> pointer-set.h $(GGC_H) $(GIMPLE_H) $(SPLAY_TREE_H) \
> Index: lto-streamer.h
> ===================================================================
> --- lto-streamer.h (revision 172396)
> +++ lto-streamer.h (working copy)
> @@ -264,6 +264,7 @@ enum lto_section_type
> LTO_section_symtab,
> LTO_section_opts,
> LTO_section_cgraph_opt_sum,
> + LTO_section_inline_summary,
> LTO_N_SECTION_TYPES /* Must be last. */
> };
>
>