On Fri, 6 May 2011, Jan Hubicka wrote:

> Hi,
> this patch implements thunks as real cgraph nodes instead of alias nodes. I
> am not entirely happy about it, but I can't come up with anything better.
>
> The main problem is that thunks can be seen in two ways:
>
> 1) As alternative entry points into functions
>
>    This is how the existing code attempts to be structured: thunks do not
>    appear in the callgraph; instead, the callgraph edges point to the
>    functions the thunks are associated with.
>
>    The problem with the current code is that neither the IPA code nor the
>    rest of the compiler is familiar with the concept of alternative entry
>    points. Consequently, direct calls to thunks appear in the program
>    in the same way as direct calls to the functions they are associated
>    with, which may lead to miscompilations when we decide to
>    inline and ignore the thunk, or do ipa-prop.
>
>    As a temporary measure, we declared direct calls to thunks invalid.
>    This led to the need for devirtualization code to "inline" the thunk
>    when devirtualizing the call, or to not devirtualize. For simple thunks
>    this is no big deal, but for covariant thunks this implies
>    extra control flow, which is something Richi doesn't like.
>    Also we now devirtualize "implicitly" via folding lookups into the
>    vtables. Requiring that code to ponder thunk adjustments doesn't
>    look quite right.
>
>    The next problem is that with LTO we can merge a direct call to an
>    external function with a thunk, and in this case we have to represent
>    the direct call to the thunk.
>
>    Allowing direct calls to thunks would mean adding the concept of entry
>    points to callgraph edges, which would mean one more pointer to something
>    that would describe it. Most probably a chain of thunk structures:
>    we do allow and build thunks of thunks.
>
>    We discussed this quite a few times on IRC and it was always voted
>    down as weird.
>    One argument against is that it will be easy to introduce
>    simple wrong-code bugs by forgetting about the info hanging on cgraph
>    edges, since in most cases there is nothing.
>
> 2) As real functions calling the function they are associated with.
>
>    Because the backend doesn't handle alternative entry points, we really
>    implement thunks as small functions that usually tail call into the
>    associated functions after doing adjustments to THIS.
>
> The other natural abstraction seems to be to handle thunks as real functions.
> This is what the patch does. There are several issues with this.
>
> 1) Not all thunks have bodies that are representable in gimple. The variadic
>    thunks currently don't have any gimple representation. While we can
>    come up with some, there is not much value in it because...
> 2) We can't expand thunks into RTL. On many archs we have existing
>    ASM output machinery that leads to better code (and the only possible code
>    for variadic thunks, which are not really representable in RTL either).
> 3) Thunks are not real functions in the C++ ABI sense. They share comdat
>    groups implicitly and they must be output in a specified order to
>    get proper comdat group signatures.
>
> This patch takes this route and does the compensation where needed.
> In particular, all IPA passes that worry about gimple bodies need
> to be updated to handle thunks. This is not that hard to do and as a
> first cut I simply disabled inlining, ipa-prop and cloning on thunks.
> We can handle that incrementally.
>
> The problem of thunks is related to the problem of proper representation of
> aliases.
> Again, aliases can be "transparent", that is, not having cgraph nodes for them
> and all edges going to the final destination, or they can be separate nodes.
> I originally intended to go for the first case, which also has a problem with
> representing the visibilities of aliases: i.e.
> depending on the alias used, the
> edges may or may not be overwritable by the linker, so the alternative entry
> point info would need to represent this, too.
>
> With thunks as separate nodes, I will turn aliases into separate nodes, too,
> which will be linked via the ipa-ref infrastructure (i.e. in addition to
> load/store and address links we will also have alias links).
>
> Because IPA passes really care about objects themselves, not the aliases
> (i.e. ipa-reference or ipa-pta wants to see the variable and all its aliases
> as one object, and so do the inliner and ipa-propagate), we will need to add
> some accessor functions that will walk to the real destination of the edge
> and also walk all real objects referencing the given object, skipping the
> aliases.
>
> This approach has the advantage of getting cgraph/varpool closer to a symbol
> table and making things a bit easier on the lto-symtab side.
>
> The patch does basically the following:
>
> 1) turns thunks from alias nodes into function nodes with the
>    node->thunk.thunk_p flag set
> 2) updates the verifier, dumping and LTO streaming to handle them correctly
> 3) updates the way functions are expanded: thunks cannot be handled as
>    normal functions, since they are required to appear at a specific place
>    in the asm file
> 4) Adds a hack to ipa visibility since the C++ frontend gets visibility of
>    thunks wrong. COMDAT functions do have non-comdat thunks and they are not
>    in the same comdat group. Fixing this in the C++ frontend seems hard
>    because it does use the flags for other purposes
> 5) Adds code to ipa-inline-analysis and ipa-prop to make thunks as opaque
>    as possible, for now
> 6) Makes ipa-pure-const see them transparently.
> 7) Talks ipa-cp out of the idea of redirecting thunks
> 8) Updates WHOPR partitioning so thunks are always associated with their
>    functions and not split into a different partition.
> 9) Adds FOR_EACH_FUNCTION_WITH_GIMPLE_BODY/FOR_EACH_DEFINED_FUNCTION
>    functions to walk cgraph nodes.
>    This is borrowed from my symtab code
>    where I no longer have the toplevel list of cgraph nodes per se, just
>    a list of symbols (and symbols are functions, variables and aliases)
>
> The patch regtests & bootstraps on x86_64-linux. I plan to give it more
> testing with Mozilla and other C++ apps and wait a few days for comments
> before committing.
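
For readers following the description above, here is a minimal sketch of the two ideas it relies on: a filtered walk over the node list that skips thunks, and following a chain of thunks to the function they are finally attached to. It is not GCC code. `struct toy_node`, `toy_nodes` and every `toy_*` function are hypothetical stand-ins invented for illustration; only the shape of the predicate (`analyzed && !thunk_p`) and of the first/next iterator macro follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct cgraph_node.  */
struct toy_node
{
  const char *name;
  bool analyzed;            /* Node is defined in this unit.  */
  bool thunk_p;             /* Node is a thunk, so it has no gimple body.  */
  struct toy_node *callee;  /* For thunks: target of the single out-edge.  */
  struct toy_node *next;    /* Next node in the global list.  */
};

/* Hypothetical global list head, playing the role of cgraph_nodes.  */
struct toy_node *toy_nodes;

/* Same shape as cgraph_function_with_gimple_body_p in the patch.  */
bool
toy_function_with_gimple_body_p (const struct toy_node *node)
{
  return node->analyzed && !node->thunk_p;
}

struct toy_node *
toy_first_function_with_gimple_body (void)
{
  struct toy_node *node;
  for (node = toy_nodes; node; node = node->next)
    if (toy_function_with_gimple_body_p (node))
      return node;
  return NULL;
}

struct toy_node *
toy_next_function_with_gimple_body (struct toy_node *node)
{
  for (node = node->next; node; node = node->next)
    if (toy_function_with_gimple_body_p (node))
      return node;
  return NULL;
}

/* Walk all nodes that have a body, skipping thunks and external nodes.  */
#define FOR_EACH_TOY_FUNCTION_WITH_GIMPLE_BODY(node) \
  for ((node) = toy_first_function_with_gimple_body (); (node); \
       (node) = toy_next_function_with_gimple_body (node))

/* Follow thunks of thunks down to the real function.  */
struct toy_node *
toy_thunk_target (struct toy_node *node)
{
  while (node->thunk_p)
    {
      assert (node->callee);  /* A thunk has exactly one out-edge.  */
      node = node->callee;
    }
  return node;
}

/* Build the list f -> thunk1 -> thunk2 -> g -> ext, where thunk2 calls
   thunk1 and thunk1 calls f.  Returns the outermost thunk.  */
struct toy_node *
toy_build_graph (void)
{
  static struct toy_node f, thunk1, thunk2, g, ext;

  f      = (struct toy_node) { "f",      true,  false, NULL,    &thunk1 };
  thunk1 = (struct toy_node) { "thunk1", true,  true,  &f,      &thunk2 };
  thunk2 = (struct toy_node) { "thunk2", true,  true,  &thunk1, &g };
  g      = (struct toy_node) { "g",      true,  false, NULL,    &ext };
  ext    = (struct toy_node) { "ext",    false, false, NULL,    NULL };
  toy_nodes = &f;
  return &thunk2;
}

/* Count the nodes a body-only pass would visit.  */
int
toy_count_bodies (void)
{
  struct toy_node *node;
  int count = 0;
  FOR_EACH_TOY_FUNCTION_WITH_GIMPLE_BODY (node)
    count++;
  return count;
}
```

In this toy graph a body-only walk visits just `f` and `g`, and resolving `thunk2` ends at `f`. The same "follow the single callee while the node is a thunk" loop is what the ipa.c hunk of the patch uses to find the function a thunk is attached to.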
Some comments/questions inline

> Honza
>
> 	* cgraph.c (cgraph_add_thunk): Create real function node instead
> 	of alias node; finalize it and mark needed/reachable; arrange visibility
> 	to be right and add it into the corresponding same comdat group list.
> 	(dump_cgraph_node): Dump thunks.
> 	* cgraph.h (cgraph_first_defined_function, cgraph_next_defined_function,
> 	cgraph_function_with_gimple_body_p,
> 	cgraph_first_function_with_gimple_body,
> 	cgraph_next_function_with_gimple_body): New functions.
> 	(FOR_EACH_FUNCTION_WITH_GIMPLE_BODY, FOR_EACH_DEFINED_FUNCTION):
> 	New macros.
> 	* ipa-cp.c (ipcp_need_redirect_p): Thunks can't be redirected.
> 	(ipcp_generate_summary): Use FOR_EACH_FUNCTION_WITH_GIMPLE_BODY.
> 	* cgraphunit.c (cgraph_finalize_function): Only look into possible
> 	devirtualization when optimizing.
> 	(verify_cgraph_node): Verify thunks.
> 	(cgraph_analyze_function): Analyze thunks.
> 	(cgraph_mark_functions_to_output): Output thunks only in combination
> 	with function they are assigned to.
> 	(assemble_thunk): Turn thunk into non-thunk; don't try to turn
> 	alias into normal node.
> 	(assemble_thunks): New function.
> 	(cgraph_expand_function): Use it.
> 	* lto-cgraph.c (lto_output_node): Stream thunks.
> 	(input_overwrite_node): Stream in thunks.
> 	* ipa-pure-const.c (analyze_function): Thunks do nothing interesting.
> 	* lto-streamer-out.c (lto_output): Do not try to output thunk's body.
> 	* ipa-inline.c (inline_small_functions): Use FOR_EACH_DEFINED_FUNCTION.
> 	* ipa-inline-analysis.c (compute_inline_parameters): "Analyze" thunks.
> 	(inline_analyze_function): Do not care about thunk jump functions.
> 	(inline_generate_summary): Use FOR_EACH_DEFINED_FUNCTION.
> 	* ipa-prop.c (ipa_prop_write_jump_functions): Use
> 	cgraph_function_with_gimple_body_p.
> 	* passes.c (do_per_function_toporder): Use
> 	cgraph_function_with_gimple_body_p.
> 	(execute_one_pass): Use FOR_EACH_FUNCTION_WITH_GIMPLE_BODY.
> 	(ipa_write_summaries): Use cgraph_function_with_gimple_body_p.
> (function_called_by_processed_nodes_p): Likewise. > > * lto.c (lto_materialize_function): Use > cgraph_function_with_gimple_body_p. > (add_cgraph_node_to_partition): Do not re-add items to partition; > handle thunks. > (add_varpool_node_to_partition): Do not re-add items to partition. > > Index: cgraph.c > =================================================================== > *** cgraph.c (revision 173251) > --- cgraph.c (working copy) > *************** cgraph_same_body_alias (struct cgraph_no > *** 595,608 **** > See comments in thunk_adjust for detail on the parameters. */ > > struct cgraph_node * > ! cgraph_add_thunk (struct cgraph_node *decl_node, tree alias, tree decl, > bool this_adjusting, > HOST_WIDE_INT fixed_offset, HOST_WIDE_INT virtual_value, > tree virtual_offset, > tree real_alias) > { > ! struct cgraph_node *node = cgraph_get_node (alias); > > if (node) > { > gcc_assert (node->local.finalized); > --- 595,610 ---- > See comments in thunk_adjust for detail on the parameters. */ > > struct cgraph_node * > ! cgraph_add_thunk (struct cgraph_node *decl_node ATTRIBUTE_UNUSED, > ! tree alias, tree decl, > bool this_adjusting, > HOST_WIDE_INT fixed_offset, HOST_WIDE_INT virtual_value, > tree virtual_offset, > tree real_alias) > { > ! struct cgraph_node *node; > > + node = cgraph_get_node (alias); > if (node) > { > gcc_assert (node->local.finalized); > *************** cgraph_add_thunk (struct cgraph_node *de > *** 610,617 **** > cgraph_remove_node (node); > } > > ! node = cgraph_same_body_alias_1 (decl_node, alias, decl); > ! gcc_assert (node); > gcc_checking_assert (!virtual_offset > || tree_int_cst_equal (virtual_offset, > size_int (virtual_value))); > --- 612,618 ---- > cgraph_remove_node (node); > } > > ! 
node = cgraph_create_node (alias); > gcc_checking_assert (!virtual_offset > || tree_int_cst_equal (virtual_offset, > size_int (virtual_value))); > *************** cgraph_add_thunk (struct cgraph_node *de > *** 621,626 **** > --- 622,636 ---- > node->thunk.virtual_offset_p = virtual_offset != NULL; > node->thunk.alias = real_alias; > node->thunk.thunk_p = true; > + node->local.finalized = true; > + > + if (cgraph_decide_is_function_needed (node, decl)) > + cgraph_mark_needed_node (node); > + > + if ((TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl)) > + || (DECL_VIRTUAL_P (decl) > + && optimize && (DECL_COMDAT (decl) || DECL_EXTERNAL (decl)))) && optimize ? That somehow looks weird. > + cgraph_mark_reachable_node (node); > return node; > } > > *************** dump_cgraph_node (FILE *f, struct cgraph > *** 1874,1880 **** > if (node->only_called_at_exit) > fprintf (f, " only_called_at_exit"); > > ! fprintf (f, "\n called by: "); > for (edge = node->callers; edge; edge = edge->next_caller) > { > fprintf (f, "%s/%i ", cgraph_node_name (edge->caller), > --- 1884,1907 ---- > if (node->only_called_at_exit) > fprintf (f, " only_called_at_exit"); > > ! fprintf (f, "\n"); > ! > ! if (node->thunk.thunk_p) > ! { > ! if (node->thunk.thunk_p) > ! { > ! fprintf (f, " thunk of %s (asm: %s) fixed offset %i virtual value %i > has " > ! "virtual offset %i)\n", > ! lang_hooks.decl_printable_name (node->thunk.alias, 2), > ! IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->thunk.alias)), > ! (int)node->thunk.fixed_offset, > ! (int)node->thunk.virtual_value, > ! (int)node->thunk.virtual_offset_p); > ! } > ! } > ! > ! fprintf (f, " called by: "); > ! > for (edge = node->callers; edge; edge = edge->next_caller) > { > fprintf (f, "%s/%i ", cgraph_node_name (edge->caller), > *************** dump_cgraph_node (FILE *f, struct cgraph > *** 1926,1945 **** > if (node->same_body) > { > struct cgraph_node *n; > ! 
fprintf (f, " aliases & thunks:"); > for (n = node->same_body; n; n = n->next) > { > fprintf (f, " %s/%i", cgraph_node_name (n), n->uid); > - if (n->thunk.thunk_p) > - { > - fprintf (f, " (thunk of %s fixed offset %i virtual value %i has " > - "virtual offset %i", > - lang_hooks.decl_printable_name (n->thunk.alias, 2), > - (int)n->thunk.fixed_offset, > - (int)n->thunk.virtual_value, > - (int)n->thunk.virtual_offset_p); > - fprintf (f, ")"); > - } > if (DECL_ASSEMBLER_NAME_SET_P (n->decl)) > fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME > (n->decl))); > } > --- 1953,1962 ---- > if (node->same_body) > { > struct cgraph_node *n; > ! fprintf (f, " aliases:"); > for (n = node->same_body; n; n = n->next) > { > fprintf (f, " %s/%i", cgraph_node_name (n), n->uid); > if (DECL_ASSEMBLER_NAME_SET_P (n->decl)) > fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME > (n->decl))); > } > Index: cgraph.h > =================================================================== > *** cgraph.h (revision 173251) > --- cgraph.h (working copy) > *************** varpool_next_static_initializer (struct > *** 715,720 **** > --- 715,793 ---- > for ((node) = varpool_first_static_initializer (); (node); \ > (node) = varpool_next_static_initializer (node)) > > + /* Return first function with body defined. */ > + static inline struct cgraph_node * > + cgraph_first_defined_function (void) > + { > + struct cgraph_node *node; > + for (node = cgraph_nodes; node; node = node->next) > + { > + if (node->analyzed) > + return node; > + } > + return NULL; > + } > + > + /* Return next reachable static variable with initializer after NODE. */ > + static inline struct cgraph_node * > + cgraph_next_defined_function (struct cgraph_node *node) > + { > + for (node = node->next; node; node = node->next) > + { > + if (node->analyzed) > + return node; > + } > + return NULL; > + } > + > + /* Walk all functions with body defined. 
*/ > + #define FOR_EACH_DEFINED_FUNCTION(node) \ > + for ((node) = cgraph_first_defined_function (); (node); \ > + (node) = cgraph_next_defined_function (node)) > + > + > + /* Return true when NODE is a function with Gimple body defined > + in current unit. Functions can also be define externally or they > + can be thunks with no Gimple representation. > + > + Note that at WPA stage, the function body may not be present in memory. > */ > + > + static inline bool > + cgraph_function_with_gimple_body_p (struct cgraph_node *node) > + { > + return node->analyzed && !node->thunk.thunk_p; > + } > + > + /* Return first function with body defined. */ > + static inline struct cgraph_node * > + cgraph_first_function_with_gimple_body (void) > + { > + struct cgraph_node *node; > + for (node = cgraph_nodes; node; node = node->next) > + { > + if (cgraph_function_with_gimple_body_p (node)) > + return node; > + } > + return NULL; > + } > + > + /* Return next reachable static variable with initializer after NODE. */ > + static inline struct cgraph_node * > + cgraph_next_function_with_gimple_body (struct cgraph_node *node) > + { > + for (node = node->next; node; node = node->next) > + { > + if (cgraph_function_with_gimple_body_p (node)) > + return node; > + } > + return NULL; > + } > + > + /* Walk all functions with body defined. */ > + #define FOR_EACH_FUNCTION_WITH_GIMPLE_BODY(node) \ > + for ((node) = cgraph_first_function_with_gimple_body (); (node); \ > + (node) = cgraph_next_function_with_gimple_body (node)) > + > /* Create a new static variable of type TYPE. */ > tree add_new_static_var (tree type); > > Index: ipa-cp.c > =================================================================== > *** ipa-cp.c (revision 173251) > --- ipa-cp.c (working copy) > *************** ipcp_need_redirect_p (struct cgraph_edge > *** 951,956 **** > --- 951,960 ---- > if (!n_cloning_candidates) > return false; > > + /* We can't redirect anything in thunks, yet. 
*/ > + if (cs->caller->thunk.thunk_p) > + return true; > + > if ((orig = ipcp_get_orig_node (node)) != NULL) > node = orig; > if (ipcp_get_orig_node (cs->caller)) > *************** ipcp_generate_summary (void) > *** 1508,1515 **** > fprintf (dump_file, "\nIPA constant propagation start:\n"); > ipa_register_cgraph_hooks (); > > ! for (node = cgraph_nodes; node; node = node->next) > ! if (node->analyzed) > { > /* Unreachable nodes should have been eliminated before ipcp. */ > gcc_assert (node->needed || node->reachable); > --- 1512,1520 ---- > fprintf (dump_file, "\nIPA constant propagation start:\n"); > ipa_register_cgraph_hooks (); > > ! /* FIXME: We could propagate through thunks happily and we could be > ! even able to clone them, if needed. Do that later. */ > ! FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > { > /* Unreachable nodes should have been eliminated before ipcp. */ > gcc_assert (node->needed || node->reachable); > Index: cgraphunit.c > =================================================================== > *** cgraphunit.c (revision 173251) > --- cgraphunit.c (working copy) > *************** cgraph_finalize_function (tree decl, boo > *** 370,376 **** > to those so we need to analyze them. > FIXME: We should introduce may edges for this purpose and update > their handling in unreachable function removal and inliner too. */ > ! || (DECL_VIRTUAL_P (decl) && (DECL_COMDAT (decl) || DECL_EXTERNAL > (decl)))) > cgraph_mark_reachable_node (node); > > /* If we've not yet emitted decl, tell the debug info about it. */ > --- 370,377 ---- > to those so we need to analyze them. > FIXME: We should introduce may edges for this purpose and update > their handling in unreachable function removal and inliner too. */ > ! || (DECL_VIRTUAL_P (decl) > ! && optimize && (DECL_COMDAT (decl) || DECL_EXTERNAL (decl)))) again? > cgraph_mark_reachable_node (node); > > /* If we've not yet emitted decl, tell the debug info about it. 
*/ > *************** verify_cgraph_node (struct cgraph_node * > *** 624,633 **** > while (n != node); > } > > ! if (node->analyzed && gimple_has_body_p (node->decl) > ! && !TREE_ASM_WRITTEN (node->decl) > ! && (!DECL_EXTERNAL (node->decl) || node->global.inlined_to) > ! && !flag_wpa) > { > if (this_cfun->cfg) > { > --- 625,652 ---- > while (n != node); > } > > ! if (node->analyzed && node->thunk.thunk_p) > ! { > ! if (!node->callees) > ! { > ! error ("No edge out of thunk node"); > ! error_found = true; > ! } > ! else if (node->callees->next_callee) > ! { > ! error ("More than one edge out of thunk node"); > ! error_found = true; > ! } > ! if (gimple_has_body_p (node->decl)) > ! { > ! error ("Thunk is not supposed to have body"); > ! error_found = true; Is that true for targets that do not have asm thunks? Thus, did we get rid of the path in the C++ FE that emits thunks as regular GENERIC functions? > ! } > ! } > ! else if (node->analyzed && gimple_has_body_p (node->decl) > ! && !TREE_ASM_WRITTEN (node->decl) > ! && (!DECL_EXTERNAL (node->decl) || node->global.inlined_to) > ! && !flag_wpa) > { > if (this_cfun->cfg) > { > *************** verify_cgraph_node (struct cgraph_node * > *** 656,663 **** > } > if (!e->indirect_unknown_callee) > { > - struct cgraph_node *n; > - > if (e->callee->same_body_alias) > { > error ("edge points to same body alias:"); > --- 675,680 ---- > *************** verify_cgraph_node (struct cgraph_node * > *** 678,693 **** > debug_tree (decl); > error_found = true; > } > - else if (decl > - && (n = cgraph_get_node_or_alias (decl)) > - && (n->same_body_alias > - && n->thunk.thunk_p)) > - { > - error ("a call to thunk improperly represented " > - "in the call graph:"); > - cgraph_debug_gimple_stmt (this_cfun, stmt); > - error_found = true; > - } > } > else if (decl) > { > --- 695,700 ---- > *************** cgraph_analyze_function (struct cgraph_n > *** 780,802 **** > tree save = current_function_decl; > tree decl = node->decl; > > ! 
current_function_decl = decl; > ! push_cfun (DECL_STRUCT_FUNCTION (decl)); > > ! assign_assembler_name_if_neeeded (node->decl); > > ! /* Make sure to gimplify bodies only once. During analyzing a > ! function we lower it, which will require gimplified nested > ! functions, so we can end up here with an already gimplified > ! body. */ > ! if (!gimple_body (decl)) > ! gimplify_function_tree (decl); > ! dump_function (TDI_generic, decl); > > ! cgraph_lower_function (node); > node->analyzed = true; > > - pop_cfun (); > current_function_decl = save; > } > > --- 787,817 ---- > tree save = current_function_decl; > tree decl = node->decl; > > ! if (node->thunk.thunk_p) > ! { > ! cgraph_create_edge (node, cgraph_get_node (node->thunk.alias), > ! NULL, 0, CGRAPH_FREQ_BASE); Ick ;) Why not do this at thunk cgraph node creation time? > ! } > ! else > ! { > ! current_function_decl = decl; > ! push_cfun (DECL_STRUCT_FUNCTION (decl)); > > ! assign_assembler_name_if_neeeded (node->decl); > > ! /* Make sure to gimplify bodies only once. During analyzing a > ! function we lower it, which will require gimplified nested > ! functions, so we can end up here with an already gimplified > ! body. */ > ! if (!gimple_body (decl)) > ! gimplify_function_tree (decl); > ! dump_function (TDI_generic, decl); > > ! cgraph_lower_function (node); > ! pop_cfun (); > ! } > node->analyzed = true; > > current_function_decl = save; > } > > *************** cgraph_analyze_functions (void) > *** 969,975 **** > /* ??? It is possible to create extern inline function and later using > weak alias attribute to kill its body. See > gcc.c-torture/compile/20011119-1.c */ > ! if (!DECL_STRUCT_FUNCTION (decl)) > { > cgraph_reset_node (node); > continue; > --- 984,991 ---- > /* ??? It is possible to create extern inline function and later using > weak alias attribute to kill its body. See > gcc.c-torture/compile/20011119-1.c */ > ! if (!DECL_STRUCT_FUNCTION (decl) > ! 
&& !node->thunk.thunk_p) > { > cgraph_reset_node (node); > continue; > *************** cgraph_analyze_functions (void) > *** 1031,1040 **** > tree decl = node->decl; > next = node->next; > > ! if (node->local.finalized && !gimple_has_body_p (decl)) > cgraph_reset_node (node); > > ! if (!node->reachable && gimple_has_body_p (decl)) > { > if (cgraph_dump_file) > fprintf (cgraph_dump_file, " %s", cgraph_node_name (node)); > --- 1047,1058 ---- > tree decl = node->decl; > next = node->next; > > ! if (node->local.finalized && !gimple_has_body_p (decl) > ! && !node->thunk.thunk_p) > cgraph_reset_node (node); > > ! if (!node->reachable > ! && (gimple_has_body_p (decl) || node->thunk.thunk_p)) > { > if (cgraph_dump_file) > fprintf (cgraph_dump_file, " %s", cgraph_node_name (node)); > *************** cgraph_analyze_functions (void) > *** 1043,1049 **** > } > else > node->next_needed = NULL; > ! gcc_assert (!node->local.finalized || gimple_has_body_p (decl)); > gcc_assert (node->analyzed == node->local.finalized); > } > if (cgraph_dump_file) > --- 1061,1068 ---- > } > else > node->next_needed = NULL; > ! gcc_assert (!node->local.finalized || node->thunk.thunk_p > ! || gimple_has_body_p (decl)); > gcc_assert (node->analyzed == node->local.finalized); > } > if (cgraph_dump_file) > *************** cgraph_mark_functions_to_output (void) > *** 1132,1137 **** > --- 1151,1157 ---- > always inlined, as well as those that are reachable from > outside the current compilation unit. */ > if (node->analyzed > + && !node->thunk.thunk_p > && !node->global.inlined_to > && (!cgraph_only_called_directly_p (node) > || (e && node->reachable)) > *************** cgraph_mark_functions_to_output (void) > *** 1145,1151 **** > for (next = node->same_comdat_group; > next != node; > next = next->same_comdat_group) > ! next->process = 1; > } > } > else if (node->same_comdat_group) > --- 1165,1172 ---- > for (next = node->same_comdat_group; > next != node; > next = next->same_comdat_group) > ! 
if (!node->thunk.thunk_p) > ! next->process = 1; > } > } > else if (node->same_comdat_group) > *************** assemble_thunk (struct cgraph_node *node > *** 1406,1411 **** > --- 1427,1433 ---- > free_after_compilation (cfun); > set_cfun (NULL); > TREE_ASM_WRITTEN (thunk_fndecl) = 1; > + node->thunk.thunk_p = false; Hmm. Doesn't that confuse regular passes who now see the thunk no longer as thunk? > } > else > { > *************** assemble_thunk (struct cgraph_node *node > *** 1530,1544 **** > delete_unreachable_blocks (); > update_ssa (TODO_update_ssa); > > - cgraph_remove_same_body_alias (node); > /* Since we want to emit the thunk, we explicitly mark its name as > referenced. */ > cgraph_add_new_function (thunk_fndecl, true); > bitmap_obstack_release (NULL); > } > current_function_decl = NULL; > } > > /* Expand function specified by NODE. */ > > static void > --- 1552,1587 ---- > delete_unreachable_blocks (); > update_ssa (TODO_update_ssa); > > /* Since we want to emit the thunk, we explicitly mark its name as > referenced. */ > + node->thunk.thunk_p = false; > + cgraph_node_remove_callees (node); > cgraph_add_new_function (thunk_fndecl, true); > bitmap_obstack_release (NULL); > } > current_function_decl = NULL; > } > > + > + /* Assemble thunks asociated to NODE. */ > + > + static void > + assemble_thunks (struct cgraph_node *node) > + { > + struct cgraph_edge *e; > + for (e = node->callers; e;) > + if (e->caller->thunk.thunk_p) > + { > + struct cgraph_node *thunk = e->caller; > + > + e = e->next_caller; > + assemble_thunks (thunk); > + assemble_thunk (thunk); > + } > + else > + e = e->next_caller; > + } > + > /* Expand function specified by NODE. 
*/ > > static void > *************** cgraph_expand_function (struct cgraph_no > *** 1566,1578 **** > if (!alias->thunk.thunk_p) > assemble_alias (alias->decl, > DECL_ASSEMBLER_NAME (alias->thunk.alias)); > - else > - assemble_thunk (alias); > } > node->alias = saved_alias; > cgraph_process_new_functions (); > } > > gcc_assert (node->lowered); > > /* Generate RTL for the body of DECL. */ > --- 1609,1620 ---- > if (!alias->thunk.thunk_p) > assemble_alias (alias->decl, > DECL_ASSEMBLER_NAME (alias->thunk.alias)); > } > node->alias = saved_alias; > cgraph_process_new_functions (); > } > > + assemble_thunks (node); > gcc_assert (node->lowered); > > /* Generate RTL for the body of DECL. */ > *************** cgraph_output_in_order (void) > *** 1688,1694 **** > > for (pf = cgraph_nodes; pf; pf = pf->next) > { > ! if (pf->process) > { > i = pf->order; > gcc_assert (nodes[i].kind == ORDER_UNDEFINED); > --- 1730,1736 ---- > > for (pf = cgraph_nodes; pf; pf = pf->next) > { > ! if (pf->process && !pf->thunk.thunk_p) > { > i = pf->order; > gcc_assert (nodes[i].kind == ORDER_UNDEFINED); > Index: lto-cgraph.c > =================================================================== > *** lto-cgraph.c (revision 173251) > --- lto-cgraph.c (working copy) > *************** lto_output_node (struct lto_simple_outpu > *** 502,510 **** > --- 502,525 ---- > bp_pack_value (&bp, node->frequency, 2); > bp_pack_value (&bp, node->only_called_at_startup, 1); > bp_pack_value (&bp, node->only_called_at_exit, 1); > + bp_pack_value (&bp, node->thunk.thunk_p, 1); > lto_output_bitpack (&bp); > lto_output_uleb128_stream (ob->main_stream, node->resolution); > > + if (node->thunk.thunk_p) > + { > + lto_output_uleb128_stream > + (ob->main_stream, > + 1 + (node->thunk.this_adjusting != 0) * 2 > + + (node->thunk.virtual_offset_p != 0) * 4); > + lto_output_uleb128_stream (ob->main_stream, > + node->thunk.fixed_offset); > + lto_output_uleb128_stream (ob->main_stream, > + node->thunk.virtual_value); > + 
lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > + node->thunk.alias); > + } > + > if (node->same_body) > { > struct cgraph_node *alias; > *************** lto_output_node (struct lto_simple_outpu > *** 516,540 **** > { > lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > alias->decl); > ! if (alias->thunk.thunk_p) > ! { > ! lto_output_uleb128_stream > ! (ob->main_stream, > ! 1 + (alias->thunk.this_adjusting != 0) * 2 > ! + (alias->thunk.virtual_offset_p != 0) * 4); > ! lto_output_uleb128_stream (ob->main_stream, > ! alias->thunk.fixed_offset); > ! lto_output_uleb128_stream (ob->main_stream, > ! alias->thunk.virtual_value); > ! lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > ! alias->thunk.alias); > ! } > ! else > ! { > ! lto_output_uleb128_stream (ob->main_stream, 0); > ! lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > ! alias->thunk.alias); > ! } > gcc_assert (cgraph_get_node (alias->thunk.alias) == node); > lto_output_uleb128_stream (ob->main_stream, alias->resolution); > alias = alias->previous; > --- 531,538 ---- > { > lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > alias->decl); > ! lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > ! alias->thunk.alias); > gcc_assert (cgraph_get_node (alias->thunk.alias) == node); > lto_output_uleb128_stream (ob->main_stream, alias->resolution); > alias = alias->previous; > *************** input_overwrite_node (struct lto_file_de > *** 947,952 **** > --- 945,951 ---- > node->frequency = (enum node_frequency)bp_unpack_value (bp, 2); > node->only_called_at_startup = bp_unpack_value (bp, 1); > node->only_called_at_exit = bp_unpack_value (bp, 1); > + node->thunk.thunk_p = bp_unpack_value (bp, 1); > node->resolution = resolution; > } > > *************** input_node (struct lto_file_decl_data *f > *** 1031,1064 **** > /* Store a reference for now, and fix up later to be a pointer. 
*/ > node->same_comdat_group = (cgraph_node_ptr) (intptr_t) ref2; > > same_body_count = lto_input_uleb128 (ib); > while (same_body_count-- > 0) > { > ! tree alias_decl; > ! int type; > struct cgraph_node *alias; > decl_index = lto_input_uleb128 (ib); > alias_decl = lto_file_decl_data_get_fn_decl (file_data, decl_index); > ! type = lto_input_uleb128 (ib); > ! if (!type) > ! { > ! tree real_alias; > ! decl_index = lto_input_uleb128 (ib); > ! real_alias = lto_file_decl_data_get_fn_decl (file_data, decl_index); > ! alias = cgraph_same_body_alias (node, alias_decl, real_alias); > ! } > ! else > ! { > ! HOST_WIDE_INT fixed_offset = lto_input_uleb128 (ib); > ! HOST_WIDE_INT virtual_value = lto_input_uleb128 (ib); > ! tree real_alias; > ! decl_index = lto_input_uleb128 (ib); > ! real_alias = lto_file_decl_data_get_fn_decl (file_data, decl_index); > ! alias = cgraph_add_thunk (node, alias_decl, fn_decl, type & 2, > fixed_offset, > ! virtual_value, > ! (type & 4) ? size_int (virtual_value) : > NULL_TREE, > ! real_alias); > ! } > gcc_assert (alias); > alias->resolution = (enum > ld_plugin_symbol_resolution)lto_input_uleb128 (ib); > } > --- 1030,1062 ---- > /* Store a reference for now, and fix up later to be a pointer. */ > node->same_comdat_group = (cgraph_node_ptr) (intptr_t) ref2; > > + if (node->thunk.thunk_p) > + { > + int type = lto_input_uleb128 (ib); > + HOST_WIDE_INT fixed_offset = lto_input_uleb128 (ib); > + HOST_WIDE_INT virtual_value = lto_input_uleb128 (ib); > + tree real_alias; > + > + decl_index = lto_input_uleb128 (ib); > + real_alias = lto_file_decl_data_get_fn_decl (file_data, decl_index); > + node->thunk.fixed_offset = fixed_offset; > + node->thunk.this_adjusting = (type & 2); > + node->thunk.virtual_value = virtual_value; > + node->thunk.virtual_offset_p = (type & 4); > + node->thunk.alias = real_alias; > + } > + > same_body_count = lto_input_uleb128 (ib); > while (same_body_count-- > 0) > { > ! 
tree alias_decl, real_alias; > struct cgraph_node *alias; > + > decl_index = lto_input_uleb128 (ib); > alias_decl = lto_file_decl_data_get_fn_decl (file_data, decl_index); > ! decl_index = lto_input_uleb128 (ib); > ! real_alias = lto_file_decl_data_get_fn_decl (file_data, decl_index); > ! alias = cgraph_same_body_alias (node, alias_decl, real_alias); > gcc_assert (alias); > alias->resolution = (enum > ld_plugin_symbol_resolution)lto_input_uleb128 (ib); > } > Index: ipa-pure-const.c > =================================================================== > *** ipa-pure-const.c (revision 173251) > --- ipa-pure-const.c (working copy) > *************** analyze_function (struct cgraph_node *fn > *** 731,736 **** > --- 731,746 ---- > l->looping_previously_known = true; > l->looping = false; > l->can_throw = false; > + state_from_flags (&l->state_previously_known, > &l->looping_previously_known, > + flags_from_decl_or_type (fn->decl), > + cgraph_node_cannot_return (fn)); > + > + if (fn->thunk.thunk_p) > + { > + /* Thunk gets propagated through, so nothing interesting happens. */ > + gcc_assert (ipa); > + return l; > + } > > if (dump_file) > { > *************** end: > *** 799,807 **** > > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, " checking previously known:"); > - state_from_flags (&l->state_previously_known, > &l->looping_previously_known, > - flags_from_decl_or_type (fn->decl), > - cgraph_node_cannot_return (fn)); > > better_state (&l->pure_const_state, &l->looping, > l->state_previously_known, > --- 809,814 ---- > Index: lto-streamer-out.c > =================================================================== > *** lto-streamer-out.c (revision 173251) > --- lto-streamer-out.c (working copy) > *************** lto_output (cgraph_node_set set, varpool > *** 2197,2203 **** > for (i = 0; i < n_nodes; i++) > { > node = lto_cgraph_encoder_deref (encoder, i); > ! 
if (lto_cgraph_encoder_encode_body_p (encoder, node)) > { > #ifdef ENABLE_CHECKING > gcc_assert (!bitmap_bit_p (output, DECL_UID (node->decl))); > --- 2197,2204 ---- > for (i = 0; i < n_nodes; i++) > { > node = lto_cgraph_encoder_deref (encoder, i); > ! if (lto_cgraph_encoder_encode_body_p (encoder, node) > ! && !node->thunk.thunk_p) > { > #ifdef ENABLE_CHECKING > gcc_assert (!bitmap_bit_p (output, DECL_UID (node->decl))); > Index: ipa-inline.c > =================================================================== > *** ipa-inline.c (revision 173251) > --- ipa-inline.c (working copy) > *************** inline_small_functions (void) > *** 1177,1185 **** > max_count = 0; > initialize_growth_caches (); > > ! for (node = cgraph_nodes; node; node = node->next) > ! if (node->analyzed > ! && !node->global.inlined_to) > { > struct inline_summary *info = inline_summary (node); > > --- 1177,1184 ---- > max_count = 0; > initialize_growth_caches (); > > ! FOR_EACH_DEFINED_FUNCTION (node) > ! if (!node->global.inlined_to) > { > struct inline_summary *info = inline_summary (node); > > *************** inline_small_functions (void) > *** 1197,1205 **** > > /* Populate the heeap with all edges we might inline. */ > > ! for (node = cgraph_nodes; node; node = node->next) > ! if (node->analyzed > ! && !node->global.inlined_to) > { > if (dump_file) > fprintf (dump_file, "Enqueueing calls of %s/%i.\n", > --- 1196,1203 ---- > > /* Populate the heeap with all edges we might inline. */ > > ! FOR_EACH_DEFINED_FUNCTION (node) > ! if (!node->global.inlined_to) > { > if (dump_file) > fprintf (dump_file, "Enqueueing calls of %s/%i.\n", > Index: ipa.c > =================================================================== > *** ipa.c (revision 173251) > --- ipa.c (working copy) > *************** function_and_variable_visibility (bool w > *** 877,883 **** > --- 877,922 ---- > segfault though. 
*/
>             dissolve_same_comdat_group_list (node);
>           }
> +       if (node->thunk.thunk_p)
> +         {
> +           struct cgraph_node *decl_node = node;
> +
> +           while (decl_node->thunk.thunk_p)
> +             decl_node = decl_node->callees->callee;
> +
> +           /* Thunks have the same visibility as function they are attached to.
> +              For some reason C++ frontend don't seem to care. I.e. in
> +              g++.dg/torture/pr41257-2.C the thunk is not comdat while function
> +              it is attached to is.
> +
> +              We also need to arrange the thunk into the same comdat group as
> +              the function it reffers to. */
> +           if (DECL_COMDAT (decl_node->decl))
> +             {
> +               DECL_COMDAT (node->decl) = 1;
> +               DECL_COMDAT_GROUP (node->decl) = DECL_COMDAT_GROUP (decl_node->decl);
> +               if (!node->same_comdat_group)
> +                 {
> +
> +                   node->same_comdat_group = decl_node;
> +                   if (!decl_node->same_comdat_group)
> +                     decl_node->same_comdat_group = node;
> +                   else
> +                     {
> +                       struct cgraph_node *n;
> +                       for (n = decl_node->same_comdat_group;
> +                            n->same_comdat_group != decl_node;
> +                            n = n->same_comdat_group)
> +                         ;
> +                       n->same_comdat_group = decl_node;
> +                     }
> +                 }
> +             }
> +           if (DECL_EXTERNAL (decl_node->decl))
> +             DECL_EXTERNAL (node->decl) = 1;

That's indeed remarkably ugly and I hope the C++ FE people can do sth
about this ...

> +         }
>         node->local.local = cgraph_local_node_p (node);
> +
>       }
>     for (vnode = varpool_nodes; vnode; vnode = vnode->next)
>       {
> Index: ipa-inline-analysis.c
> ===================================================================
> *** ipa-inline-analysis.c (revision 173251)
> --- ipa-inline-analysis.c (working copy)
> *************** compute_inline_parameters (struct cgraph
> *** 1443,1448 ****
> --- 1443,1465 ----
>
>     info = inline_summary (node);
>
> +   /* FIXME: Thunks are inlinable, but tree-inline don't know how to do that.
> +      Once this happen, we will need to more curefully predict call carefully
> +      statement size.
*/ > + if (node->thunk.thunk_p) > + { > + struct inline_edge_summary *es = inline_edge_summary (node->callees); > + struct predicate t = true_predicate (); > + > + info->inlinable = info->versionable = 0; > + node->callees->call_stmt_cannot_inline_p = true; > + node->local.can_change_signature = false; > + es->call_stmt_time = 1; > + es->call_stmt_size = 1; > + account_size_time (info, 0, 0, &t); > + return; > + } > + > /* Estimate the stack size for the function if we're optimizing. */ > self_stack_size = optimize ? estimated_stack_frame_size (node) : 0; > info->estimated_self_stack_size = self_stack_size; > *************** inline_analyze_function (struct cgraph_n > *** 2027,2033 **** > cgraph_node_name (node), node->uid); > /* FIXME: We should remove the optimize check after we ensure we never run > IPA passes when not optimizing. */ > ! if (flag_indirect_inlining && optimize) > inline_indirect_intraprocedural_analysis (node); > compute_inline_parameters (node, false); > > --- 2044,2050 ---- > cgraph_node_name (node), node->uid); > /* FIXME: We should remove the optimize check after we ensure we never run > IPA passes when not optimizing. */ > ! if (flag_indirect_inlining && optimize && !node->thunk.thunk_p) > inline_indirect_intraprocedural_analysis (node); > compute_inline_parameters (node, false); > > *************** inline_generate_summary (void) > *** 2058,2065 **** > if (flag_indirect_inlining) > ipa_register_cgraph_hooks (); > > ! for (node = cgraph_nodes; node; node = node->next) > ! if (node->analyzed) > inline_analyze_function (node); > } > > --- 2075,2081 ---- > if (flag_indirect_inlining) > ipa_register_cgraph_hooks (); > > ! 
FOR_EACH_DEFINED_FUNCTION (node) > inline_analyze_function (node); > } > > Index: lto/lto.c > =================================================================== > *** lto/lto.c (revision 173251) > --- lto/lto.c (working copy) > *************** lto_materialize_function (struct cgraph_ > *** 147,155 **** > decl = node->decl; > /* Read in functions with body (analyzed nodes) > and also functions that are needed to produce virtual clones. */ > ! if (node->analyzed || has_analyzed_clone_p (node)) > { > ! /* Clones don't need to be read. */ > if (node->clone_of) > return; > > --- 147,155 ---- > decl = node->decl; > /* Read in functions with body (analyzed nodes) > and also functions that are needed to produce virtual clones. */ > ! if (cgraph_function_with_gimple_body_p (node) || has_analyzed_clone_p > (node)) > { > ! /* Clones and thunks don't need to be read. */ > if (node->clone_of) > return; > > *************** static void > *** 1183,1188 **** > --- 1183,1194 ---- > add_cgraph_node_to_partition (ltrans_partition part, struct cgraph_node > *node) > { > struct cgraph_edge *e; > + cgraph_node_set_iterator csi; > + > + /* If NODE is already there, we have nothing to do. */ > + csi = cgraph_node_set_find (part->cgraph_set, node); > + if (!csi_end_p (csi)) > + return; > > part->insns += inline_summary (node)->self_size; > > *************** add_cgraph_node_to_partition (ltrans_par > *** 1197,1202 **** > --- 1203,1215 ---- > > cgraph_node_set_add (part->cgraph_set, node); > > + /* Thunks always must go along with function they reffer to. 
*/ > + if (node->thunk.thunk_p) > + add_cgraph_node_to_partition (part, node->callees->callee); > + for (e = node->callers; e; e = e->next_caller) > + if (e->caller->thunk.thunk_p) > + add_cgraph_node_to_partition (part, e->caller); > + > for (e = node->callees; e; e = e->next_callee) > if ((!e->inline_failed || DECL_COMDAT (e->callee->decl)) > && !cgraph_node_in_set_p (e->callee, part->cgraph_set)) > *************** add_cgraph_node_to_partition (ltrans_par > *** 1214,1219 **** > --- 1227,1239 ---- > static void > add_varpool_node_to_partition (ltrans_partition part, struct varpool_node > *vnode) > { > + varpool_node_set_iterator vsi; > + > + /* If NODE is already there, we have nothing to do. */ > + vsi = varpool_node_set_find (part->varpool_set, vnode); > + if (!vsi_end_p (vsi)) > + return; > + > varpool_node_set_add (part->varpool_set, vnode); > > if (vnode->aux) > Index: ipa-prop.c > =================================================================== > *** ipa-prop.c (revision 173251) > --- ipa-prop.c (working copy) > *************** ipa_prop_write_jump_functions (cgraph_no > *** 2898,2904 **** > for (csi = csi_start (set); !csi_end_p (csi); csi_next (&csi)) > { > node = csi_node (csi); > ! if (node->analyzed && IPA_NODE_REF (node) != NULL) > ipa_write_node_info (ob, node); > } > lto_output_1_stream (ob->main_stream, 0); > --- 2898,2905 ---- > for (csi = csi_start (set); !csi_end_p (csi); csi_next (&csi)) > { > node = csi_node (csi); > ! if (cgraph_function_with_gimple_body_p (node) > ! && IPA_NODE_REF (node) != NULL) > ipa_write_node_info (ob, node); > } > lto_output_1_stream (ob->main_stream, 0); > Index: passes.c > =================================================================== > *** passes.c (revision 173251) > --- passes.c (working copy) > *************** do_per_function_toporder (void (*callbac > *** 1135,1141 **** > /* Allow possibly removed nodes to be garbage collected. */ > order[i] = NULL; > node->process = 0; > ! 
if (node->analyzed) > { > push_cfun (DECL_STRUCT_FUNCTION (node->decl)); > current_function_decl = node->decl; > --- 1135,1141 ---- > /* Allow possibly removed nodes to be garbage collected. */ > order[i] = NULL; > node->process = 0; > ! if (cgraph_function_with_gimple_body_p (node)) > { > push_cfun (DECL_STRUCT_FUNCTION (node->decl)); > current_function_decl = node->decl; > *************** execute_one_pass (struct opt_pass *pass) > *** 1581,1590 **** > if (pass->type == IPA_PASS) > { > struct cgraph_node *node; > ! for (node = cgraph_nodes; node; node = node->next) > ! if (node->analyzed) > ! VEC_safe_push (ipa_opt_pass, heap, node->ipa_transforms_to_apply, > ! (struct ipa_opt_pass_d *)pass); > } > > if (!current_function_decl) > --- 1581,1589 ---- > if (pass->type == IPA_PASS) > { > struct cgraph_node *node; > ! FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > ! VEC_safe_push (ipa_opt_pass, heap, node->ipa_transforms_to_apply, > ! (struct ipa_opt_pass_d *)pass); > } > > if (!current_function_decl) > *************** ipa_write_summaries (void) > *** 1705,1711 **** > { > struct cgraph_node *node = order[i]; > > ! if (node->analyzed) > { > /* When streaming out references to statements as part of some IPA > pass summary, the statements need to have uids assigned and the > --- 1704,1710 ---- > { > struct cgraph_node *node = order[i]; > > ! if (cgraph_function_with_gimple_body_p (node)) > { > /* When streaming out references to statements as part of some IPA > pass summary, the statements need to have uids assigned and the > *************** ipa_write_summaries (void) > *** 1718,1724 **** > pop_cfun (); > } > if (node->analyzed) > ! cgraph_node_set_add (set, node); > } > vset = varpool_node_set_new (); > > --- 1717,1723 ---- > pop_cfun (); > } > if (node->analyzed) > ! 
cgraph_node_set_add (set, node);
>       }
>     vset = varpool_node_set_new ();
>
> *************** function_called_by_processed_nodes_p (vo
> *** 2036,2042 ****
>       {
>         if (e->caller->decl == current_function_decl)
>           continue;
> !       if (!e->caller->analyzed)
>           continue;
>         if (TREE_ASM_WRITTEN (e->caller->decl))
>           continue;
> --- 2035,2041 ----
>       {
>         if (e->caller->decl == current_function_decl)
>           continue;
> !       if (!cgraph_function_with_gimple_body_p (e->caller))
>           continue;
>         if (TREE_ASM_WRITTEN (e->caller->decl))
>           continue;

I think the rest looks reasonable.

Thanks,
Richard.