Steven Bosscher <stevenb....@gmail.com> writes:
> On Sat, Jun 14, 2014 at 9:36 PM, Richard Sandiford wrote:
>> Using a linked list gives a consistent 2% compile-time improvement for
>> fold-const.ii -O0 and ~1% for various -O2 compiles I tried.  The df
>> routines do still show up high on the profile though.
>
> Can you explain a bit more about what shows up high?

For cc1plus -O0 on an oldish fold-const.ii I get:

     3.19%  cc1plus  cc1plus            [.] record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*) [clone .constprop.5]
     2.91%  cc1plus  cc1plus            [.] cp_parser_skip_to_closing_parenthesis(cp_parser*, bool, bool, bool)
     1.42%  cc1plus  cc1plus            [.] cp_lexer_consume_token(cp_lexer*)
     1.42%  cc1plus  cc1plus            [.] df_ref_create_structure(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
     1.31%  cc1plus  cc1plus            [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)
     1.10%  cc1plus  cc1plus            [.] df_ref_record(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
     0.89%  cc1plus  cc1plus            [.] find_costs_and_classes(_IO_FILE*)
     0.89%  cc1plus  cc1plus            [.] process_bb_node_lives(ira_loop_tree_node*)
     0.86%  cc1plus  cc1plus            [.] bitmap_set_bit(bitmap_head*, int)
     0.82%  cc1plus  cc1plus            [.] df_note_compute(bitmap_head*)
     0.77%  cc1plus  libc-2.18.so       [.] _int_malloc
     0.76%  cc1plus  cc1plus            [.] ix86_decompose_address(rtx_def*, ix86_address*)
     0.76%  cc1plus  cc1plus            [.] process_alt_operands(int)
     0.75%  cc1plus  cc1plus            [.] general_operand(rtx_def*, machine_mode)
     0.72%  cc1plus  cc1plus            [.] df_uses_record(df_collection_rec*, rtx_def**, df_ref_type, basic_block_def*, df_insn_info*, int)
     0.72%  cc1plus  cc1plus            [.] pool_alloc(alloc_pool_def*)
     0.72%  cc1plus  cc1plus            [.] lookup_name_real(tree_node*, int, int, bool, int, int)
     0.67%  cc1plus  cc1plus            [.] df_insn_refs_collect(df_collection_rec*, basic_block_def*, df_insn_info*)
     0.67%  cc1plus  cc1plus            [.] constrain_operands(int)
     0.66%  cc1plus  cc1plus            [.] for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*)
     0.63%  cc1plus  cc1plus            [.] gimplify_expr(tree_node**, gimple_statement_base**, gimple_statement_base**, bool (*)(tree_node*), int)
     0.62%  cc1plus  cc1plus            [.] grokdeclarator(cp_declarator const*, cp_decl_specifier_seq*, decl_context, int, tree_node**)
     0.62%  cc1plus  cc1plus            [.] walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*, tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*))
     0.59%  cc1plus  cc1plus            [.] expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
     0.58%  cc1plus  cc1plus            [.] df_ref_equal_p(df_ref_d*, df_ref_d*)
     0.57%  cc1plus  cc1plus            [.] lra_create_live_ranges(bool)
     0.51%  cc1plus  cc1plus            [.] lra_eliminate(bool, bool)
     0.51%  cc1plus  cc1plus            [.] extract_insn(rtx_def*)
     0.51%  cc1plus  cc1plus            [.] ix86_legitimate_address_p(machine_mode, rtx_def*, bool)
     0.50%  cc1plus  libc-2.18.so       [.] _IO_putc
     0.49%  cc1plus  libc-2.18.so       [.] memset
     0.49%  cc1plus  cc1plus            [.] df_lr_bb_local_compute(unsigned int)
     0.47%  cc1plus  cc1plus            [.] regstat_compute_ri()
     0.47%  cc1plus  cc1plus            [.] _cpp_lex_direct
     0.46%  cc1plus  cc1plus            [.] cp_parser_postfix_expression(cp_parser*, bool, bool, bool, bool, cp_id_kind*)
     0.44%  cc1plus  cc1plus            [.] htab_find_slot_with_hash
     0.42%  cc1plus  libc-2.18.so       [.] malloc_consolidate
     0.42%  cc1plus  [kernel.kallsyms]  [k] clear_page_c_e
     0.41%  cc1plus  cc1plus            [.] copy_rtx_if_shared_1(rtx_def**)
     0.41%  cc1plus  cc1plus            [.] cleanup_cfg(int)

where df routines seem to be showing up a fair bit (3 in the top 10).
I realise that can be misleading since it might just be that the
df work is concentrated in a small number of functions.
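
One rough way to judge how much the df work adds up to, rather than
counting top-10 entries, is to total the percentages of every df_*
symbol in the listing.  The following is only a sketch: it assumes the
perf output has one entry per line, the helper name total_for_prefix is
made up for illustration, and SAMPLE holds a handful of rows from the
profile above with the argument lists dropped.

```python
import re

# A few rows from the profile above, argument lists dropped for brevity.
SAMPLE = """\
     3.19%  cc1plus  cc1plus            [.] record_reg_classes
     1.42%  cc1plus  cc1plus            [.] df_ref_create_structure
     1.31%  cc1plus  cc1plus            [.] ggc_internal_alloc
     1.10%  cc1plus  cc1plus            [.] df_ref_record
     0.82%  cc1plus  cc1plus            [.] df_note_compute
     0.77%  cc1plus  libc-2.18.so       [.] _int_malloc
     0.72%  cc1plus  cc1plus            [.] df_uses_record
     0.67%  cc1plus  cc1plus            [.] df_insn_refs_collect
     0.58%  cc1plus  cc1plus            [.] df_ref_equal_p
     0.49%  cc1plus  cc1plus            [.] df_lr_bb_local_compute
     0.42%  cc1plus  [kernel.kallsyms]  [k] clear_page_c_e
"""

# percentage, command, shared object, [.] or [k], then the symbol name
ROW = re.compile(r"^\s*(\d+\.\d+)%\s+\S+\s+\S+\s+\[[.k]\]\s+([A-Za-z_]\w*)")


def total_for_prefix(report: str, prefix: str) -> float:
    """Sum the percentages of all symbols whose name starts with prefix."""
    total = 0.0
    for line in report.splitlines():
        m = ROW.match(line)
        if m and m.group(2).startswith(prefix):
            total += float(m.group(1))
    return total


if __name__ == "__main__":
    print(f"df_*: {total_for_prefix(SAMPLE, 'df_'):.2f}%")
```

For the entries listed above, the seven df_* symbols come to about 5.8%
in total.  That only covers functions that made it into the listing.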

This is after the patches.  malloc was in the top 5 before.

Thanks,
Richard
