Steven Bosscher <stevenb....@gmail.com> writes:
> On Sat, Jun 14, 2014 at 9:36 PM, Richard Sandiford wrote:
>> Using a linked list gives a consistent 2% compile-time improvement for
>> fold-const.ii -O0 and ~1% for various -O2 compiles I tried.  The df
>> routines do still show up high on the profile though.
>
> Can you explain a bit more about what shows up high?

For cc1plus -O0 on an oldish fold-const.ii I get:

  3.19%  cc1plus  cc1plus            [.] record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*) [clone .constprop.5]
  2.91%  cc1plus  cc1plus            [.] cp_parser_skip_to_closing_parenthesis(cp_parser*, bool, bool, bool)
  1.42%  cc1plus  cc1plus            [.] cp_lexer_consume_token(cp_lexer*)
  1.42%  cc1plus  cc1plus            [.] df_ref_create_structure(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
  1.31%  cc1plus  cc1plus            [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)
  1.10%  cc1plus  cc1plus            [.] df_ref_record(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
  0.89%  cc1plus  cc1plus            [.] find_costs_and_classes(_IO_FILE*)
  0.89%  cc1plus  cc1plus            [.] process_bb_node_lives(ira_loop_tree_node*)
  0.86%  cc1plus  cc1plus            [.] bitmap_set_bit(bitmap_head*, int)
  0.82%  cc1plus  cc1plus            [.] df_note_compute(bitmap_head*)
  0.77%  cc1plus  libc-2.18.so       [.] _int_malloc
  0.76%  cc1plus  cc1plus            [.] ix86_decompose_address(rtx_def*, ix86_address*)
  0.76%  cc1plus  cc1plus            [.] process_alt_operands(int)
  0.75%  cc1plus  cc1plus            [.] general_operand(rtx_def*, machine_mode)
  0.72%  cc1plus  cc1plus            [.] df_uses_record(df_collection_rec*, rtx_def**, df_ref_type, basic_block_def*, df_insn_info*, int)
  0.72%  cc1plus  cc1plus            [.] pool_alloc(alloc_pool_def*)
  0.72%  cc1plus  cc1plus            [.] lookup_name_real(tree_node*, int, int, bool, int, int)
  0.67%  cc1plus  cc1plus            [.] df_insn_refs_collect(df_collection_rec*, basic_block_def*, df_insn_info*)
  0.67%  cc1plus  cc1plus            [.] constrain_operands(int)
  0.66%  cc1plus  cc1plus            [.] for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*)
  0.63%  cc1plus  cc1plus            [.] gimplify_expr(tree_node**, gimple_statement_base**, gimple_statement_base**, bool (*)(tree_node*), int)
  0.62%  cc1plus  cc1plus            [.] grokdeclarator(cp_declarator const*, cp_decl_specifier_seq*, decl_context, int, tree_node**)
  0.62%  cc1plus  cc1plus            [.] walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*, tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*))
  0.59%  cc1plus  cc1plus            [.] expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
  0.58%  cc1plus  cc1plus            [.] df_ref_equal_p(df_ref_d*, df_ref_d*)
  0.57%  cc1plus  cc1plus            [.] lra_create_live_ranges(bool)
  0.51%  cc1plus  cc1plus            [.] lra_eliminate(bool, bool)
  0.51%  cc1plus  cc1plus            [.] extract_insn(rtx_def*)
  0.51%  cc1plus  cc1plus            [.] ix86_legitimate_address_p(machine_mode, rtx_def*, bool)
  0.50%  cc1plus  libc-2.18.so       [.] _IO_putc
  0.49%  cc1plus  libc-2.18.so       [.] memset
  0.49%  cc1plus  cc1plus            [.] df_lr_bb_local_compute(unsigned int)
  0.47%  cc1plus  cc1plus            [.] regstat_compute_ri()
  0.47%  cc1plus  cc1plus            [.] _cpp_lex_direct
  0.46%  cc1plus  cc1plus            [.] cp_parser_postfix_expression(cp_parser*, bool, bool, bool, bool, cp_id_kind*)
  0.44%  cc1plus  cc1plus            [.] htab_find_slot_with_hash
  0.42%  cc1plus  libc-2.18.so       [.] malloc_consolidate
  0.42%  cc1plus  [kernel.kallsyms]  [k] clear_page_c_e
  0.41%  cc1plus  cc1plus            [.] copy_rtx_if_shared_1(rtx_def**)
  0.41%  cc1plus  cc1plus            [.] cleanup_cfg(int)

where df routines seem to be showing up a fair bit (3 in the top 10).
I realise that can be misleading since it might just be that the df
work is concentrated in a small number of functions.

This is after the patches.  malloc was in the top 5 before.

Thanks,
Richard