On 03/07/2018 11:13 AM, Martin Liška wrote:
> V2: fixed headers in the last table of the PDF.
>
> Martin
>

For the i386.ii -O2 -g case, here is the perf diff between GCC 7 (baseline) and GCC 8:

# Baseline  Delta Abs  Shared Object  Symbol
# ........  .........  .............  ......
#
               +0.65%  cc1plus        [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_call_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_call_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.18%     +0.43%  cc1plus        [.] sreal::operator*
               +0.41%  cc1plus        [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_fn_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_fn_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.07%     +0.35%  cc1plus        [.] cgraph_node::find_replacement
               +0.33%  cc1plus        [.] profile_count::to_sreal_scale
               +0.33%  cc1plus        [.] predicate::probability
               +0.27%  cc1plus        [.] call_summary<ipa_call_summary*>::get
     0.04%     +0.25%  cc1plus        [.] sreal::operator/
               +0.24%  cc1plus        [.] sreal::normalize
     0.70%     -0.23%  [kernel]       [.] 0xffffffff9c80019f
               +0.23%  cc1plus        [.] wide_int_to_tree_1
     0.09%     +0.22%  cc1plus        [.] sreal::operator+
               +0.21%  cc1plus        [.] analyze_function_body
     0.04%     +0.19%  cc1plus        [.] dwarf2out_var_location
     0.19%     -0.19%  cc1plus        [.] compute_inlined_call_time
               +0.19%  cc1plus        [.] function_summary<ipa_fn_summary*>::get
     0.30%     -0.18%  cc1plus        [.] can_inline_edge_p
     1.91%     -0.16%  cc1plus        [.] bitmap_set_bit
     0.74%     -0.15%  cc1plus        [.] pre_and_rev_post_order_compute_fn
     0.80%     +0.15%  [unknown]      [.] 0xffffffff9c80019f
               +0.14%  cc1plus        [.] cleanup_control_flow_pre
     0.81%     -0.14%  cc1plus        [.] ggc_set_mark
     0.13%     +0.13%  cc1plus        [.] variably_modified_type_p
     0.81%     -0.13%  cc1plus        [.] et_splay
     0.17%     -0.13%  cc1plus        [.] curr_insn_transform
               +0.12%  cc1plus        [.] profile_count::from_gcov_type
               +0.12%  cc1plus        [.] process_alt_operands
               +0.12%  cc1plus        [.] can_inline_edge_by_limits_p
     0.60%     -0.11%  cc1plus        [.] estimate_calls_size_and_time
     1.36%     -0.11%  libc-2.26.so   [.] _int_malloc
     0.27%     +0.11%  cc1plus        [.] constrain_operands
               +0.11%  cc1plus        [.] bitmap_alloc
     0.60%     +0.11%  cc1plus        [.] hash_table<variable_hasher, xcallocator>::find_slot_with_hash
               +0.11%  cc1plus        [.] predicate::evaluate
               +0.10%  cc1plus        [.] vr_values::get_value_range
     0.22%     -0.10%  cc1plus        [.] nonzero_bits1
     0.24%     +0.10%  cc1plus        [.] big_speedup_p
               +0.09%  cc1plus        [.] get_class_binding_direct
               +0.09%  cc1plus        [.] maybe_hot_count_p
     0.58%     -0.09%  cc1plus        [.] walk_tree_1
     0.06%     +0.09%  cc1plus        [.] estimate_size_after_inlining
               +0.09%  cc1plus        [.] mark_use
               +0.09%  cc1plus        [.] ix86_hard_regno_call_part_clobbered
     0.59%     -0.09%  cc1plus        [.] bitmap_bit_p
     0.23%     -0.09%  cc1plus        [.] delete_trivially_dead_insns
     0.41%     -0.09%  cc1plus        [.] cse_insn
     0.47%     -0.09%  cc1plus        [.] (anonymous namespace)::dom_info::calc_idoms
               +0.08%  cc1plus        [.] hash_table<named_decl_hash, xcallocator>::find_slot_with_hash
               +0.08%  cc1plus        [.] profile_count::to_frequency
     0.28%     -0.08%  libc-2.26.so   [.] msort_with_tmp.part.0
     0.90%     -0.08%  libc-2.26.so   [.] _int_free
               +0.08%  cc1plus        [.] substitute_and_fold_engine::replace_uses_in
     0.20%     -0.08%  cc1plus        [.] rtx_equal_for_memref_p
               +0.08%  cc1plus        [.] predicate::add_clause
     0.18%     -0.07%  cc1plus        [.] update_callee_keys
     0.67%     -0.07%  cc1plus        [.] gt_ggc_mx_lang_tree_node

Martin