[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #44 from steven at gcc dot gnu dot org 2009-12-28 19:51 ---
hashes100.c on x86_64:

      3.4.6   4.2.4   4.3.3   4.4.2   4.5.0
-O0    1.37     1.4    1.59    1.91    1.84
-O1    2.07       4    4.44    4.68    4.89
-O2    3.57    5.96    7.08    7.48     7.6
-O3    3.78    8.73   10.85   11.55    11.8

      3.4.6   4.2.4   4.3.3   4.4.2   4.5.0
-O0    100%    102%    116%    139%    134%
-O1    100%    193%    214%    226%    236%
-O2    100%    167%    198%    210%    213%
-O3    100%    231%    287%    306%    312%

infcodes100.c on x86_64:

      3.4.6   4.2.4   4.3.3   4.4.2   4.5.0
-O0    2.74     3.2    3.86    4.39    4.79
-O1    3.85    7.81    7.69    8.39    8.16
-O2    6.35   11.81    12.8   13.18   14.71
-O3    6.72    11.9   13.91   14.11   15.95

      3.4.6   4.2.4   4.3.3   4.4.2   4.5.0
-O0    100%    117%    141%    160%    175%
-O1    100%    203%    200%    218%    212%
-O2    100%    186%    202%    208%    232%
-O3    100%    177%    207%    210%    237%

These are all best-of-three-runs timings. All compilers built with default release settings (--enable-checking=release, except 3.4.6, which was built with checking disabled by default).

Even relative to gcc 4.3, gcc 4.5 is ~15% slower again. Bravo! :-(

--

steven at gcc dot gnu dot org changed:

What             |Removed              |Added
Last reconfirmed |2005-02-10 15:55:22  |2009-12-28 19:51:37
date             |                     |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
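The percentage rows in comment #44 appear to be each timing divided by the 3.4.6 timing for the same optimization level. The following is a minimal sketch of that arithmetic, not the script used to produce the tables. The function name relative_percent and the dictionary layout are made up for illustration; the numbers are copied from the comment.

```python
# Best-of-three timings copied from comment #44 (x86_64).
# Column order: 3.4.6, 4.2.4, 4.3.3, 4.4.2, 4.5.0
VERSIONS = ["3.4.6", "4.2.4", "4.3.3", "4.4.2", "4.5.0"]

HASHES100 = {
    "-O0": [1.37, 1.40, 1.59, 1.91, 1.84],
    "-O1": [2.07, 4.00, 4.44, 4.68, 4.89],
    "-O2": [3.57, 5.96, 7.08, 7.48, 7.60],
    "-O3": [3.78, 8.73, 10.85, 11.55, 11.80],
}

INFCODES100 = {
    "-O0": [2.74, 3.20, 3.86, 4.39, 4.79],
    "-O1": [3.85, 7.81, 7.69, 8.39, 8.16],
    "-O2": [6.35, 11.81, 12.80, 13.18, 14.71],
    "-O3": [6.72, 11.90, 13.91, 14.11, 15.95],
}


def relative_percent(timings, baseline_index=0):
    """Return each timing as a rounded percentage of the baseline column.

    baseline_index selects the column treated as 100% (hypothetical
    helper; index 0 is gcc 3.4.6, index 2 is gcc 4.3.3).
    """
    base = timings[baseline_index]
    return [round(100.0 * t / base) for t in timings]


def main():
    for name, table in (("hashes100.c", HASHES100), ("infcodes100.c", INFCODES100)):
        print(name)
        print("    " + "".join(f"{v:>8}" for v in VERSIONS))
        for level, timings in table.items():
            row = "".join(f"{p:>7d}%" for p in relative_percent(timings))
            print(f"{level} {row}")


if __name__ == "__main__":
    main()
```

Passing baseline_index=2 rebases a row on gcc 4.3.3. For infcodes100.c at -O2 that puts 4.5.0 at 115%, which lines up with the "~15% slower" remark; for hashes100.c at -O2 the same calculation gives 107%.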
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #45 from steven at gcc dot gnu dot org 2009-12-28 20:28 ---
Profile for cc1 for SVN r155486 looks like this (all items with at least 0.5% of the time):

Flat profile:

Each sample counts as 0.01 seconds.
    %cumulative     self              self    total
 time   seconds  seconds     calls   s/call   s/call  name
 3.79      0.29     0.29   3718720     0.00     0.00  operand_equal_p
 2.87      0.51     0.22   6543125     0.00     0.00  htab_find_slot_with_hash
 2.15      0.68     0.17  18227570     0.00     0.00  bitmap_set_bit
 2.09      0.84     0.16      1614     0.00     0.00  df_note_compute
 1.83      0.98     0.14                              _fini
 1.63      1.10     0.13   3266034     0.00     0.00  mem_attrs_htab_eq
 1.44      1.21     0.11  23919785     0.00     0.00  bitmap_bit_p
 1.44      1.32     0.11    118624     0.00     0.00  free_ira_costs
 1.31      1.42     0.10   4998409     0.00     0.00  bitmap_ior_into
 1.31      1.52     0.10      7015     0.00     0.00  df_worklist_dataflow
 1.24      1.62     0.10   1319725     0.00     0.00  bitmap_copy
 0.91      1.69     0.07   2951661     0.00     0.00  note_stores
 0.91      1.76     0.07   1709658     0.00     0.00  bitmap_ior_and_compl
 0.91      1.83     0.07   1639154     0.00     0.00  et_splay
 0.78      1.89     0.06   1552007     0.00     0.00  rtx_alloc_stat
 0.78      1.95     0.06   6404265     0.00     0.00  bitmap_clear
 0.78      2.01     0.06   4110147     0.00     0.00  bitmap_elt_insert_after
 0.78      2.07     0.06    598552     0.00     0.00  constrain_operands
 0.78      2.13     0.06     71808     0.00     0.00  df_lr_bb_local_compute
 0.78      2.19     0.06       909     0.00     0.00  substitute_and_fold
 0.65      2.24     0.05   5297243     0.00     0.00  mark_all_vars_used_1
 0.65      2.29     0.05   1652326     0.00     0.00  htab_find_with_hash
 0.65      2.34     0.05      1008     0.00     0.00  init_alias_analysis
 0.52      2.38     0.04   7167584     0.00     0.00  bitmap_elt_clear_from
 0.52      2.42     0.04   4744012     0.00     0.00  walk_tree_1
 0.52      2.46     0.04   4472070     0.00     0.00  is_gimple_reg
 0.52      2.50     0.04   1727617     0.00     0.00  gsi_start_phis
 0.52      2.54     0.04   1054932     0.00     0.00  for_each_rtx_1
 0.52      2.58     0.04    824795     0.00     0.00  fold_binary_loc
 0.52      2.62     0.04    733470     0.00     0.00  invalid_mode_change_p
 0.52      2.66     0.04    396082     0.00     0.00  count_reg_usage
 0.52      2.70     0.04    238716     0.00     0.00  df_chain_create
 0.52      2.74     0.04    210927     0.00     0.00  cse_insn
 0.52      2.78     0.04    106118     0.00     0.00  find_reloads
 0.52      2.82     0.04     41200     0.00     0.00  dfs_enumerate_from
 0.52      2.86     0.04       202     0.00     0.00  run_scc_vn

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
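One way to read a flat profile like the one in comment #45 is to add up the "% time" column for families of related functions. The sketch below does that for the rows at or above 1%. It is only an illustration: the helper names and the choice of prefixes (bitmap_, df_, htab_) are assumptions made here, not a grouping used anywhere in this bug report, and the parser only handles the simple layout shown.

```python
from collections import defaultdict

# Rows at or above 1% from the infcodes100.c flat profile in comment #45.
# The _fini row has no call count, as in the original gprof output.
SAMPLE = """\
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 3.79      0.29     0.29   3718720     0.00     0.00  operand_equal_p
 2.87      0.51     0.22   6543125     0.00     0.00  htab_find_slot_with_hash
 2.15      0.68     0.17  18227570     0.00     0.00  bitmap_set_bit
 2.09      0.84     0.16      1614     0.00     0.00  df_note_compute
 1.83      0.98     0.14                              _fini
 1.63      1.10     0.13   3266034     0.00     0.00  mem_attrs_htab_eq
 1.44      1.21     0.11  23919785     0.00     0.00  bitmap_bit_p
 1.44      1.32     0.11    118624     0.00     0.00  free_ira_costs
 1.31      1.42     0.10   4998409     0.00     0.00  bitmap_ior_into
 1.31      1.52     0.10      7015     0.00     0.00  df_worklist_dataflow
 1.24      1.62     0.10   1319725     0.00     0.00  bitmap_copy
"""

# Hypothetical grouping: any function not matching a prefix lands in "other".
PREFIXES = ("bitmap_", "df_", "htab_")


def parse_flat_profile(text):
    """Return (percent, function name) pairs from flat-profile text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or too short to be a data row
        try:
            percent = float(fields[0])
        except ValueError:
            continue  # header line such as "time seconds ..."
        rows.append((percent, fields[-1]))  # the name is the last field
    return rows


def share_by_prefix(rows, prefixes=PREFIXES):
    """Sum the percent column per name prefix."""
    totals = defaultdict(float)
    for percent, name in rows:
        group = next((p for p in prefixes if name.startswith(p)), "other")
        totals[group] += percent
    return dict(totals)


if __name__ == "__main__":
    for group, total in sorted(share_by_prefix(parse_flat_profile(SAMPLE)).items()):
        print(f"{group:<8} {total:5.2f}%")
```

Grouping by prefix is crude (mem_attrs_htab_eq ends up in "other", for example), so the totals are best treated as a rough pointer to where time goes rather than as per-pass accounting.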
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #46 from steven at gcc dot gnu dot org 2009-12-28 20:35 ---
Same thing for hashes100.c (the profile in comment #45 is for infcodes100.c); in both cases cc1 r155486 at -O2 on x86_64:

Flat profile:

Each sample counts as 0.01 seconds.
    %cumulative     self              self    total
 time   seconds  seconds     calls   s/call   s/call  name
 3.59      0.15     0.15   9725444     0.00     0.00  bitmap_set_bit
 2.72      0.26     0.11   5517400     0.00     0.00  htab_find_slot_with_hash
 1.98      0.34     0.08                              _fini
 1.73      0.41     0.07  14841100     0.00     0.00  bitmap_bit_p
 1.73      0.48     0.07   1336700     0.00     0.00  nonzero_bits1
 1.24      0.53     0.05   3776000     0.00     0.00  get_expr_value_id
 1.24      0.58     0.05   1248000     0.00     0.00  cselib_lookup
 1.24      0.63     0.05     88002     0.00     0.00  free_ira_costs
 1.24      0.68     0.05     13300     0.00     0.00  df_worklist_dataflow
 0.99      0.72     0.04   2893600     0.00     0.00  is_gimple_min_invariant
 0.99      0.76     0.04    761910     0.00     0.00  extract_insn
 0.99      0.80     0.04    739736     0.00     0.00  fold_binary_loc
 0.99      0.84     0.04    577000     0.00     0.00  for_each_rtx_1
 0.99      0.88     0.04      4300     0.00     0.00  remove_unused_locals
 0.74      0.91     0.03   5120044     0.00     0.00  tree_strip_nop_conversions
 0.74      0.94     0.03   5048648     0.00     0.00  pool_alloc
 0.74      0.97     0.03   3330406     0.00     0.00  bitmap_clear
 0.74      1.00     0.03    961752     0.00     0.00  rtx_alloc_stat
 0.74      1.03     0.03    825000     0.00     0.00  loop_preheader_edge
 0.74      1.06     0.03    643822     0.00     0.00  tree_code_size
 0.74      1.09     0.03    413900     0.00     0.00  simplify_binary_operation
 0.74      1.12     0.03    331100     0.00     0.00  walk_stmt_load_store_addr_ops
 0.74      1.15     0.03    100200     0.00     0.00  cse_insn
 0.74      1.18     0.03     88700     0.00     0.00  make_compound_operation
 0.74      1.21     0.03     79600     0.00     0.00  add_control_edge
 0.74      1.24     0.03     35500     0.00     0.00  df_lr_bb_local_compute
 0.74      1.27     0.03       300     0.00     0.00  find_costs_and_classes

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #47 from steven at gcc dot gnu dot org 2009-12-28 20:38 ---
Created an attachment (id=19402)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19402&action=view)
profile for cc1 r155486 on x86_64, options -O2, for infcodes100.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #48 from steven at gcc dot gnu dot org 2009-12-28 20:39 ---
Created an attachment (id=19403)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19403&action=view)
profile for cc1 r155486 on x86_64, options -O2, for hashes100.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #49 from steven at gcc dot gnu dot org 2009-12-28 20:46 ---
For hashes100.c, combine+IRA+expand+tree-PRE account for 1/3 of the total compile time. For infcodes100.c, the profile is flatter, but IRA+expand still account for 1/4 of the total compile time.

Why is IRA so slow?

--

steven at gcc dot gnu dot org changed:

What |Removed |Added
CC   |        |vmakarov at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #43 from rguenth at gcc dot gnu dot org 2009-08-04 12:26 ---
GCC 4.3.4 is being released, adjusting target milestone.

--

rguenth at gcc dot gnu dot org changed:

What             |Removed |Added
Target Milestone |4.3.4   |4.3.5

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687
[Bug tree-optimization/18687] [4.3/4.4/4.5 Regression] ~50% compile time regression
--- Comment #42 from jsm28 at gcc dot gnu dot org 2009-03-31 16:42 ---
Closing 4.2 branch.

--

jsm28 at gcc dot gnu dot org changed:

What             |Removed                                                   |Added
Summary          |[4.2/4.3/4.4/4.5 Regression] ~50% compile time regression |[4.3/4.4/4.5 Regression] ~50% compile time regression
Target Milestone |4.2.5                                                     |4.3.4

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687