https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #79 from Richard Biener <rguenth at gcc dot gnu.org> ---
On the GCC 8 branch with -g -O0 (x86_64-linux) I get

 TOTAL                              :   9.59          0.49         10.09        551543 kB
10.53user 0.55system 0:11.08elapsed 99%CPU (0avgtext+0avgdata 1230908maxresident)k
0inputs+68384outputs (0major+379474minor)pagefaults 0swaps

On trunk with mem-stats the biggest offender is (as reported)

df_scan mw_reg                  alloc-pool.h:487 (df_scan_alloc)                 1136         0 :  0.0%      483M       38M: 92.4%          32

Going up to -Ofast makes compile time explode again.  For the reduced testcase
the time still goes to inlining and PTA:

 ipa inlining heuristics            :  41.55 ( 36%)   0.01 (  1%)  41.60 ( 36%)    6333 kB (  2%)
 alias stmt walking                 :  14.73 ( 13%)   0.14 ( 18%)  14.98 ( 13%)     164 kB (  0%)
 tree PTA                           :  37.93 ( 33%)   0.23 ( 30%)  38.16 ( 33%)   31921 kB (  8%)
 TOTAL                              : 115.67          0.77        116.52        411925 kB
115.83user 0.81system 1:56.71elapsed 99%CPU (0avgtext+0avgdata 1034364maxresident)k
1624inputs+12624outputs (1major+308881minor)pagefaults 0swaps

Trunk seems to behave similarly:

 ipa inlining heuristics            :  53.74 ( 41%)   0.02 (  2%)  53.75 ( 40%)    5428 kB (  1%)
 alias stmt walking                 :  14.98 ( 11%)   0.19 ( 23%)  15.16 ( 11%)     165 kB (  0%)
 tree PTA                           :  39.70 ( 30%)   0.24 ( 29%)  39.92 ( 30%)   31896 kB (  8%)
 TOTAL                              : 132.01          0.83        132.85        407617 kB
132.01user 0.86system 2:12.88elapsed 99%CPU (0avgtext+0avgdata 1034096maxresident)k
0inputs+8224outputs (0major+301553minor)pagefaults 0swaps
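
For anyone wanting to compare the two -Ofast reports pass by pass, here is a
minimal sketch in Python.  It is not part of the measurements; it assumes the
three timing columns are usr, sys and wall (in that order) and that each row
fits on one line.  The names wall_times and compare are made up for this
illustration.

```python
import re

# One -ftime-report row: pass name, then "usr (pct%)  sys (pct%)  wall (pct%)".
# Assumption: the column order is usr, sys, wall; the trailing GGC column is ignored.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*"
    r"(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*"
    r"(?P<wall>[\d.]+)\s*\(\s*\d+%\)"
)

def wall_times(report):
    """Map pass name -> wall-clock seconds for every row that matches ROW."""
    times = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("name")] = float(m.group("wall"))
    return times

def compare(old, new):
    """Return new/old wall-time ratio for passes present in both reports."""
    return {name: new[name] / old[name] for name in old if name in new}

# Rows copied from the two -Ofast reports above (GGC column dropped).
GCC8 = """\
 ipa inlining heuristics            :  41.55 ( 36%)   0.01 (  1%)  41.60 ( 36%)
 alias stmt walking                 :  14.73 ( 13%)   0.14 ( 18%)  14.98 ( 13%)
 tree PTA                           :  37.93 ( 33%)   0.23 ( 30%)  38.16 ( 33%)
"""
TRUNK = """\
 ipa inlining heuristics            :  53.74 ( 41%)   0.02 (  2%)  53.75 ( 40%)
 alias stmt walking                 :  14.98 ( 11%)   0.19 ( 23%)  15.16 ( 11%)
 tree PTA                           :  39.70 ( 30%)   0.24 ( 29%)  39.92 ( 30%)
"""

ratios = compare(wall_times(GCC8), wall_times(TRUNK))
for name, ratio in ratios.items():
    print(f"{name:28s} {ratio:.2f}x")
```

TOTAL rows carry no percentages, so the pattern skips them; compare the totals
by hand if needed.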

flat perf profile:

Samples: 510K of event 'instructions:p', Event count (approx.): 715615147320    
Overhead       Samples  Command  Shared Object     Symbol                       
   8.08%         95243  f951     f951              [.] bitmap_ior_into
   7.21%         25966  f951     f951              [.] sreal::operator*
   5.43%         19353  f951     f951              [.] hash_table<hash_map<int_h
   5.20%         23167  f951     f951              [.] get_ref_base_and_extent
   4.93%         17947  f951     f951              [.] profile_count::to_sreal_s
   4.37%         15865  f951     f951              [.] sreal::operator/
   3.45%         30532  f951     f951              [.] bitmap_set_bit
   3.41%         12159  f951     f951              [.] hash_table<hash_map<int_h
   3.08%         11034  f951     f951              [.] default_binds_local_p_3
   3.08%         11146  f951     f951              [.] hash_table<hash_map<int_h
   2.21%          7877  f951     f951              [.] want_inline_small_functio
   1.93%          6874  f951     f951              [.] edge_badness
   1.87%          6675  f951     f951              [.] compute_inlined_call_time

the ipa_fn_summary hash and edge_growth_cache / call_summary hashes are
oddly on top of the profile...
