> On 11/5/19 11:36 AM, Jan Hubicka wrote:
> > Hi,
> > this patch adds object allocators to manage IPA summaries. This reduces
> > malloc overhead and fragmentation.  I now get peak memory use 7.5GB instead
> > of 10GB for firefox WPA because reduced fragmentation leads to less COWs
> > after forks.
> 
> That sounds promising.
> 
> > Additional bonus is that we now have statistics gathered by mem-reports
> > which makes my life easier, too.
> 
> What's currently bad with the detailed memory statistics? I updated the
> code that one should see the allocation for the underlying hash_map and
> vec?

I currently get:

--------------------------------------------------------------------------------------------------------------------------------------------
Pool name                       Allocation pool                                   Pools       Leak            Peak            Times    Elt size
--------------------------------------------------------------------------------------------------------------------------------------------
tree_scc                        lto/lto-common.c:2709 (read_cgraph_and_symbols)      1         0 :  0.0%       99M     3169k: 43.7%          32
IPA histogram                   ipa-profile.c:77 (__static_initialization_and_de     1        16 :  0.0%       16         1 :  0.0%          16
IPA-PROP ref descriptions       ipa-prop.c:170 (__static_initialization_and_dest     1       226k:  0.3%      226k     9670 :  0.1%          24
function summary                ipa-fnsummary.c:557 (ipa_fn_summary_alloc)           1      6145k:  7.0%     6257k      391k:  5.4%          16
function summary                ipa-pure-const.c:136 (__base_ctor )                  1      6863k:  7.9%     9449k      590k:  8.1%          16
edge predicates                 ipa-fnsummary.c:93 (__static_initialization_and_     1      8327k:  9.5%     8385k      209k:  2.9%          40
call summary                    ipa-sra.c:436 (__base_ctor )                         1        18M: 21.3%       21M     1393k: 19.2%          16
call summary                    ipa-fnsummary.h:276 (__base_ctor )                   1        46M: 54.0%       46M     1483k: 20.5%          32
--------------------------------------------------------------------------------------------------------------------------------------------
Pool name                       Allocation pool                                   Pools       Leak            Peak            Times    Elt size
--------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                 9         85M
--------------------------------------------------------------------------------------------------------------------------------------------

This is quite readable, though we may give them different names and
update constructors. Not a big deal IMO.
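
For readers unfamiliar with the pools reported above, here is a minimal sketch of a fixed-size object pool: elements are carved out of larger blocks and freed slots are recycled through a free list, which is what cuts per-object malloc overhead and fragmentation. This is only an illustration of the general technique, not GCC's actual allocator; simple_object_pool and all of its members are hypothetical names, and the sketch assumes T needs no more than default operator new alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-size object pool; not GCC's implementation.
template <typename T, std::size_t BlockElts = 64>
class simple_object_pool
{
  // A slot is either a link in the free list or storage for one live T.
  union slot
  {
    slot *next;
    alignas (T) unsigned char storage[sizeof (T)];
  };

  std::vector<slot *> m_blocks; // Blocks obtained from operator new.
  slot *m_free = nullptr;       // Singly linked list of unused slots.
  std::size_t m_live = 0;       // Objects currently allocated.
  std::size_t m_peak = 0;       // Highest value m_live has reached.

  // Allocate one more block and thread all of its slots onto the free list.
  void grow ()
  {
    slot *block
      = static_cast<slot *> (::operator new (sizeof (slot) * BlockElts));
    m_blocks.push_back (block);
    for (std::size_t i = 0; i < BlockElts; i++)
      {
        block[i].next = m_free;
        m_free = &block[i];
      }
  }

public:
  simple_object_pool () = default;
  simple_object_pool (const simple_object_pool &) = delete;
  simple_object_pool &operator= (const simple_object_pool &) = delete;

  // Releases the blocks; objects still live are NOT destroyed here.
  ~simple_object_pool ()
  {
    for (slot *b : m_blocks)
      ::operator delete (b);
  }

  // Construct a T in a recycled slot, growing the pool when none is free.
  template <typename... Args>
  T *allocate (Args &&...args)
  {
    if (!m_free)
      grow ();
    slot *s = m_free;
    m_free = s->next;
    if (++m_live > m_peak)
      m_peak = m_live;
    return new (s->storage) T (std::forward<Args> (args)...);
  }

  // Destroy OBJ and push its slot back on the free list.
  void remove (T *obj)
  {
    obj->~T ();
    slot *s = reinterpret_cast<slot *> (obj);
    s->next = m_free;
    m_free = s;
    m_live--;
  }

  std::size_t live () const { return m_live; }
  std::size_t peak () const { return m_peak; }
  std::size_t blocks () const { return m_blocks.size (); }
};
```

Because every element has the same size, a freed slot can be reused by the very next allocation, and counters such as live and peak come almost for free, which is roughly the kind of data the Leak/Peak columns report.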

For GGC statistics I see:

varpool.c:137 (create_empty)                          7924k:  0.4%        0 :  0.0%     3214k:  0.2%        0 :  0.0%       87k
cgraph.c:939 (cgraph_allocate_init_indirect_info      8566k:  0.4%        0 :  0.0%     1395k:  0.1%        0 :  0.0%      113k
alias.c:1170 (record_alias_subset)                      12M:  0.6%        0 :  0.0%       12k:  0.0%       99k:  0.1%       12k
ipa-sra.c:2717 (isra_read_node_info)                    12M:  0.6%        0 :  0.0%     4179k:  0.2%       21k:  0.0%      376k
toplev.c:904 (realloc_for_line_map)                     16M:  0.8%        0 :  0.0%       15M:  0.9%      144 :  0.0%       12
ipa-prop.c:278 (ipa_alloc_node_params)                  16M:  0.8%      266k:  0.4%        0 :  0.0%       22k:  0.0%      366k
symbol-summary.h:555 (allocate_new)                     18M:  0.9%        0 :  0.0%      119k:  0.0%        0 :  0.0%     1171k
 ^^^ here we should point to the caller of get_create

ipa-fnsummary.c:3877 (inline_read_section)              28M:  1.4%        0 :  0.0%      552k:  0.0%      392k:  0.3%      261k
lto-section-in.c:388 (lto_new_in_decl_state)            29M:  1.4%        0 :  0.0%       11M:  0.7%        0 :  0.0%      587k
symtab.c:582 (create_reference)                         35M:  1.7%        0 :  0.0%       50M:  2.9%     1199k:  0.9%      541k
symbol-summary.h:64 (allocate_new)                      46M:  2.2%        0 :  0.0%     2445k:  0.1%        0 :  0.0%     1168k
 ^^^ same here.

stringpool.c:63 (alloc_node)                            47M:  2.3%        0 :  0.0%        0 :  0.0%        0 :  0.0%     1217k
ipa-prop.c:4480 (ipa_read_edge_info)                    51M:  2.4%        0 :  0.0%      260k:  0.0%      404k:  0.3%      531k
hash-table.h:801 (expand)                               81M:  3.9%        0 :  0.0%       80M:  4.7%       88k:  0.1%     3349
 ^^^ some of the memory ends up here that ought to be accounted to the
 caller of expand.
stringpool.c:41 (stringpool_ggc_alloc)                  92M:  4.4%        0 :  0.0%        0 :  0.0%     6600k:  5.2%     1217k
cgraph.h:2712 (allocate_cgraph_symbol)                 148M:  7.1%        0 :  0.0%      115M:  6.7%        0 :  0.0%      767k
cgraph.c:851 (create_edge)                             149M:  7.1%        0 :  0.0%       27M:  1.6%        0 :  0.0%     1743k
ipa-fnsummary.c:3936 (inline_read_section)             174M:  8.3%        0 :  0.0%     4190k:  0.2%       12M: 10.2%      391k
lto/lto-common.c:204 (lto_read_in_decl_state)          200M:  9.6%        0 :  0.0%       65M:  3.8%       19M: 15.5%     1731k
ipa-prop.c:4478 (ipa_read_edge_info)                   210M: 10.0%        0 :  0.0%     1361k:  0.1%       17M: 14.4%     1171k
tree-streamer-in.c:631 (streamer_alloc_tree)           647M: 30.8%       55M: 84.5%     1267M: 73.4%       64M: 52.1%       13M
--------------------------------------------------------------------------------------------------------------------------------------------
GGC memory                                              Leak          Garbage            Freed        Overhead            Times
--------------------------------------------------------------------------------------------------------------------------------------------
Total                                                 2100M:100.0%       65M:100.0%     1726M:100.0%      124M:100.0%       29M
--------------------------------------------------------------------------------------------------------------------------------------------
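
The ^^^ notes in the table above ask for memory allocated inside helpers (allocate_new, expand) to be charged to their callers. One common way to get that, sketched here under my own assumptions, is to pass the call site down through default arguments that are evaluated at the caller. This is not the actual mem-report code: tracked_alloc, get_create_unattributed, read_node_info, read_edge_info and bytes_charged_to are hypothetical names, get_create here is only a stand-in for the real one, and __builtin_FUNCTION/__builtin_LINE are GCC/Clang builtins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>

// Illustrative only: every name below is hypothetical.
// Bytes charged to each (function, line) allocation site.
static std::map<std::pair<std::string, int>, std::size_t> site_bytes;

// Low-level allocator, standing in for something like expand ().
// The default arguments are evaluated at the call site, so a call that
// does not pass them records whoever called tracked_alloc directly.
inline void *
tracked_alloc (std::size_t bytes,
               const char *fn = __builtin_FUNCTION (),
               int line = __builtin_LINE ())
{
  site_bytes[std::make_pair (std::string (fn), line)] += bytes;
  return std::malloc (bytes);
}

// Wrapper that forwards its own caller's location: the allocation is
// charged to whoever called get_create.
inline void *
get_create (std::size_t bytes,
            const char *fn = __builtin_FUNCTION (),
            int line = __builtin_LINE ())
{
  return tracked_alloc (bytes, fn, line);
}

// Wrapper that does not forward: every allocation is charged to the
// wrapper itself, which is the situation the ^^^ notes complain about.
inline void *
get_create_unattributed (std::size_t bytes)
{
  return tracked_alloc (bytes);
}

// Two hypothetical callers, one per wrapper.
inline void *read_node_info () { return get_create (64); }
inline void *read_edge_info () { return get_create_unattributed (32); }

// Sum the bytes charged to all sites inside function FN.
inline std::size_t
bytes_charged_to (const std::string &fn)
{
  std::size_t total = 0;
  for (const auto &entry : site_bytes)
    if (entry.first.first == fn)
      total += entry.second;
  return total;
}
```

With the forwarding wrapper the 64 bytes show up under read_node_info, while the non-forwarding wrapper hides read_edge_info behind a single get_create_unattributed entry, much like the allocate_new and expand rows above.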

One very odd thing is that at the end of WPA of firefox I see:

hash-table.h:801 (expand)                              100M:  2.9%     2088 :  0.0%      193M:  6.4%       90k:  0.0%     3379
tree-ssa-operands.c:265 (ssa_operand_alloc)            104M:  3.0%        0 :  0.0%       39M:  1.3%        0 :  0.0%      105k
stringpool.c:41 (stringpool_ggc_alloc)                 106M:  3.1%        0 :  0.0%        0 :  0.0%     7652k:  2.4%     1362k
ipa-fnsummary.c:3936 (inline_read_section)             174M:  5.1%        0 :  0.0%     4190k:  0.1%       12M:  4.0%      391k
  ^^^ those are size_time_table vectors that ought to be freed.

lto/lto-common.c:204 (lto_read_in_decl_state)          200M:  5.8%        0 :  0.0%       65M:  2.2%       19M:  6.1%     1731k
ipa-prop.c:4478 (ipa_read_edge_info)                   210M:  6.1%        0 :  0.0%     1361k:  0.0%       17M:  5.7%     1171k
  ^^^ those are jumptables that ought to be freed too.

cgraph.c:851 (create_edge)                             285M:  8.3%        0 :  0.0%       33M:  1.1%        0 :  0.0%     3141k
cgraph.h:2712 (allocate_cgraph_symbol)                 417M: 12.1%        0 :  0.0%      121M:  4.0%        0 :  0.0%     1567k
tree-streamer-in.c:631 (streamer_alloc_tree)           758M: 22.0%       96M: 23.0%     1267M: 41.7%       64M: 20.6%       15M
--------------------------------------------------------------------------------------------------------------------------------------------
GGC memory                                              Leak          Garbage            Freed        Overhead            Times
--------------------------------------------------------------------------------------------------------------------------------------------
Total                                                 3453M:100.0%      418M:100.0%     3039M:100.0%      313M:100.0%       49M
--------------------------------------------------------------------------------------------------------------------------------------------

I am not sure where the problem is. It is GGC memory and we release
those summaries after inlining, so there should not be any pointers to
them. At worst it should be accounted as garbage, so it may also be an
accounting bug.

I suppose the first thing to try is to set a breakpoint in the ggc walker
for these and see whether they show up in the final ggc.

Honza
