> On 11/5/19 11:36 AM, Jan Hubicka wrote:
> > Hi,
> > this patch adds object allocators to manage IPA summaries. This reduces
> > malloc overhead and fragmentation. I now get peak memory use 7.5GB instead
> > of 10GB for firefox WPA because reduced fragmentation leads to less COWs
> > after forks.
>
> That sounds promising.
>
> > Additional bonus is that we now have statistics gathered by mem-reports
> > which makes my life easier, too.
>
> What's currently bad with the detailed memory statistics? I updated the
> code that one should see the allocation for the underlying hash_map and
> vec?

I currently get:

------------------------------------------------------------------------------------------------------------------------------------
Pool name                  Allocation pool                                     Pools  Leak          Peak    Times           Elt size
------------------------------------------------------------------------------------------------------------------------------------
tree_scc                   lto/lto-common.c:2709 (read_cgraph_and_symbols)     1      0 : 0.0%      99M     3169k: 43.7%    32
IPA histogram              ipa-profile.c:77 (__static_initialization_and_de    1      16 : 0.0%     16      1 : 0.0%        16
IPA-PROP ref descriptions  ipa-prop.c:170 (__static_initialization_and_dest    1      226k: 0.3%    226k    9670 : 0.1%     24
function summary           ipa-fnsummary.c:557 (ipa_fn_summary_alloc)          1      6145k: 7.0%   6257k   391k: 5.4%      16
function summary           ipa-pure-const.c:136 (__base_ctor )                 1      6863k: 7.9%   9449k   590k: 8.1%      16
edge predicates            ipa-fnsummary.c:93 (__static_initialization_and_    1      8327k: 9.5%   8385k   209k: 2.9%      40
call summary               ipa-sra.c:436 (__base_ctor )                        1      18M: 21.3%    21M     1393k: 19.2%    16
call summary               ipa-fnsummary.h:276 (__base_ctor )                  1      46M: 54.0%    46M     1483k: 20.5%    32
------------------------------------------------------------------------------------------------------------------------------------
Pool name                  Allocation pool                                     Pools  Leak          Peak    Times           Elt size
------------------------------------------------------------------------------------------------------------------------------------
Total                                                                          9      85M
------------------------------------------------------------------------------------------------------------------------------------

This is quite readable, though we may give them different names and update
constructors. Not a big deal IMO.
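
To illustrate the constructor point: the `(__base_ctor )` entries presumably show up because the location is recorded where the pool object is constructed, not where the summary is created. Below is a minimal sketch of a counted pool that takes its origin from the caller through defaulted arguments. All names (`pool_stats`, `counted_pool`, `example_summary`, `make_and_exercise_pool`) are made up for illustration; this is not the object_allocator code. The only thing it relies on is that `__builtin_FILE`, `__builtin_LINE` and `__builtin_FUNCTION` (GCC/Clang builtins) are evaluated at the call site when used as default arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not GCC's object_allocator.
struct pool_stats
{
  std::string name;          // "Pool name" column
  std::string origin;        // "Allocation pool" column: file:line (function)
  std::size_t elt_size = 0;  // bytes per element ("Elt size")
  std::size_t live = 0;      // bytes currently handed out
  std::size_t peak = 0;      // high-water mark of live ("Peak")
  std::size_t times = 0;     // number of allocations ("Times")
};

template <typename T>
class counted_pool
{
  // A free slot stores the link to the next free slot in place.
  union slot
  {
    slot *next;
    alignas (T) unsigned char storage[sizeof (T)];
  };

public:
  // The defaulted arguments are evaluated in the caller, so the report
  // names the function creating the pool rather than this constructor.
  explicit counted_pool (const char *name,
                         const char *file = __builtin_FILE (),
                         int line = __builtin_LINE (),
                         const char *func = __builtin_FUNCTION ())
  {
    m_stats.name = name;
    m_stats.origin = std::string (file) + ":" + std::to_string (line)
                     + " (" + func + ")";
    m_stats.elt_size = sizeof (slot);
  }

  counted_pool (const counted_pool &) = delete;
  counted_pool &operator= (const counted_pool &) = delete;

  // Releases the blocks; objects still live are not destroyed here.
  ~counted_pool ()
  {
    for (void *block : m_blocks)
      ::operator delete (block);
  }

  template <typename... Args>
  T *allocate (Args &&...args)
  {
    if (!m_free)
      grow ();
    slot *s = m_free;
    m_free = s->next;
    m_stats.times++;
    m_stats.live += sizeof (slot);
    if (m_stats.live > m_stats.peak)
      m_stats.peak = m_stats.live;
    return new (s) T (std::forward<Args> (args)...);
  }

  void remove (T *obj)
  {
    assert (m_stats.live >= sizeof (slot));
    obj->~T ();
    slot *s = new (static_cast<void *> (obj)) slot;
    s->next = m_free;
    m_free = s;
    m_stats.live -= sizeof (slot);
  }

  const pool_stats &stats () const { return m_stats; }

private:
  // Grab one block of slots from the heap and thread it onto the free list.
  void grow ()
  {
    const std::size_t n = 64;
    slot *block = static_cast<slot *> (::operator new (n * sizeof (slot)));
    m_blocks.push_back (block);
    for (std::size_t i = 0; i < n; i++)
      {
        slot *s = new (block + i) slot;
        s->next = m_free;
        m_free = s;
      }
  }

  pool_stats m_stats;
  slot *m_free = nullptr;
  std::vector<void *> m_blocks;
};

struct example_summary { int size = 0; int time = 0; };

// Creates a pool, allocates three elements, frees one, and returns a
// snapshot of the statistics taken at that point.
pool_stats
make_and_exercise_pool ()
{
  counted_pool<example_summary> pool ("call summary");
  example_summary *a = pool.allocate ();
  example_summary *b = pool.allocate ();
  example_summary *c = pool.allocate ();
  pool.remove (b);
  pool_stats snapshot = pool.stats ();
  pool.remove (a);
  pool.remove (c);
  return snapshot;
}
```

With something along these lines, the origin recorded for the pool is `make_and_exercise_pool`, because the defaulted arguments are filled in at the call. A summary class wrapping such a pool would need to forward the same defaulted arguments through its own constructor to get the same effect.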

For GGC statistics I see:

varpool.c:137 (create_empty)                        7924k: 0.4%   0 : 0.0%      3214k: 0.2%   0 : 0.0%      87k
cgraph.c:939 (cgraph_allocate_init_indirect_info    8566k: 0.4%   0 : 0.0%      1395k: 0.1%   0 : 0.0%      113k
alias.c:1170 (record_alias_subset)                  12M: 0.6%     0 : 0.0%      12k: 0.0%     99k: 0.1%     12k
ipa-sra.c:2717 (isra_read_node_info)                12M: 0.6%     0 : 0.0%      4179k: 0.2%   21k: 0.0%     376k
toplev.c:904 (realloc_for_line_map)                 16M: 0.8%     0 : 0.0%      15M: 0.9%     144 : 0.0%    12
ipa-prop.c:278 (ipa_alloc_node_params)              16M: 0.8%     266k: 0.4%    0 : 0.0%      22k: 0.0%     366k
symbol-summary.h:555 (allocate_new)                 18M: 0.9%     0 : 0.0%      119k: 0.0%    0 : 0.0%      1171k
^^^ here we should point to the caller of get_create
ipa-fnsummary.c:3877 (inline_read_section)          28M: 1.4%     0 : 0.0%      552k: 0.0%    392k: 0.3%    261k
lto-section-in.c:388 (lto_new_in_decl_state)        29M: 1.4%     0 : 0.0%      11M: 0.7%     0 : 0.0%      587k
symtab.c:582 (create_reference)                     35M: 1.7%     0 : 0.0%      50M: 2.9%     1199k: 0.9%   541k
symbol-summary.h:64 (allocate_new)                  46M: 2.2%     0 : 0.0%      2445k: 0.1%   0 : 0.0%      1168k
^^^ same here.
stringpool.c:63 (alloc_node)                        47M: 2.3%     0 : 0.0%      0 : 0.0%      0 : 0.0%      1217k
ipa-prop.c:4480 (ipa_read_edge_info)                51M: 2.4%     0 : 0.0%      260k: 0.0%    404k: 0.3%    531k
hash-table.h:801 (expand)                           81M: 3.9%     0 : 0.0%      80M: 4.7%     88k: 0.1%     3349
^^^ some of the memory ends up here which ought to be accounted to the caller of expand.
stringpool.c:41 (stringpool_ggc_alloc)              92M: 4.4%     0 : 0.0%      0 : 0.0%      6600k: 5.2%   1217k
cgraph.h:2712 (allocate_cgraph_symbol)              148M: 7.1%    0 : 0.0%      115M: 6.7%    0 : 0.0%      767k
cgraph.c:851 (create_edge)                          149M: 7.1%    0 : 0.0%      27M: 1.6%     0 : 0.0%      1743k
ipa-fnsummary.c:3936 (inline_read_section)          174M: 8.3%    0 : 0.0%      4190k: 0.2%   12M: 10.2%    391k
lto/lto-common.c:204 (lto_read_in_decl_state)       200M: 9.6%    0 : 0.0%      65M: 3.8%     19M: 15.5%    1731k
ipa-prop.c:4478 (ipa_read_edge_info)                210M: 10.0%   0 : 0.0%      1361k: 0.1%   17M: 14.4%    1171k
tree-streamer-in.c:631 (streamer_alloc_tree)        647M: 30.8%   55M: 84.5%    1267M: 73.4%  64M: 52.1%    13M
---------------------------------------------------------------------------------------------------------------------
GGC memory                                          Leak          Garbage       Freed         Overhead      Times
---------------------------------------------------------------------------------------------------------------------
Total                                               2100M:100.0%  65M:100.0%    1726M:100.0%  124M:100.0%   29M
---------------------------------------------------------------------------------------------------------------------

One very odd thing is that at the end of WPA of firefox I see:

hash-table.h:801 (expand)                           100M: 2.9%    2088 : 0.0%   193M: 6.4%    90k: 0.0%     3379
tree-ssa-operands.c:265 (ssa_operand_alloc)         104M: 3.0%    0 : 0.0%      39M: 1.3%     0 : 0.0%      105k
stringpool.c:41 (stringpool_ggc_alloc)              106M: 3.1%    0 : 0.0%      0 : 0.0%      7652k: 2.4%   1362k
ipa-fnsummary.c:3936 (inline_read_section)          174M: 5.1%    0 : 0.0%      4190k: 0.1%   12M: 4.0%     391k
^^^ those are size_tale vectors that ought to be freed.
lto/lto-common.c:204 (lto_read_in_decl_state)       200M: 5.8%    0 : 0.0%      65M: 2.2%     19M: 6.1%     1731k
ipa-prop.c:4478 (ipa_read_edge_info)                210M: 6.1%    0 : 0.0%      1361k: 0.0%   17M: 5.7%     1171k
^^^ those are jumptables that ought to be freed too.
cgraph.c:851 (create_edge)                          285M: 8.3%    0 : 0.0%      33M: 1.1%     0 : 0.0%      3141k
cgraph.h:2712 (allocate_cgraph_symbol)              417M: 12.1%   0 : 0.0%      121M: 4.0%    0 : 0.0%      1567k
tree-streamer-in.c:631 (streamer_alloc_tree)        758M: 22.0%   96M: 23.0%    1267M: 41.7%  64M: 20.6%    15M
---------------------------------------------------------------------------------------------------------------------
GGC memory                                          Leak          Garbage       Freed         Overhead      Times
---------------------------------------------------------------------------------------------------------------------
Total                                               3453M:100.0%  418M:100.0%   3039M:100.0%  313M:100.0%   49M
---------------------------------------------------------------------------------------------------------------------

I am not sure where the problem is. It is GGC memory and we release those
summaries after inlining, so there should not be any pointers to them. At
worst it should be accounted as garbage, so it may also be an accounting bug.
I suppose the first thing to try is to set a breakpoint in the GGC walker of
these and see if it shows up in the final ggc.

Honza