[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2010-06-17 10:36:48         |2024-2-16

--- Comment #40 from Richard Biener ---
Reconfirmed. For GCC 14 we use about 2GB of RAM on x86_64 with -O0, and the compile takes 20s. With -O1 that regresses to 60s, with a little less peak memory:

 callgraph ipa passes           :  14.18 ( 23%)
 tree PTA                       :  16.43 ( 27%)

With -O2 memory usage improves further, at about the same compile time.
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

Vladimir Makarov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #39 from Vladimir Makarov ---
(In reply to Jan Hubicka from comment #38)
> > Seems a lot of memory is taken by IRA, too.
> >

I played with the test too. 2^16 functions. It is a lot of insns, and IRA/LRA is not only an RA but in some way a code selector too, which needs to keep a lot of additional structures about insns.

The memory numbers for IRA are just GCed memory allocation. We should be interested in *peak* memory allocation. I don't see any peak for RA. ./cc1 -O2 with valgrind massif looks like this (after Richard's recent patch changing IRA XNEWs to a memory pool):

[valgrind massif heap graph omitted: the y-axis tops out at 206.8 (labeled KB in the original), the x-axis runs from 0 to 1.435 (labeled Mi)]

So it is about 200MB *peak* memory consumption, with RA consuming only 4% of the time, which I think is an excellent result (usually RA takes more time).

 integrated RA                 :   7.21 (  4%)   1.51 (  3%)   9.37 (  3%) 1581592 kB ( 38%)
 LRA non-specific              :   1.35 (  1%)   0.31 (  1%)   1.95 (  1%)    3584 kB (  0%)
 LRA virtuals elimination      :   0.38 (  0%)   0.10 (  0%)   0.43 (  0%)       0 kB (  0%)
 LRA reload inheritance        :   0.16 (  0%)   0.04 (  0%)   0.14 (  0%)       0 kB (  0%)
 LRA create live ranges        :   0.01 (  0%)   0.01 (  0%)   0.08 (  0%)       0 kB (  0%)
 LRA hard reg assignment       :   0.23 (  0%)   0.06 (  0%)   0.17 (  0%)       0 kB (  0%)
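
The distinction drawn in comment #39 between cumulative (GCed) allocation and peak memory can be illustrated with a small sketch. This is a toy model, not how GCC or massif account for memory: the event list and the function name `allocation_stats` are hypothetical and chosen only for illustration.

```python
def allocation_stats(events):
    """Return (total_allocated, peak_live) for a sequence of events.

    events: iterable of (kind, size) pairs, where kind is "alloc" or "free".
    This is a hypothetical toy model of an allocator trace.
    """
    total = 0  # cumulative bytes ever allocated (what a per-pass counter sums up)
    live = 0   # bytes currently allocated
    peak = 0   # highest value `live` ever reached
    for kind, size in events:
        if kind == "alloc":
            total += size
            live += size
            peak = max(peak, live)
        elif kind == "free":
            live -= size
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return total, peak


# Three rounds of "allocate 100, then release 100": the cumulative figure
# keeps growing, but no more than 100 units are ever live at once.
events = [("alloc", 100), ("free", 100)] * 3
total, peak = allocation_stats(events)
```

In this model a pass that repeatedly allocates and releases scratch data shows a large cumulative figure while contributing little to the peak, which is the situation the comment describes for RA.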
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|hubicka at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #38 from Jan Hubicka ---
It is GCC10 but I finally managed to implement the incremental update here. Memory use is about 1.1GB but inliner finishes quite quickly:

 Time variable                     usr           sys          wall           GGC
 phase setup                   :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)    1237 kB (  0%)
 phase parsing                 :   1.29 (  2%)   1.24 (  6%)   2.54 (  3%)  247897 kB (  6%)
 phase lang. deferred          :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       0 kB (  0%)
 phase opt and generate        :  56.81 ( 98%)  19.35 ( 94%)  76.27 ( 97%) 3859026 kB ( 94%)
 garbage collection            :   0.84 (  1%)   0.10 (  0%)   0.93 (  1%)       0 kB (  0%)
 dump files                    :   3.28 (  6%)   1.85 (  9%)   5.30 (  7%)       0 kB (  0%)
 callgraph construction        :   0.70 (  1%)   0.28 (  1%)   1.07 (  1%)   99328 kB (  2%)
 callgraph optimization        :   1.38 (  2%)   0.74 (  4%)   2.03 (  3%)    1026 kB (  0%)
 callgraph functions expansion :  47.27 ( 81%)  15.51 ( 75%)  62.89 ( 80%) 2827825 kB ( 69%)
 callgraph ipa passes          :   8.19 ( 14%)   3.26 ( 16%)  11.45 ( 15%)  709147 kB ( 17%)
 ipa function summary          :   0.34 (  1%)   0.08 (  0%)   0.43 (  1%)   97794 kB (  2%)
 ipa dead code removal         :   0.25 (  0%)   0.01 (  0%)   0.27 (  0%)       0 kB (  0%)
 ipa inheritance graph         :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)       0 kB (  0%)
 ipa devirtualization          :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)       0 kB (  0%)
 ipa cp                        :   0.23 (  0%)   0.02 (  0%)   0.27 (  0%)    7169 kB (  0%)
 ipa inlining heuristics       :   0.19 (  0%)   0.00 (  0%)   0.22 (  0%)       0 kB (  0%)
 ipa function splitting        :   0.02 (  0%)   0.01 (  0%)   0.06 (  0%)       0 kB (  0%)
 ipa comdats                   :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)       0 kB (  0%)
 ipa various optimizations     :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)       0 kB (  0%)
 ipa reference                 :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)       0 kB (  0%)
 ipa profile                   :   0.07 (  0%)   0.00 (  0%)   0.06 (  0%)       0 kB (  0%)
 ipa pure const                :   0.45 (  1%)   0.15 (  1%)   0.47 (  1%)       0 kB (  0%)
 ipa icf                       :   0.22 (  0%)   0.01 (  0%)   0.23 (  0%)       0 kB (  0%)
 ipa SRA                       :   0.13 (  0%)   0.00 (  0%)   0.14 (  0%)    5120 kB (  0%)
 ipa free lang data            :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)       0 kB (  0%)
 ipa free inline summary       :   0.08 (  0%)   0.00 (  0%)   0.07 (  0%)       0 kB (  0%)
 cfg construction              :   0.07 (  0%)   0.01 (  0%)   0.19 (  0%)       0 kB (  0%)
 cfg cleanup                   :   0.73 (  1%)   0.23 (  1%)   0.95 (  1%)       0 kB (  0%)
 trivially dead code           :   0.30 (  1%)   0.06 (  0%)   0.30 (  0%)       0 kB (  0%)
 df scan insns                 :   0.81 (  1%)   0.21 (  1%)   0.93 (  1%)    3072 kB (  0%)
 df multiple defs              :   0.28 (  0%)   0.06 (  0%)   0.41 (  1%)       0 kB (  0%)
 df reaching defs              :   1.48 (  3%)   0.20 (  1%)   1.63 (  2%)       0 kB (  0%)
 df live regs                  :   1.12 (  2%)   0.26 (  1%)   1.33 (  2%)       0 kB (  0%)
 df live regs                  :   0.51 (  1%)   0.19 (  1%)   0.66 (  1%)       0 kB (  0%)
 df must-initialized regs      :   0.11 (  0%)   0.06 (  0%)   0.14 (  0%)       0 kB (  0%)
 df use-def / def-use chains   :   0.36 (  1%)   0.04 (  0%)   0.43 (  1%)       0 kB (  0%)
 df reg dead/unused notes      :   1.69 (  3%)   0.20 (  1%)   1.81 (  2%)   12288 kB (  0%)
 register information          :   0.38 (  1%)   0.04 (  0%)   0.39 (  0%)       0 kB (  0%)
 alias analysis                :   0.82 (  1%)   0.17 (  1%)   1.15 (  1%)   36865 kB (  1%)
 alias stmt walking            :   0.06 (  0%)   0.04 (  0%)   0.07 (  0%)       0 kB (  0%)
 register scan                 :   0.07 (  0%)   0.03 (  0%)   0.11 (  0%)       0 kB (  0%)
 rebuild jump labels           :   0.16 (  0%)   0.06 (  0%)   0.14 (  0%)       0 kB (  0%)
 preprocessing                 :   0.39 (  1%)   0.32 (  2%)   0.49 (  1%)   44508 kB (  1%)
 lexical analysis              :   0.32 (  1%)   0.39 (  2%)   0.73 (  1%)       0 kB (  0%)
 parser (global)               :   0.11 (  0%)   0.08 (  0%)
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #37 from rguenther at suse dot de ---
On Mon, 22 Aug 2016, d.v.a at ngs dot ru wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563
>
> --- Comment #36 from __vic ---
> What about 6.2?

No, maybe GCC 7 if Honza finally manages to get to this...
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #36 from __vic ---
What about 6.2?
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #35 from Richard Biener ---
So ... too late for GCC 6 I guess.
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #34 from Jan Hubicka <hubicka at ucw dot cz> ---
The problem is (as described earlier) the fact that we sum the size of all call statements in a function after every inline decision. Most of the time is spent in calling estimate_edge_size_and_time:

 79.95%  cc1  cc1           [.] _ZL28estimate_calls_size_and_timeP11cgraph_nodePiS1_S1_S1_j3vecIP9tree_node7va_heap6vl_ptrES2_I28ipa_polymorphic_call_contextS5_S6_ES2_IP21ipa_agg
  2.21%  cc1  libc-2.13.so  [.] _int_malloc
  0.59%  cc1  libc-2.13.so  [.] _int_free

Updating summaries incrementally will solve it, but at the moment I do not see any really simple change for GCC 5 (I have looked at this code a couple of times already because of this PR).

Honza
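
The cost difference described in comment #34 can be sketched with a toy model. This is not GCC's inliner: the data layout (a flat list of call-statement sizes), the `decisions` list, and both function names are hypothetical. The sketch only contrasts re-summing everything after each decision with adjusting a running total.

```python
def resum_after_each_decision(call_sizes, decisions):
    """Recompute the total from scratch after every decision.

    decisions: list of (index, new_size) pairs, each standing in for one
    inline decision that changes the size of one call statement.
    Returns (final_total, work), where work counts the sizes that were read.
    """
    sizes = list(call_sizes)
    total = sum(sizes)
    work = len(sizes)
    for index, new_size in decisions:
        sizes[index] = new_size
        total = sum(sizes)   # walks every call statement again
        work += len(sizes)
    return total, work


def update_incrementally(call_sizes, decisions):
    """Keep a running total and apply only the delta of each decision."""
    sizes = list(call_sizes)
    total = sum(sizes)       # one full pass up front
    work = len(sizes)
    for index, new_size in decisions:
        total += new_size - sizes[index]
        sizes[index] = new_size
        work += 1            # constant work per decision
    return total, work


sizes = [4, 4, 4, 4]
decisions = [(0, 1), (2, 7)]
full = resum_after_each_decision(sizes, decisions)
incremental = update_incrementally(sizes, decisions)
```

Both functions produce the same total, but the first does work proportional to the number of calls for every decision, so with many calls and many decisions in one function its cost grows quadratically in this model.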
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org

--- Comment #33 from Richard Biener <rguenth at gcc dot gnu.org> ---
Assigning to Honza - I wonder if there is any low-hanging fruit to improve things for GCC 5 still.
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #32 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri Mar 13 08:52:51 2015
New Revision: 221410

URL: https://gcc.gnu.org/viewcvs?rev=221410&root=gcc&view=rev
Log:
2015-03-12  Richard Biener  <rguent...@suse.de>

	PR middle-end/44563
	* tree-inline.c (gimple_expand_calls_inline): Walk BB backwards
	to avoid quadratic behavior with inline expansion splitting blocks.
	* tree-cfgcleanup.c (cleanup_tree_cfg_bb): Do not merge block
	with the successor if the predecessor will be merged with it.
	* tree-cfg.c (gimple_can_merge_blocks_p): We can't merge the
	entry block with its successor.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-cfg.c
    trunk/gcc/tree-cfgcleanup.c
    trunk/gcc/tree-inline.c
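
One way to see why the walk direction in the tree-inline.c entry can matter is the toy model below. It rests on assumptions that are not stated in the log: every statement in the block is a call to be expanded, and splitting a block costs one unit per statement that follows the split point. The function `count_moved` is hypothetical and does not correspond to any GCC routine.

```python
def count_moved(statements, backwards):
    """Count statements relocated by block splits in a toy model.

    Each statement is treated as a call that gets expanded. Expanding a call
    splits the block after it, and every statement after the split point is
    assumed to move into the new block (one unit of work each).
    """
    block = list(statements)
    moved = 0
    while block:
        # Pick the next call to expand: the last one when walking backwards,
        # the first one when walking forwards.
        idx = len(block) - 1 if backwards else 0
        tail = block[idx + 1:]   # statements that follow the call
        moved += len(tail)
        # Backwards: keep working in the head of the block (before the call).
        # Forwards: keep working in the tail that was just split off.
        block = block[:idx] if backwards else tail
    return moved


forward_cost = count_moved(range(1000), backwards=False)
backward_cost = count_moved(range(1000), backwards=True)
```

Under these assumptions the forward walk moves n(n-1)/2 statements for a block of n calls, while the backward walk never has anything after the split point to move.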
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
      Known to fail|                            |5.0

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
With all the patches I have for now we end up with a pure IPA issue:

 phase opt and generate  : 193.97 (99%) usr  13.82 (93%) sys 207.75 (99%) wall 3311016 kB (94%) ggc
 ipa inlining heuristics : 140.48 (72%) usr   0.44 ( 3%) sys 141.13 (67%) wall  396289 kB (11%) ggc
 dominance computation   :   2.99 ( 2%) usr   1.00 ( 7%) sys   3.89 ( 2%) wall       0 kB ( 0%) ggc
 integrated RA           :   4.05 ( 2%) usr   0.85 ( 6%) sys   5.26 ( 3%) wall 1577496 kB (45%) ggc
 rest of compilation     :   6.53 ( 3%) usr   1.67 (11%) sys   7.91 ( 4%) wall  155664 kB ( 4%) ggc
 unaccounted todo        :   3.82 ( 2%) usr   1.07 ( 7%) sys   4.98 ( 2%) wall       0 kB ( 0%) ggc
 TOTAL                   : 195.46            14.79            210.23           3514948 kB

Everything <= 1% dropped. I wonder what that unaccounted todo is ;)
[Bug ipa/44563] GCC uses a lot of RAM when compiling a large numbers of functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563

--- Comment #31 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri Mar 13 08:47:14 2015
New Revision: 221409

URL: https://gcc.gnu.org/viewcvs?rev=221409&root=gcc&view=rev
Log:
2015-03-10  Richard Biener  <rguent...@suse.de>

	PR middle-end/44563
	* tree-cfgcleanup.c (split_bb_on_noreturn_calls): Remove.
	(cleanup_tree_cfg_1): Do not call it.
	(execute_cleanup_cfg_post_optimizing): Fixup the CFG here.
	(fixup_noreturn_call): Mark the stmt as control altering.
	* tree-cfg.c (execute_fixup_cfg): Do not dump the function here.
	(pass_data_fixup_cfg): Produce a dump file.
	* tree-ssa-dom.c: Include tree-cfgcleanup.h.
	(need_noreturn_fixup): New global.
	(pass_dominator::execute): Fixup queued noreturn calls.
	(optimize_stmt): Queue calls that became noreturn for fixup.
	* tree-ssa-forwprop.c (pass_forwprop::execute): Likewise.
	* tree-ssa-pre.c: Include tree-cfgcleanup.h.
	(el_to_fixup): New global.
	(eliminate_dom_walker::before_dom_children): Queue calls that
	became noreturn for fixup.
	(eliminate): Fixup queued noreturn calls.
	* tree-ssa-propagate.c: Include tree-cfgcleanup.h.
	(substitute_and_fold_dom_walker): New member stmts_to_fixup.
	(substitute_and_fold_dom_walker::before_dom_children): Queue
	calls that became noreturn for fixup.
	(substitute_and_fold): Fixup queued noreturn calls.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-cfg.c
    trunk/gcc/tree-cfgcleanup.c
    trunk/gcc/tree-ssa-dom.c
    trunk/gcc/tree-ssa-forwprop.c
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-propagate.c