[Bug tree-optimization/61515] [4.10 Regression] Extremely long compile time for generated code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61515

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|memory-hog                  |
             Blocks|                            |47344

--- Comment #8 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Richard Biener from comment #7)
> (In reply to Richard Biener from comment #6)
> > 4.9 at -Os takes 5min and ~2.2GB of ram (points-to takes 20%, DF 33%)
> > trunk at -Os takes 15min and ~2.1GB of ram (dominator optimization takes 67%)
> > trunk at -O1 takes 14min and ~2GB of ram (still DOM at 62%)
>
> So it seems that on trunk DOM regressed a lot. Confirmed as 4.10 regression.

With -O1 -fno-tree-dominator-opts behavior is sane:

 df reaching defs          :  86.99 (28%) usr  36.53 (66%) sys  124.04 (34%) wall       0 kB ( 0%) ggc
 tree PTA                  :  10.31 ( 3%) usr   0.15 ( 0%) sys   10.47 ( 3%) wall    1948 kB ( 0%) ggc
 tree SSA incremental      :  98.24 (31%) usr   8.39 (15%) sys  106.63 (29%) wall   13570 kB ( 1%) ggc
 tree loop invariant motion:  14.46 ( 5%) usr   0.10 ( 0%) sys   14.54 ( 4%) wall   14421 kB ( 1%) ggc
 scev constant prop        :  11.14 ( 4%) usr   0.02 ( 0%) sys   11.18 ( 3%) wall   28795 kB ( 2%) ggc
 TOTAL                     : 312.10            55.72            368.92            1523092 kB
312.10user 55.84system 6:09.34elapsed 99%CPU (0avgtext+0avgdata 2073876maxresident)k
19664inputs+13880outputs (130major+38202143minor)pagefaults 0swaps

(well, sane apart from DF and SSA incremental), but with DOM not disabled we get

 df reaching defs          : 108.63 (11%) usr   8.83 (38%) sys  117.08 (12%) wall       0 kB ( 0%) ggc
 tree PTA                  :  10.51 ( 1%) usr   0.09 ( 0%) sys   10.60 ( 1%) wall    1948 kB ( 0%) ggc
 tree SSA incremental      :  99.41 (10%) usr   9.17 (39%) sys  108.93 (11%) wall   13474 kB ( 1%) ggc
 dominator optimization    : 617.08 (63%) usr   0.32 ( 1%) sys  618.97 (61%) wall   56878 kB ( 3%) ggc
 tree loop invariant motion:   2.15 ( 0%) usr   0.11 ( 0%) sys    2.25 ( 0%) wall   12012 kB ( 1%) ggc
 scev constant prop        :  13.31 ( 1%) usr   0.02 ( 0%) sys   13.36 ( 1%) wall   28348 kB ( 2%) ggc
 TOTAL                     : 981.98            23.25           1007.67            1625119 kB
981.98user 23.34system 16:47.79elapsed 99%CPU (0avgtext+0avgdata 2071112maxresident)k
184inputs+66416outputs (5major+18571554minor)pagefaults 0swaps

(yes, this is with release checking)

The testcase has _lots_ of loops (~11000) inside a big outer one; the
maximum nesting isn't too big (10 from what I can see). SSA incremental
is likely the loop-closed SSA rewrite - didn't check.

It should be possible to reduce the testcase somewhat if needed.

Eventually the solution for DOM is to disable the new path-based threading
(if that turns out to be the issue) for -fno-expensive-optimizations
(that is, optimize < 2).
[Bug tree-optimization/61515] [4.10 Regression] Extremely long compile time for generated code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61515

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Created attachment 32952
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=32952&action=edit
stripped down testcase

Stripped down, now it starts to fall off a bit:

 tree SSA incremental      :   1.48 ( 7%) usr   0.42 (33%) sys    1.96 ( 8%) wall    1882 kB ( 1%) ggc
 dominator optimization    :  10.57 (50%) usr   0.03 ( 2%) sys   12.44 (48%) wall    6918 kB ( 4%) ggc
 TOTAL                     :  21.12             1.27             25.67             191269 kB

(just cutting in half three times from the bottom and appending } until it
compiles again)
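The reduction step described above (drop the bottom of the file, then close whatever blocks are still open) could be automated roughly as follows. This is only a sketch: the function name truncate_and_close is hypothetical, it cuts at line boundaries, and it counts braces naively, so braces inside strings, character constants or comments would confuse it.

```c
#include <stdlib.h>
#include <string.h>

/* Keep the first keep_lines lines of src, then append one "}\n" for every
   '{' that is still open at the cut.  Returns a malloc'd string that the
   caller must free, or NULL if allocation fails.  */
static char *truncate_and_close(const char *src, size_t keep_lines)
{
    size_t len = 0, lines = 0, depth = 0;

    /* Scan the kept prefix, tracking the brace nesting depth.  */
    while (src[len] != '\0' && lines < keep_lines) {
        if (src[len] == '{')
            depth++;
        else if (src[len] == '}' && depth > 0)
            depth--;
        else if (src[len] == '\n')
            lines++;
        len++;
    }

    /* Room for the prefix, depth closing lines, and the terminator.  */
    char *out = malloc(len + 2 * depth + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, src, len);
    for (size_t i = 0; i < depth; i++) {
        out[len++] = '}';
        out[len++] = '\n';
    }
    out[len] = '\0';
    return out;
}
```

Calling it with keep_lines set to one eighth of the file's line count would correspond to halving three times; the result still has to be checked by actually compiling it.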
[Bug tree-optimization/61515] [4.10 Regression] Extremely long compile time for generated code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61515

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
All time is spent in invalidate_equivalences.

    /* A new value has been assigned to LHS.  If necessary, invalidate any
       equivalences that are no longer valid.  */
    static void
    invalidate_equivalences (tree lhs, vec<tree> *stack)
    {

      for (unsigned int i = 1; i < num_ssa_names; i++)
        if (ssa_name (i) && SSA_NAME_VALUE (ssa_name (i)) == lhs)
          record_temporary_equivalence (ssa_name (i), NULL_TREE, stack);

      if (SSA_NAME_VALUE (lhs))
        record_temporary_equivalence (lhs, NULL_TREE, stack);
    }

this is obviously quadratic ... nearly all calls are from

    static gimple
    record_temporary_equivalences_from_stmts_at_dest (edge e,
                                                      vec<tree> *stack,
                                                      tree (*simplify) (gimple,
                                                                        gimple),
                                                      bool backedge_seen)
    {
    ...
          else if (backedge_seen)
            invalidate_equivalences (gimple_get_lhs (stmt), stack);
        }
      return stmt;
    }

I think you should record a bitmap of SSA names you need to invalidate
equivalences to and only at the end of this do a single scan over all
SSA names.
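The suggestion above can be illustrated on a toy model. This is a sketch under stated assumptions, not GCC code: SSA names are modelled as array indices, value_of[i] stands in for SSA_NAME_VALUE (0 meaning "no equivalence"), a byte array stands in for a bitmap, and the names invalidate_one and invalidate_batch are hypothetical. The model also ignores any equivalences recorded between two assignments, which a real change would have to account for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-assignment scan, mirroring the quoted function: every call walks
   all n names, so m assignments cost O(n * m).  */
static void invalidate_one(unsigned *value_of, size_t n, unsigned lhs)
{
    for (size_t i = 1; i < n; i++)
        if (value_of[i] == lhs)
            value_of[i] = 0;
    value_of[lhs] = 0;
}

/* Batched variant: mark every assigned name first, then walk all names
   once, for O(n + m) total.  Returns false if allocation fails.  */
static bool invalidate_batch(unsigned *value_of, size_t n,
                             const unsigned *lhs_list, size_t m)
{
    unsigned char *pending = calloc(n, 1);   /* stand-in for a bitmap */
    if (pending == NULL)
        return false;

    for (size_t k = 0; k < m; k++) {
        assert(lhs_list[k] != 0 && lhs_list[k] < n);
        pending[lhs_list[k]] = 1;
    }

    /* Drop the equivalence of i if i itself was assigned, or if the name
       it was equivalent to was assigned.  pending[0] is never set.  */
    for (size_t i = 1; i < n; i++)
        if (pending[i] || pending[value_of[i]])
            value_of[i] = 0;

    free(pending);
    return true;
}
```

In this model both variants leave the same final state, because invalidation only ever clears entries; the batched one just replaces m full scans with a single scan.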
[Bug tree-optimization/61515] [4.10 Regression] Extremely long compile time for generated code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61515

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-06-16
                 CC|                            |law at gcc dot gnu.org
            Version|unknown                     |4.10.0
   Target Milestone|---                         |4.10.0
            Summary|Extremely long compile time |[4.10 Regression] Extremely
                   |for generated code          |long compile time for
                   |                            |generated code
     Ever confirmed|0                           |1

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> 4.9 at -Os takes 5min and ~2.2GB of ram (points-to takes 20%, DF 33%)
> trunk at -Os takes 15min and ~2.1GB of ram (dominator optimization takes 67%)
> trunk at -O1 takes 14min and ~2GB of ram (still DOM at 62%)

So it seems that on trunk DOM regressed a lot. Confirmed as 4.10 regression.