https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119482
Moncef Mechri <moncef.mechri at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |moncef.mechri at gmail dot com

--- Comment #18 from Moncef Mechri <moncef.mechri at gmail dot com> ---
I was helping a colleague that reported that compilation started to hang
indefinitely after he upgraded from GCC 12 to GCC 14.

The problem was eventually narrowed down to a single usage of the flatten
attribute.

Here is the output of -ftime-report when compiling a single translation unit
(which I unfortunately cannot share here) with -O1 (I'm using GCC 15.1 here,
but the issue also occurs with GCC 14.2).

You can see that it takes ~27 seconds without flatten, and over 2 _hours_ with
it.

Without flatten:

Time variable                          wall            GGC
 phase setup                      :    0.01 (  0%)   2134k (  0%)
 phase parsing                    :   18.46 ( 67%)   3398M ( 81%)
 phase lang. deferred             :    2.73 ( 10%)    315M (  8%)
 phase opt and generate           :    5.48 ( 20%)    463M ( 11%)
 phase last asm                   :    0.88 (  3%)     14M (  0%)
 |name lookup                     :    1.02 (  4%)     54M (  1%)
 |overload resolution             :    5.42 ( 20%)    728M ( 17%)
 garbage collection               :    0.97 (  4%)       0 (  0%)
 dump files                       :    0.02 (  0%)       0 (  0%)
 callgraph construction           :    0.72 (  3%)     41M (  1%)
 callgraph optimization           :    0.09 (  0%)   5246k (  0%)
 callgraph ipa passes             :    0.40 (  1%)     33M (  1%)
 ipa dead code removal            :    0.01 (  0%)       8 (  0%)
 ipa inheritance graph            :    0.01 (  0%)     18k (  0%)
 ipa inlining heuristics          :    0.02 (  0%)     296 (  0%)
 cfg construction                 :    0.01 (  0%)   1140k (  0%)
 cfg cleanup                      :    0.04 (  0%)    106k (  0%)
 trivially dead code              :    0.02 (  0%)       0 (  0%)
 df scan insns                    :    0.10 (  0%)    179k (  0%)
 df live regs                     :    0.04 (  0%)       0 (  0%)
 df reg dead/unused notes         :    0.03 (  0%)   1752k (  0%)
 register information             :    0.01 (  0%)       0 (  0%)
 alias analysis                   :    0.02 (  0%)    656k (  0%)
 rebuild jump labels              :    0.02 (  0%)       0 (  0%)
 preprocessing                    :    2.79 ( 10%)    732M ( 17%)
 parser (global)                  :    1.05 (  4%)    242M (  6%)
 parser struct body               :    0.94 (  3%)    159M (  4%)
 parser enumerator list           :    0.02 (  0%)   2083k (  0%)
 parser function body             :    0.22 (  1%)     23M (  1%)
 parser inl. func. body           :    0.64 (  2%)     79M (  2%)
 parser inl. meth. body           :    0.81 (  3%)     85M (  2%)
 template instantiation           :   10.48 ( 38%)   1916M ( 46%)
 constant expression evaluation   :    0.70 (  3%)     43M (  1%)
 constraint normalization         :    0.02 (  0%)   2328k (  0%)
 constraint satisfaction          :    0.31 (  1%)     25M (  1%)
 early inlining heuristics        :    0.01 (  0%)    542k (  0%)
 inline parameters                :    0.04 (  0%)    987k (  0%)
 integration                      :    0.04 (  0%)   8136k (  0%)
 tree gimplify                    :    0.09 (  0%)     18M (  0%)
 tree eh                          :    0.02 (  0%)   3371k (  0%)
 tree CFG construction            :    0.02 (  0%)   8172k (  0%)
 tree CFG cleanup                 :    0.05 (  0%)     15k (  0%)
 tree SSA rewrite                 :    0.03 (  0%)   4918k (  0%)
 tree SSA incremental             :    0.01 (  0%)    405k (  0%)
 tree operand scan                :    0.01 (  0%)   7802k (  0%)
 tree RPO VN                      :    0.04 (  0%)   2403k (  0%)
 dominance computation            :    0.07 (  0%)       0 (  0%)
 out of ssa                       :    0.04 (  0%)    438k (  0%)
 expand vars                      :    0.02 (  0%)   1332k (  0%)
 expand                           :    0.21 (  1%)     22M (  1%)
 post expand cleanups             :    0.03 (  0%)   2291k (  0%)
 varconst                         :    0.02 (  0%)     48k (  0%)
 loop init                        :    0.02 (  0%)   4723k (  0%)
 integrated RA                    :    0.46 (  2%)     93M (  2%)
 LRA non-specific                 :    0.22 (  1%)    698k (  0%)
 LRA virtuals elimination         :    0.05 (  0%)   2436k (  0%)
 LRA reload inheritance           :    0.02 (  0%)    1304 (  0%)
 LRA create live ranges           :    0.04 (  0%)     11k (  0%)
 LRA hard reg assignment          :    0.02 (  0%)       0 (  0%)
 reload                           :    0.01 (  0%)     93k (  0%)
 thread pro- & epilogue           :    0.07 (  0%)   6298k (  0%)
 shorten branches                 :    0.07 (  0%)       0 (  0%)
 final                            :    1.05 (  4%)   9874k (  0%)
 symout                           :    4.32 ( 16%)    620M ( 15%)
 uninit var analysis              :    0.01 (  0%)     96k (  0%)
 access analysis                  :    0.06 (  0%)     15k (  0%)
 rest of compilation              :    0.19 (  1%)   9717k (  0%)
 TOTAL                            :   27.56          4194M

With flatten:

Time variable                          wall            GGC
 phase setup                      :    0.01 (  0%)   2134k (  0%)
 phase parsing                    :   18.77 (  0%)   3411M ( 22%)
 phase lang. deferred             :    2.69 (  0%)    312M (  2%)
 phase opt and generate           : 7681.85 ( 99%)  11707M ( 76%)
 phase last asm                   :   23.42 (  0%)     57M (  0%)
 |name lookup                     :    1.03 (  0%)     54M (  0%)
 |overload resolution             :    5.39 (  0%)    710M (  5%)
 garbage collection               :    6.65 (  0%)       0 (  0%)
 dump files                       :    0.01 (  0%)       0 (  0%)
 callgraph construction           :    1.07 (  0%)     71M (  0%)
 callgraph optimization           :    5.15 (  0%)    267k (  0%)
 callgraph functions expansion    : 7171.23 ( 93%)   5004M ( 32%)
 callgraph ipa passes             :  508.75 (  7%)   6446M ( 42%)
 ipa function summary             :    8.66 (  0%)    127M (  1%)
 ipa dead code removal            :    0.04 (  0%)      40 (  0%)
 ipa inheritance graph            :    0.01 (  0%)     18k (  0%)
 ipa inlining heuristics          :    0.24 (  0%)   1110k (  0%)
 ipa reference                    :    0.01 (  0%)       0 (  0%)
 ipa profile                      :    0.01 (  0%)       0 (  0%)
 ipa pure const                   :    0.20 (  0%)    146k (  0%)
 ipa free inline summary          :    0.07 (  0%)       0 (  0%)
 ipa modref                       :    0.88 (  0%)    821k (  0%)
 cfg construction                 :    0.54 (  0%)     23M (  0%)
 cfg cleanup                      :   10.21 (  0%)    546k (  0%)
 trivially dead code              :    3.48 (  0%)    2368 (  0%)
 df scan insns                    :    1.26 (  0%)    5280 (  0%)
 df reaching defs                 :  234.07 (  3%)       0 (  0%)
 df live regs                     :   57.27 (  1%)   2887k (  0%)
 df live&initialized regs         :   17.90 (  0%)       0 (  0%)
 df use-def / def-use chains      :    0.29 (  0%)       0 (  0%)
 df live reg subwords             :    0.67 (  0%)       0 (  0%)
 df reg dead/unused notes         :   10.49 (  0%)     66M (  0%)
 register information             :    3.62 (  0%)       0 (  0%)
 alias analysis                   :    4.83 (  0%)     92M (  1%)
 alias stmt walking               :   53.67 (  1%)     21M (  0%)
 register scan                    :    0.25 (  0%)    644k (  0%)
 rebuild jump labels              :    1.18 (  0%)     24k (  0%)
 preprocessing                    :    2.81 (  0%)    729M (  5%)
 parser (global)                  :    1.11 (  0%)    259M (  2%)
 parser struct body               :    0.94 (  0%)    160M (  1%)
 parser enumerator list           :    0.02 (  0%)   2083k (  0%)
 parser function body             :    0.22 (  0%)     16M (  0%)
 parser inl. func. body           :    0.73 (  0%)     80M (  1%)
 parser inl. meth. body           :    0.83 (  0%)     88M (  1%)
 template instantiation           :   10.46 (  0%)   1902M ( 12%)
 constant expression evaluation   :    0.77 (  0%)     54M (  0%)
 constraint normalization         :    0.02 (  0%)   2327k (  0%)
 constraint satisfaction          :    0.33 (  0%)     25M (  0%)
 early inlining heuristics        :    0.66 (  0%)    126M (  1%)
 inline parameters                :    3.59 (  0%)     34M (  0%)
 integration                      :   26.88 (  0%)   5218M ( 34%)
 tree gimplify                    :    0.07 (  0%)     13M (  0%)
 tree eh                          :    1.47 (  0%)   2836k (  0%)
 tree CFG construction            :    0.02 (  0%)   6038k (  0%)
 tree CFG cleanup                 :   31.43 (  0%)     20M (  0%)
 tree copy propagation            :    3.65 (  0%)    764k (  0%)
 tree PTA                         :  451.17 (  6%)     22M (  0%)
 tree SSA other                   :    0.06 (  0%)       0 (  0%)
 tree SSA rewrite                 :    0.02 (  0%)   3603k (  0%)
 tree SSA incremental             :   13.58 (  0%)    191M (  1%)
 tree operand scan                :   12.65 (  0%)    360M (  2%)
 dominator optimization           :   46.76 (  1%)    345M (  2%)
 backwards jump threading         :   14.22 (  0%)     88M (  1%)
 tree SRA                         :    1.67 (  0%)   9685k (  0%)
 tree CCP                         :   12.99 (  0%)     35M (  0%)
 tree split crit edges            :    0.16 (  0%)     30M (  0%)
 tree reassociation               :    1.36 (  0%)    501k (  0%)
 tree FRE                         :   10.34 (  0%)     82M (  1%)
 tree RPO VN                      :    0.05 (  0%)   1031k (  0%)
 tree code sinking                :    1.94 (  0%)     82M (  1%)
 tree linearize phis              :    1.17 (  0%)   3637k (  0%)
 tree backward propagate          :    0.35 (  0%)       0 (  0%)
 tree forward propagate           :    7.09 (  0%)     61M (  0%)
 tree phiprop                     :    0.32 (  0%)    9168 (  0%)
 tree conservative DCE            :    3.58 (  0%)     58k (  0%)
 tree aggressive DCE              :    4.18 (  0%)     95k (  0%)
 tree buildin call DCE            :    0.03 (  0%)       0 (  0%)
 tree DSE                         :    5.11 (  0%)     13M (  0%)
 PHI merge                        :    0.30 (  0%)     20M (  0%)
 tree loop invariant motion       :    3.66 (  0%)   1852k (  0%)
 tree canonical iv                :    0.58 (  0%)     11M (  0%)
 scev constant prop               :    0.03 (  0%)   1516k (  0%)
 complete unrolling               :    1.01 (  0%)     18M (  0%)
 tree iv optimization             :    2.64 (  0%)    188M (  1%)
 tree copy headers                :    5.97 (  0%)     24M (  0%)
 tree SSA uncprop                 :    0.60 (  0%)       0 (  0%)
 tree switch lowering             :    0.11 (  0%)   8985k (  0%)
 gimple CSE sin/cos               :    0.07 (  0%)       0 (  0%)
 gimple expand pow                :    0.07 (  0%)       0 (  0%)
 tree strlen optimization         :    4.64 (  0%)    598k (  0%)
 tree modref                      :    2.57 (  0%)   1203k (  0%)
 dominance frontiers              :    1.47 (  0%)       0 (  0%)
 dominance computation            :   19.37 (  0%)       0 (  0%)
 out of ssa                       :    3.01 (  0%)   1706k (  0%)
 expand vars                      :    2.22 (  0%)     96M (  1%)
 expand                           :    9.75 (  0%)    867M (  6%)
 post expand cleanups             :    0.98 (  0%)     22M (  0%)
 varconst                         :    0.02 (  0%)     48k (  0%)
 lower subreg                     :    0.66 (  0%)    171k (  0%)
 forward prop                     :    8.84 (  0%)  10136k (  0%)
 CSE                              :    5.70 (  0%)     10M (  0%)
 dead code elimination            :    2.67 (  0%)       0 (  0%)
 dead store elim1                 :    3.58 (  0%)     54M (  0%)
 dead store elim2                 :    3.19 (  0%)     46M (  0%)
 loop analysis                    :    0.03 (  0%)       0 (  0%)
 loop init                        :   15.21 (  0%)    136M (  1%)
 loop invariant motion            :   16.84 (  0%)   1856k (  0%)
 loop fini                        :    7.96 (  0%)    486k (  0%)
 branch prediction                :    7.10 (  0%)     50M (  0%)
 combiner                         :  103.43 (  1%)    159M (  1%)
 if-conversion                    :    3.33 (  0%)     35M (  0%)
 integrated RA                    :   22.01 (  0%)    415M (  3%)
 LRA non-specific                 :  106.16 (  1%)     66M (  0%)
 LRA virtuals elimination         :    1.69 (  0%)     43M (  0%)
 LRA reload inheritance           :    0.01 (  0%)    244k (  0%)
 LRA create live ranges           :   13.16 (  0%)   6020k (  0%)
 LRA hard reg assignment          :    1.47 (  0%)       0 (  0%)
 reload                           :    0.07 (  0%)    2688 (  0%)
 reload CSE regs                  :    3.02 (  0%)     45M (  0%)
 thread pro- & epilogue           :    1.44 (  0%)    286k (  0%)
 if-conversion 2                  :    1.32 (  0%)    958k (  0%)
 combine stack adjustments        :    0.76 (  0%)       0 (  0%)
 hard reg cprop                   :    2.75 (  0%)   1096k (  0%)
 machine dep reorg                :    0.49 (  0%)    610k (  0%)
 reorder blocks                   :    1.77 (  0%)     33M (  0%)
 shorten branches                 :    2.11 (  0%)    3072 (  0%)
 final                            :   34.04 (  0%)    413M (  3%)
 variable output                  :    0.01 (  0%)     49k (  0%)
 symout                           :   29.11 (  0%)   1611M ( 10%)
 variable tracking                : 6051.73 ( 78%)    196M (  1%)
 var-tracking dataflow            :   23.67 (  0%)     33M (  0%)
 var-tracking emit                :    3.55 (  0%)    308M (  2%)
 tree if-combine                  :    0.18 (  0%)   2720k (  0%)
 if to switch conversion          :    0.91 (  0%)    255k (  0%)
 uninit var analysis              :    0.08 (  0%)       0 (  0%)
 straight-line strength reduction :    0.74 (  0%)    278k (  0%)
 address lowering                 :    0.07 (  0%)       0 (  0%)
 access analysis                  :   10.72 (  0%)   6517k (  0%)
 rest of compilation              :   17.33 (  0%)     43M (  0%)
 remove unused locals             :   58.20 (  1%)    2296 (  0%)
 address taken                    :    2.82 (  0%)       0 (  0%)
 rebuild frequencies              :    0.05 (  0%)     11k (  0%)
 repair loop structures           :    0.14 (  0%)       0 (  0%)
 TOTAL                            : 7726.74         15491M
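
For readers unfamiliar with the attribute discussed in this comment, here is a
minimal sketch of what a flatten usage looks like. It is not the affected
translation unit (which could not be shared); the names square, sum_of_squares
and hot_path are hypothetical and chosen only for illustration. GCC documents
flatten as requesting that every call inside the annotated function be inlined
where possible, including calls exposed by that inlining, so a single
annotation can produce one very large function body for later passes to
process.

```cpp
// Illustrative only: a toy call chain, not the code from the report.
// All function names here are hypothetical.

static int square(int x) {
    return x * x;
}

static int sum_of_squares(int a, int b) {
    // Two nested calls that flatten will try to pull into the caller.
    return square(a) + square(b);
}

// With flatten, GCC attempts to inline sum_of_squares() and, transitively,
// both square() calls into hot_path(). In a template-heavy code base the
// equivalent call tree can be far deeper than this one.
__attribute__((flatten)) int hot_path(int a, int b) {
    return sum_of_squares(a, b) + 1;
}
```

Calling hot_path(3, 4) returns 26 with or without the attribute; flatten only
changes how the function is compiled. A toy like this is not expected to
reproduce the slowdown. Assuming the sketch is saved as example.cpp (a made-up
file name), per-pass timings like the tables above can be printed with
g++ -O1 -ftime-report -c example.cpp.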