https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119482

Moncef Mechri <moncef.mechri at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |moncef.mechri at gmail dot com

--- Comment #18 from Moncef Mechri <moncef.mechri at gmail dot com> ---
I was helping a colleague who reported that compilation started hanging
indefinitely after he upgraded from GCC 12 to GCC 14.

The problem was eventually narrowed down to a single usage of the flatten
attribute.
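
For readers unfamiliar with the attribute, here is a minimal sketch of what a
flatten usage looks like. This is not the code from the affected translation
unit (which cannot be shared); the function names leaf, middle and hot_path are
hypothetical. It only shows the shape of the construct: with flatten, GCC tries
to inline every call in the marked function's body, including the calls that
inlining itself exposes, so a deep call tree can be expanded into one very
large function.

```cpp
// Hypothetical illustration only; NOT the code from the affected
// translation unit. All function names here are made up.

static int leaf(int x) { return x * 2 + 1; }

static int middle(int x) { return leaf(x) + leaf(x + 1); }

// flatten asks GCC to inline every call inside hot_path where possible,
// including the calls to leaf() that inlining middle() brings in.
__attribute__((flatten))
int hot_path(int x) { return middle(x) + middle(x + 2); }
```

In a toy case like this the effect is negligible; the cost presumably depends
on how large the fully inlined body becomes.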

Here is the output of -ftime-report when compiling a single translation unit
(which I unfortunately cannot share here) with -O1 (I'm using GCC 15.1 here,
but the issue also occurs with GCC 14.2). You can see that it takes ~27 seconds
without flatten, and over 2 _hours_ with it.

Without flatten:

Time variable                                  wall           GGC
 phase setup                        :   0.01 (  0%)  2134k (  0%)
 phase parsing                      :  18.46 ( 67%)  3398M ( 81%)
 phase lang. deferred               :   2.73 ( 10%)   315M (  8%)
 phase opt and generate             :   5.48 ( 20%)   463M ( 11%)
 phase last asm                     :   0.88 (  3%)    14M (  0%)
 |name lookup                       :   1.02 (  4%)    54M (  1%)
 |overload resolution               :   5.42 ( 20%)   728M ( 17%)
 garbage collection                 :   0.97 (  4%)     0  (  0%)
 dump files                         :   0.02 (  0%)     0  (  0%)
 callgraph construction             :   0.72 (  3%)    41M (  1%)
 callgraph optimization             :   0.09 (  0%)  5246k (  0%)
 callgraph ipa passes               :   0.40 (  1%)    33M (  1%)
 ipa dead code removal              :   0.01 (  0%)     8  (  0%)
 ipa inheritance graph              :   0.01 (  0%)    18k (  0%)
 ipa inlining heuristics            :   0.02 (  0%)   296  (  0%)
 cfg construction                   :   0.01 (  0%)  1140k (  0%)
 cfg cleanup                        :   0.04 (  0%)   106k (  0%)
 trivially dead code                :   0.02 (  0%)     0  (  0%)
 df scan insns                      :   0.10 (  0%)   179k (  0%)
 df live regs                       :   0.04 (  0%)     0  (  0%)
 df reg dead/unused notes           :   0.03 (  0%)  1752k (  0%)
 register information               :   0.01 (  0%)     0  (  0%)
 alias analysis                     :   0.02 (  0%)   656k (  0%)
 rebuild jump labels                :   0.02 (  0%)     0  (  0%)
 preprocessing                      :   2.79 ( 10%)   732M ( 17%)
 parser (global)                    :   1.05 (  4%)   242M (  6%)
 parser struct body                 :   0.94 (  3%)   159M (  4%)
 parser enumerator list             :   0.02 (  0%)  2083k (  0%)
 parser function body               :   0.22 (  1%)    23M (  1%)
 parser inl. func. body             :   0.64 (  2%)    79M (  2%)
 parser inl. meth. body             :   0.81 (  3%)    85M (  2%)
 template instantiation             :  10.48 ( 38%)  1916M ( 46%)
 constant expression evaluation     :   0.70 (  3%)    43M (  1%)
 constraint normalization           :   0.02 (  0%)  2328k (  0%)
 constraint satisfaction            :   0.31 (  1%)    25M (  1%)
 early inlining heuristics          :   0.01 (  0%)   542k (  0%)
 inline parameters                  :   0.04 (  0%)   987k (  0%)
 integration                        :   0.04 (  0%)  8136k (  0%)
 tree gimplify                      :   0.09 (  0%)    18M (  0%)
 tree eh                            :   0.02 (  0%)  3371k (  0%)
 tree CFG construction              :   0.02 (  0%)  8172k (  0%)
 tree CFG cleanup                   :   0.05 (  0%)    15k (  0%)
 tree SSA rewrite                   :   0.03 (  0%)  4918k (  0%)
 tree SSA incremental               :   0.01 (  0%)   405k (  0%)
 tree operand scan                  :   0.01 (  0%)  7802k (  0%)
 tree RPO VN                        :   0.04 (  0%)  2403k (  0%)
 dominance computation              :   0.07 (  0%)     0  (  0%)
 out of ssa                         :   0.04 (  0%)   438k (  0%)
 expand vars                        :   0.02 (  0%)  1332k (  0%)
 expand                             :   0.21 (  1%)    22M (  1%)
 post expand cleanups               :   0.03 (  0%)  2291k (  0%)
 varconst                           :   0.02 (  0%)    48k (  0%)
 loop init                          :   0.02 (  0%)  4723k (  0%)
 integrated RA                      :   0.46 (  2%)    93M (  2%)
 LRA non-specific                   :   0.22 (  1%)   698k (  0%)
 LRA virtuals elimination           :   0.05 (  0%)  2436k (  0%)
 LRA reload inheritance             :   0.02 (  0%)  1304  (  0%)
 LRA create live ranges             :   0.04 (  0%)    11k (  0%)
 LRA hard reg assignment            :   0.02 (  0%)     0  (  0%)
 reload                             :   0.01 (  0%)    93k (  0%)
 thread pro- & epilogue             :   0.07 (  0%)  6298k (  0%)
 shorten branches                   :   0.07 (  0%)     0  (  0%)
 final                              :   1.05 (  4%)  9874k (  0%)
 symout                             :   4.32 ( 16%)   620M ( 15%)
 uninit var analysis                :   0.01 (  0%)    96k (  0%)
 access analysis                    :   0.06 (  0%)    15k (  0%)
 rest of compilation                :   0.19 (  1%)  9717k (  0%)
 TOTAL                              :  27.56         4194M


With flatten:

Time variable                                  wall           GGC
 phase setup                        :   0.01 (  0%)  2134k (  0%)
 phase parsing                      :  18.77 (  0%)  3411M ( 22%)
 phase lang. deferred               :   2.69 (  0%)   312M (  2%)
 phase opt and generate             :7681.85 ( 99%) 11707M ( 76%)
 phase last asm                     :  23.42 (  0%)    57M (  0%)
 |name lookup                       :   1.03 (  0%)    54M (  0%)
 |overload resolution               :   5.39 (  0%)   710M (  5%)
 garbage collection                 :   6.65 (  0%)     0  (  0%)
 dump files                         :   0.01 (  0%)     0  (  0%)
 callgraph construction             :   1.07 (  0%)    71M (  0%)
 callgraph optimization             :   5.15 (  0%)   267k (  0%)
 callgraph functions expansion      :7171.23 ( 93%)  5004M ( 32%)
 callgraph ipa passes               : 508.75 (  7%)  6446M ( 42%)
 ipa function summary               :   8.66 (  0%)   127M (  1%)
 ipa dead code removal              :   0.04 (  0%)    40  (  0%)
 ipa inheritance graph              :   0.01 (  0%)    18k (  0%)
 ipa inlining heuristics            :   0.24 (  0%)  1110k (  0%)
 ipa reference                      :   0.01 (  0%)     0  (  0%)
 ipa profile                        :   0.01 (  0%)     0  (  0%)
 ipa pure const                     :   0.20 (  0%)   146k (  0%)
 ipa free inline summary            :   0.07 (  0%)     0  (  0%)
 ipa modref                         :   0.88 (  0%)   821k (  0%)
 cfg construction                   :   0.54 (  0%)    23M (  0%)
 cfg cleanup                        :  10.21 (  0%)   546k (  0%)
 trivially dead code                :   3.48 (  0%)  2368  (  0%)
 df scan insns                      :   1.26 (  0%)  5280  (  0%)
 df reaching defs                   : 234.07 (  3%)     0  (  0%)
 df live regs                       :  57.27 (  1%)  2887k (  0%)
 df live&initialized regs           :  17.90 (  0%)     0  (  0%)
 df use-def / def-use chains        :   0.29 (  0%)     0  (  0%)
 df live reg subwords               :   0.67 (  0%)     0  (  0%)
 df reg dead/unused notes           :  10.49 (  0%)    66M (  0%)
 register information               :   3.62 (  0%)     0  (  0%)
 alias analysis                     :   4.83 (  0%)    92M (  1%)
 alias stmt walking                 :  53.67 (  1%)    21M (  0%)
 register scan                      :   0.25 (  0%)   644k (  0%)
 rebuild jump labels                :   1.18 (  0%)    24k (  0%)
 preprocessing                      :   2.81 (  0%)   729M (  5%)
 parser (global)                    :   1.11 (  0%)   259M (  2%)
 parser struct body                 :   0.94 (  0%)   160M (  1%)
 parser enumerator list             :   0.02 (  0%)  2083k (  0%)
 parser function body               :   0.22 (  0%)    16M (  0%)
 parser inl. func. body             :   0.73 (  0%)    80M (  1%)
 parser inl. meth. body             :   0.83 (  0%)    88M (  1%)
 template instantiation             :  10.46 (  0%)  1902M ( 12%)
 constant expression evaluation     :   0.77 (  0%)    54M (  0%)
 constraint normalization           :   0.02 (  0%)  2327k (  0%)
 constraint satisfaction            :   0.33 (  0%)    25M (  0%)
 early inlining heuristics          :   0.66 (  0%)   126M (  1%)
 inline parameters                  :   3.59 (  0%)    34M (  0%)
 integration                        :  26.88 (  0%)  5218M ( 34%)
 tree gimplify                      :   0.07 (  0%)    13M (  0%)
 tree eh                            :   1.47 (  0%)  2836k (  0%)
 tree CFG construction              :   0.02 (  0%)  6038k (  0%)
 tree CFG cleanup                   :  31.43 (  0%)    20M (  0%)
 tree copy propagation              :   3.65 (  0%)   764k (  0%)
 tree PTA                           : 451.17 (  6%)    22M (  0%)
 tree SSA other                     :   0.06 (  0%)     0  (  0%)
 tree SSA rewrite                   :   0.02 (  0%)  3603k (  0%)
 tree SSA incremental               :  13.58 (  0%)   191M (  1%)
 tree operand scan                  :  12.65 (  0%)   360M (  2%)
 dominator optimization             :  46.76 (  1%)   345M (  2%)
 backwards jump threading           :  14.22 (  0%)    88M (  1%)
 tree SRA                           :   1.67 (  0%)  9685k (  0%)
 tree CCP                           :  12.99 (  0%)    35M (  0%)
 tree split crit edges              :   0.16 (  0%)    30M (  0%)
 tree reassociation                 :   1.36 (  0%)   501k (  0%)
 tree FRE                           :  10.34 (  0%)    82M (  1%)
 tree RPO VN                        :   0.05 (  0%)  1031k (  0%)
 tree code sinking                  :   1.94 (  0%)    82M (  1%)
 tree linearize phis                :   1.17 (  0%)  3637k (  0%)
 tree backward propagate            :   0.35 (  0%)     0  (  0%)
 tree forward propagate             :   7.09 (  0%)    61M (  0%)
 tree phiprop                       :   0.32 (  0%)  9168  (  0%)
 tree conservative DCE              :   3.58 (  0%)    58k (  0%)
 tree aggressive DCE                :   4.18 (  0%)    95k (  0%)
 tree buildin call DCE              :   0.03 (  0%)     0  (  0%)
 tree DSE                           :   5.11 (  0%)    13M (  0%)
 PHI merge                          :   0.30 (  0%)    20M (  0%)
 tree loop invariant motion         :   3.66 (  0%)  1852k (  0%)
 tree canonical iv                  :   0.58 (  0%)    11M (  0%)
 scev constant prop                 :   0.03 (  0%)  1516k (  0%)
 complete unrolling                 :   1.01 (  0%)    18M (  0%)
 tree iv optimization               :   2.64 (  0%)   188M (  1%)
 tree copy headers                  :   5.97 (  0%)    24M (  0%)
 tree SSA uncprop                   :   0.60 (  0%)     0  (  0%)
 tree switch lowering               :   0.11 (  0%)  8985k (  0%)
 gimple CSE sin/cos                 :   0.07 (  0%)     0  (  0%)
 gimple expand pow                  :   0.07 (  0%)     0  (  0%)
 tree strlen optimization           :   4.64 (  0%)   598k (  0%)
 tree modref                        :   2.57 (  0%)  1203k (  0%)
 dominance frontiers                :   1.47 (  0%)     0  (  0%)
 dominance computation              :  19.37 (  0%)     0  (  0%)
 out of ssa                         :   3.01 (  0%)  1706k (  0%)
 expand vars                        :   2.22 (  0%)    96M (  1%)
 expand                             :   9.75 (  0%)   867M (  6%)
 post expand cleanups               :   0.98 (  0%)    22M (  0%)
 varconst                           :   0.02 (  0%)    48k (  0%)
 lower subreg                       :   0.66 (  0%)   171k (  0%)
 forward prop                       :   8.84 (  0%) 10136k (  0%)
 CSE                                :   5.70 (  0%)    10M (  0%)
 dead code elimination              :   2.67 (  0%)     0  (  0%)
 dead store elim1                   :   3.58 (  0%)    54M (  0%)
 dead store elim2                   :   3.19 (  0%)    46M (  0%)
 loop analysis                      :   0.03 (  0%)     0  (  0%)
 loop init                          :  15.21 (  0%)   136M (  1%)
 loop invariant motion              :  16.84 (  0%)  1856k (  0%)
 loop fini                          :   7.96 (  0%)   486k (  0%)
 branch prediction                  :   7.10 (  0%)    50M (  0%)
 combiner                           : 103.43 (  1%)   159M (  1%)
 if-conversion                      :   3.33 (  0%)    35M (  0%)
 integrated RA                      :  22.01 (  0%)   415M (  3%)
 LRA non-specific                   : 106.16 (  1%)    66M (  0%)
 LRA virtuals elimination           :   1.69 (  0%)    43M (  0%)
 LRA reload inheritance             :   0.01 (  0%)   244k (  0%)
 LRA create live ranges             :  13.16 (  0%)  6020k (  0%)
 LRA hard reg assignment            :   1.47 (  0%)     0  (  0%)
 reload                             :   0.07 (  0%)  2688  (  0%)
 reload CSE regs                    :   3.02 (  0%)    45M (  0%)
 thread pro- & epilogue             :   1.44 (  0%)   286k (  0%)
 if-conversion 2                    :   1.32 (  0%)   958k (  0%)
 combine stack adjustments          :   0.76 (  0%)     0  (  0%)
 hard reg cprop                     :   2.75 (  0%)  1096k (  0%)
 machine dep reorg                  :   0.49 (  0%)   610k (  0%)
 reorder blocks                     :   1.77 (  0%)    33M (  0%)
 shorten branches                   :   2.11 (  0%)  3072  (  0%)
 final                              :  34.04 (  0%)   413M (  3%)
 variable output                    :   0.01 (  0%)    49k (  0%)
 symout                             :  29.11 (  0%)  1611M ( 10%)
 variable tracking                  :6051.73 ( 78%)   196M (  1%)
 var-tracking dataflow              :  23.67 (  0%)    33M (  0%)
 var-tracking emit                  :   3.55 (  0%)   308M (  2%)
 tree if-combine                    :   0.18 (  0%)  2720k (  0%)
 if to switch conversion            :   0.91 (  0%)   255k (  0%)
 uninit var analysis                :   0.08 (  0%)     0  (  0%)
 straight-line strength reduction   :   0.74 (  0%)   278k (  0%)
 address lowering                   :   0.07 (  0%)     0  (  0%)
 access analysis                    :  10.72 (  0%)  6517k (  0%)
 rest of compilation                :  17.33 (  0%)    43M (  0%)
 remove unused locals               :  58.20 (  1%)  2296  (  0%)
 address taken                      :   2.82 (  0%)     0  (  0%)
 rebuild frequencies                :   0.05 (  0%)    11k (  0%)
 repair loop structures             :   0.14 (  0%)     0  (  0%)
 TOTAL                              :7726.74        15491M
