> -----Original Message-----
> From: David Malcolm <[email protected]>
> Sent: Friday, March 20, 2026 10:10
> To: Robert Dubner <[email protected]>; [email protected]
> Subject: Re: COBOL: Hoping for insight with middle-end computation time.
>
> On Thu, 2026-03-19 at 18:22 -0500, Robert Dubner wrote:
> > It happens that COBOL has the COMPUTE statement.  It takes the form
> > of, for example, COMPUTE DDD = AAA + BBB.
> >
> > We implement that by creating a temporary variable, using that as the
> > target of an addition of AAA and BBB, and then doing an assignment to
> > DDD.
> > (Recall that COBOL variables can be quite complex, so we are a long
> > way from being able to do this with a PLUS_EXPR.)
> >
> > We have determined that the way I've been producing GENERIC for that
> > results in N-squared computation time somewhere in the middle end.
> >
> > I have been tearing my hair out trying to figure out what's causing
> > that N-squared behavior.  I commented away the assignment, and I got
> > rid of the arithmetic.  All that's left is the creation of the
> > temporary, and some IF statements that are generated to test for
> > errors along the way of the computation.  (COBOL has a very rich
> > error-detection and exception-generating facility.)
> >
> > The remaining GIMPLE for a single iteration (as shown by
> > -fdump-tree-gimple) is shown below.  The "phase opt and generate"
> > times for repetitions of that GIMPLE are shown here:
> >
> >         phase opt      Factor vs.
> > Repeats & generate (s) previous row
> >   1,000       0.17
> >   2,000       0.49      2.9
> >   4,000       1.55      3.2
> >   8,000       7.56      4.9
> >  16,000      49.31      6.5
> >  32,000     281.29      5.7
> >
> > I have been struggling with this for days.  Is there an explanation
> > for why the following GIMPLE is resulting in that N-squared behavior?
>
> Hi Bob.  You cite the overall "phase opt and generate" times, but no
> data on how this is spread across the various optimization passes.
>
> What's the output of -ftime-report on your workload?
>
> In particular, are there any specific passes that are responsible for
> the growth in time (and thus where we can pinpoint a bug), or is the
> time evenly distributed across all of them?
>
> Sorry if this is a silly question.
> Dave

The only silly thing is that in an egregious display of monumental arrogance 
and ignorance, some years back I decided to take on the problem of code 
generation for a new front end without having had any prior experience with 
GCC internals or compiler theory, and without access to anybody who actually 
knew anything.

I hope you've seen my response to Richard.  For a compilation with

phase opt and generate             :  14.95 ( 92%)   259M ( 88%)

the largest single entry below it is

thread pro- & epilogue             :  10.45 ( 64%)  4104  (  0%)

What that means is yet one more mystery to me.

Here's the entire output of -ftime-report-details:

Time variable                                  wall           GGC
 phase setup                        :   0.01 (  0%)   150k (  0%)
 phase parsing                      :   1.33 (  8%)    36M ( 12%)
 phase opt and generate             :  14.95 ( 92%)   259M ( 88%)
 phase last asm                     :   0.02 (  0%)   377k (  0%)
 phase finalize                     :   0.01 (  0%)     0  (  0%)
 garbage collection                 :   0.09 (  1%)     0  (  0%)
 callgraph construction             :   0.13 (  1%)    12M (  4%)
 `- CFG verifier                    :   0.02 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.16 (  1%)     0  (  0%)
 `- symout                          :   0.01 (  0%)    10M (  4%)
 `- tree SSA verifier               :   0.04 (  0%)     0  (  0%)
 `- garbage collection              :   0.01 (  0%)     0  (  0%)
 callgraph optimization             :   0.04 (  0%)     0  (  0%)
 `- dominance computation           :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.21 (  1%)     0  (  0%)
 `- tree SSA verifier               :   0.06 (  0%)     0  (  0%)
 `- garbage collection              :   0.03 (  0%)     0  (  0%)
 callgraph ipa passes               :   1.14 (  7%)    35M ( 12%)
 ipa function summary               :   0.00 (  0%)  1832  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 ipa inlining heuristics            :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 ipa comdats                        :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 ipa free lang data                 :   0.00 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 ipa free inline summary            :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.04 (  0%)     0  (  0%)
 ipa modref                         :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 cfg construction                   :   0.01 (  0%)  1232  (  0%)
 `- rebuild jump labels             :   0.01 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.03 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.03 (  0%)     0  (  0%)
 cfg cleanup                        :   0.01 (  0%)   208  (  0%)
 `- CFG verifier                    :   0.03 (  0%)     0  (  0%)
 CFG verifier                       :   0.40 (  2%)     0  (  0%)
 trivially dead code                :   0.02 (  0%)     0  (  0%)
 df scan insns                      :   0.07 (  0%)    96  (  0%)
 `- verify RTL sharing              :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 df live regs                       :   0.03 (  0%)     0  (  0%)
 df reg dead/unused notes           :   0.03 (  0%)  2628k (  1%)
 register information               :   0.01 (  0%)     0  (  0%)
 alias analysis                     :   0.01 (  0%)  1024k (  0%)
 rebuild jump labels                :   0.01 (  0%)   168  (  0%)
 parser (global)                    :   1.33 (  8%)    36M ( 12%)
 early inlining heuristics          :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 inline parameters                  :   0.03 (  0%)   672  (  0%)
 `- tree SSA verifier               :   0.02 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.04 (  0%)     0  (  0%)
 tree gimplify                      :   0.09 (  1%)    50M ( 17%)
 tree eh                            :   0.01 (  0%)   584  (  0%)
 tree CFG construction              :   0.02 (  0%)    13M (  5%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 tree CFG cleanup                   :   0.02 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 `- dominance computation           :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 tree SSA other                     :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 tree SSA rewrite                   :   0.03 (  0%)    20M (  7%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 `- tree operand scan               :   0.01 (  0%)  8470k (  3%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 tree operand scan                  :   0.01 (  0%)  8470k (  3%)
 tree SSA verifier                  :   0.36 (  2%)     0  (  0%)
 `- dominance computation           :   0.06 (  0%)     0  (  0%)
 tree STMT verifier                 :   0.88 (  5%)     0  (  0%)
 tree switch lowering               :   0.00 (  0%)     0  (  0%)
 `- tree SSA verifier               :   0.01 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 callgraph verifier                 :   0.06 (  0%)     0  (  0%)
 `- callgraph verifier              :   0.06 (  0%)     0  (  0%)
 dominance computation              :   0.12 (  1%)     0  (  0%)
 out of ssa                         :   0.03 (  0%)   952  (  0%)
 expand vars                        :   0.01 (  0%)  7040k (  2%)
 expand                             :   0.16 (  1%)    86M ( 29%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.01 (  0%)     0  (  0%)
 `- out of ssa                      :   0.03 (  0%)   952  (  0%)
 `- expand vars                     :   0.01 (  0%)  7040k (  2%)
 `- post expand cleanups            :   0.01 (  0%)    96  (  0%)
 post expand cleanups               :   0.01 (  0%)  3712  (  0%)
 `- rebuild jump labels             :   0.01 (  0%)   168  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 jump                               :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.03 (  0%)     0  (  0%)
 `- trivially dead code             :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.02 (  0%)     0  (  0%)
 loop init                          :   0.01 (  0%)  3040  (  0%)
 mode switching                     :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 integrated RA                      :   0.49 (  3%)    12M (  4%)
 `- register information            :   0.01 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 `- df reg dead/unused notes        :   0.03 (  0%)  2628k (  1%)
 `- alias analysis                  :   0.01 (  0%)  1024k (  0%)
 `- trivially dead code             :   0.01 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 `- df live regs                    :   0.01 (  0%)     0  (  0%)
 LRA non-specific                   :   0.22 (  1%)   352  (  0%)
 `- LRA hard reg assignment         :   0.01 (  0%)     0  (  0%)
 `- LRA virtuals elimination        :   0.17 (  1%)    23M (  8%)
 `- LRA create live ranges          :   0.01 (  0%)     0  (  0%)
 LRA virtuals elimination           :   0.17 (  1%)    23M (  8%)
 LRA create live ranges             :   0.01 (  0%)     0  (  0%)
 LRA hard reg assignment            :   0.01 (  0%)     0  (  0%)
 reload                             :   0.00 (  0%)    48  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 `- integrated RA                   :   0.05 (  0%)    96  (  0%)
 `- LRA non-specific                :   0.06 (  0%)    24  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 thread pro- & epilogue             :  10.45 ( 64%)  4104  (  0%)
 `- CFG verifier                    :   0.03 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 `- df live regs                    :   0.01 (  0%)     0  (  0%)
 machine dep reorg                  :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 shorten branches                   :   0.05 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 reg stack                          :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.04 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.02 (  0%)     0  (  0%)
 final                              :   0.10 (  1%)  3887k (  1%)
 `- verify RTL sharing              :   0.04 (  0%)     0  (  0%)
 `- symout                          :   0.01 (  0%)  7192k (  2%)
 symout                             :   0.04 (  0%)    18M (  6%)
 access analysis                    :   0.06 (  0%)    24  (  0%)
 `- tree SSA verifier               :   0.02 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.04 (  0%)     0  (  0%)
 `- dominance computation           :   0.01 (  0%)     0  (  0%)
 early local passes                 :   0.00 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.02 (  0%)     0  (  0%)
 rest of compilation                :   0.10 (  1%)  1754k (  1%)
 `- dominance computation           :   0.01 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.24 (  1%)     0  (  0%)
 `- garbage collection              :   0.04 (  0%)     0  (  0%)
 `- tree STMT verifier              :   0.13 (  1%)     0  (  0%)
 `- tree SSA verifier               :   0.06 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.13 (  1%)     0  (  0%)
 unaccounted post reload            :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 unaccounted late compilation       :   0.00 (  0%)     0  (  0%)
 `- verify RTL sharing              :   0.02 (  0%)     0  (  0%)
 `- CFG verifier                    :   0.01 (  0%)     0  (  0%)
 verify RTL sharing                 :   0.54 (  3%)     0  (  0%)
 TOTAL                              :  16.32          296M
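
One way to answer the "which pass" question mechanically is to rank the rows
of a report like the one above.  What follows is only a minimal Python sketch,
assuming the row layout shown above; the names ROW and top_passes are
hypothetical helpers for illustration and are not part of GCC.

```python
import re

# Matches one report row: timevar name, wall seconds, wall percentage.
# A leading "`- " marks the nested sub-entries printed by -ftime-report-details.
ROW = re.compile(
    r"^\s*(?P<child>`- )?(?P<name>.+?)\s*:\s*(?P<wall>\d+\.\d+) \(\s*(?P<pct>\d+)%\)"
)

def top_passes(report, limit=5):
    """Return the `limit` slowest top-level timevars as (name, wall_seconds)."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m is None or m.group("child"):
            continue  # skip the header, the TOTAL row, and nested sub-entries
        name = m.group("name")
        if name.startswith("phase "):
            continue  # phase totals overlap the per-pass rows
        rows.append((name, float(m.group("wall"))))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:limit]

# A few rows copied from the report above.
sample = """\
 phase opt and generate             :  14.95 ( 92%)   259M ( 88%)
 integrated RA                      :   0.49 (  3%)    12M (  4%)
 thread pro- & epilogue             :  10.45 ( 64%)  4104  (  0%)
 `- CFG verifier                    :   0.03 (  0%)     0  (  0%)
 tree STMT verifier                 :   0.88 (  5%)     0  (  0%)
 TOTAL                              :  16.32          296M
"""

for name, wall in top_passes(sample, limit=3):
    print(f"{wall:6.2f}  {name}")
```

The ranking is only a rough guide: some rows (the verifiers, for example)
are totals accumulated across many passes, so entries can overlap.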


>
>
> >
> > I agree that the conditionals are a bit convoluted, and Jim and I are
> > working to do COMPUTE in a much improved way.  But I still feel a
> > need to understand what's going on!
> >
> > Thanks so much for any insight into what might be happening here.
> >
> >       _intermediate__stack501_503.0.0.data = &_stack501_data_517.0;
> >       _intermediate__stack501_503.0.0.capacity = 16;
> >       _intermediate__stack501_503.0.0.allocated = 16;
> >       _intermediate__stack501_503.0.0.offset = 0;
> >       _intermediate__stack501_503.0.0.name = &"_stack501"[0];
> >       _intermediate__stack501_503.0.0.picture = &""[0];
> >       _intermediate__stack501_503.0.0.initial = 0B;
> >       _intermediate__stack501_503.0.0.parent = 0B;
> >       _intermediate__stack501_503.0.0.occurs_lower = 0;
> >       _intermediate__stack501_503.0.0.occurs_upper = 0;
> >       _intermediate__stack501_503.0.0.attr = 4160;
> >       _intermediate__stack501_503.0.0.type = 6;
> >       _intermediate__stack501_503.0.0.level = 0;
> >       _intermediate__stack501_503.0.0.digits = 37;
> >       _intermediate__stack501_503.0.0.rdigits = 0;
> >       _intermediate__stack501_503.0.0.encoding = 1;
> >       _intermediate__stack501_503.0.0.alphabet = 0;
> >       D.2812 = 0;
> >       D.2813 = 0;
> >       _1013 = D.2812 & 18;
> >       if (_1013 != 0) goto <D.9353>; else goto <D.9354>;
> >       <D.9353>:
> >       goto <D.9355>;
> >       <D.9354>:
> >       D.2814 = 0;
> >       ..pa_erf.1519_1014 = ..pa_erf;
> >       D.2813 = D.2813 | ..pa_erf.1519_1014;
> >       if (D.2813 != 0) goto <D.9357>; else goto <D.9358>;
> >       <D.9357>:
> >       goto <D.9359>;
> >       <D.9358>:
> >       <D.9359>:
> >       <D.9355>:
> >
> >
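
The "N-squared" description can be checked against the timing table quoted
earlier in this message.  If time grows like n**k, then doubling n multiplies
the time by 2**k, so k can be estimated from each pair of adjacent rows.  A
minimal Python sketch, assuming only the six (repeats, seconds) pairs from
that table; local_exponents is a hypothetical helper name.

```python
import math

# (repeats, seconds) pairs copied from the "phase opt and generate" table.
timings = [
    (1_000, 0.17),
    (2_000, 0.49),
    (4_000, 1.55),
    (8_000, 7.56),
    (16_000, 49.31),
    (32_000, 281.29),
]

def local_exponents(samples):
    """Return (n, k) pairs, where k is the exponent in time ~ n**k
    estimated between each sample and the one before it."""
    result = []
    for (n0, t0), (n1, t1) in zip(samples, samples[1:]):
        k = math.log(t1 / t0) / math.log(n1 / n0)
        result.append((n1, k))
    return result

for n, k in local_exponents(timings):
    print(f"{n:>6}: exponent ~ {k:.2f}")
```

A factor of 4 per doubling corresponds to an exponent of exactly 2.  The
factors of 4.9, 6.5 and 5.7 in the last three rows are above 4, so over that
range the growth is steeper than quadratic.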
