> -----Original Message-----
> From: David Malcolm <[email protected]>
> Sent: Friday, March 20, 2026 10:10
> To: Robert Dubner <[email protected]>; [email protected]
> Subject: Re: COBOL: Hoping for insight with middle-end computation time.
>
> On Thu, 2026-03-19 at 18:22 -0500, Robert Dubner wrote:
> > It happens that COBOL has the COMPUTE statement. It takes the form of,
> > for example, COMPUTE DDD = AAA + BBB.
> >
> > We implement that by creating a temporary variable, using that as the
> > target of an addition of AAA and BBB, and then doing an assignment to
> > DDD.
> > (Recall that COBOL variables can be quite complex, so we are a long
> > way from being able to do this with a PLUS_EXPR.)
> >
> > We have determined that the way I've been producing GENERIC for that
> > results in N-squared computation time somewhere in the middle end.
> >
> > I have been tearing my hair out trying to figure out what's causing
> > that N-squared behavior. I commented out the assignment, and I got
> > rid of the arithmetic. All that's left is the creation of the
> > temporary, and some IF statements that are generated to test for
> > errors along the way of the computation. (COBOL has a very rich
> > error-detection and exception-generating facility.)
> >
> > The remaining GIMPLE for a single iteration (as shown by
> > -fdump-tree-gimple) is shown below. The "phase opt and generate"
> > times for repetitions of that GIMPLE are shown here:
> >
> >              phase opt
> >   Repeats   & generate   Factor
> >     1,000         0.17
> >     2,000         0.49      2.9
> >     4,000         1.55      3.2
> >     8,000         7.56      4.9
> >    16,000        49.31      6.5
> >    32,000       281.29      5.7
> >
> > I have been struggling with this for days. Is there an explanation
> > for why the following GIMPLE is resulting in that N-squared behavior?
>
> Hi Bob. You cite the overall "phase opt and generate" times, but no
> data on how this is spread across the various optimization passes.
>
> What's the output of -ftime-report on your workload?
>
> In particular, are there any specific passes that are responsible for
> the growth in time (and thus where we can pinpoint a bug), or is the
> time evenly distributed across all of them?
>
> Sorry if this is a silly question
> Dave

The only silly thing is that, in an egregious display of monumental arrogance
and ignorance, some years back I decided to take on the problem of code
generation for a new front end without any prior experience with GCC
internals or compiler theory, and without access to anybody who actually
knew anything.

I hope you've seen my response to Richard. For a compilation with

    phase opt and generate : 14.95 ( 92%) 259M ( 88%)

the biggest single component is

    thread pro- & epilogue : 10.45 ( 64%) 4104 ( 0%)

What that entry actually covers is yet one more mystery to me.

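On the N-squared question itself, a rough cross-check is to turn the doubling
factors from the table quoted above into a local exponent k, assuming the time
grows like N**k. This is only a sketch: the numbers are copied from that table,
and the helper name local_exponents is made up for illustration.

```python
import math

# (repeats, seconds of "phase opt and generate"), copied from the quoted table
timings = [(1000, 0.17), (2000, 0.49), (4000, 1.55),
           (8000, 7.56), (16000, 49.31), (32000, 281.29)]

def local_exponents(samples):
    """For each consecutive pair of samples, estimate k in time ~ N**k."""
    result = []
    for (n0, t0), (n1, t1) in zip(samples, samples[1:]):
        k = math.log(t1 / t0) / math.log(n1 / n0)
        result.append((n1, k))
    return result

for n, k in local_exponents(timings):
    print(f"N={n:>6}: exponent ~ {k:.2f}")
```

A factor of 4 per doubling corresponds to k = 2. The factors of 4.9 to 6.5 at
the larger sizes correspond to k above 2, so over that range the growth looks
somewhat worse than quadratic.
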
Here's the entire output of -ftime-report-details:
Time variable wall GGC
phase setup : 0.01 ( 0%) 150k ( 0%)
phase parsing : 1.33 ( 8%) 36M ( 12%)
phase opt and generate : 14.95 ( 92%) 259M ( 88%)
phase last asm : 0.02 ( 0%) 377k ( 0%)
phase finalize : 0.01 ( 0%) 0 ( 0%)
garbage collection : 0.09 ( 1%) 0 ( 0%)
callgraph construction : 0.13 ( 1%) 12M ( 4%)
`- CFG verifier : 0.02 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.16 ( 1%) 0 ( 0%)
`- symout : 0.01 ( 0%) 10M ( 4%)
`- tree SSA verifier : 0.04 ( 0%) 0 ( 0%)
`- garbage collection : 0.01 ( 0%) 0 ( 0%)
callgraph optimization : 0.04 ( 0%) 0 ( 0%)
`- dominance computation : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.21 ( 1%) 0 ( 0%)
`- tree SSA verifier : 0.06 ( 0%) 0 ( 0%)
`- garbage collection : 0.03 ( 0%) 0 ( 0%)
callgraph ipa passes : 1.14 ( 7%) 35M ( 12%)
ipa function summary : 0.00 ( 0%) 1832 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
ipa inlining heuristics : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
ipa comdats : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
ipa free lang data : 0.00 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
ipa free inline summary : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.04 ( 0%) 0 ( 0%)
ipa modref : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
cfg construction : 0.01 ( 0%) 1232 ( 0%)
`- rebuild jump labels : 0.01 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.03 ( 0%) 0 ( 0%)
`- CFG verifier : 0.03 ( 0%) 0 ( 0%)
cfg cleanup : 0.01 ( 0%) 208 ( 0%)
`- CFG verifier : 0.03 ( 0%) 0 ( 0%)
CFG verifier : 0.40 ( 2%) 0 ( 0%)
trivially dead code : 0.02 ( 0%) 0 ( 0%)
df scan insns : 0.07 ( 0%) 96 ( 0%)
`- verify RTL sharing : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
df live regs : 0.03 ( 0%) 0 ( 0%)
df reg dead/unused notes : 0.03 ( 0%) 2628k ( 1%)
register information : 0.01 ( 0%) 0 ( 0%)
alias analysis : 0.01 ( 0%) 1024k ( 0%)
rebuild jump labels : 0.01 ( 0%) 168 ( 0%)
parser (global) : 1.33 ( 8%) 36M ( 12%)
early inlining heuristics : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
inline parameters : 0.03 ( 0%) 672 ( 0%)
`- tree SSA verifier : 0.02 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.04 ( 0%) 0 ( 0%)
tree gimplify : 0.09 ( 1%) 50M ( 17%)
tree eh : 0.01 ( 0%) 584 ( 0%)
tree CFG construction : 0.02 ( 0%) 13M ( 5%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
tree CFG cleanup : 0.02 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
`- dominance computation : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
tree SSA other : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
tree SSA rewrite : 0.03 ( 0%) 20M ( 7%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
`- tree operand scan : 0.01 ( 0%) 8470k ( 3%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
tree operand scan : 0.01 ( 0%) 8470k ( 3%)
tree SSA verifier : 0.36 ( 2%) 0 ( 0%)
`- dominance computation : 0.06 ( 0%) 0 ( 0%)
tree STMT verifier : 0.88 ( 5%) 0 ( 0%)
tree switch lowering : 0.00 ( 0%) 0 ( 0%)
`- tree SSA verifier : 0.01 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
callgraph verifier : 0.06 ( 0%) 0 ( 0%)
`- callgraph verifier : 0.06 ( 0%) 0 ( 0%)
dominance computation : 0.12 ( 1%) 0 ( 0%)
out of ssa : 0.03 ( 0%) 952 ( 0%)
expand vars : 0.01 ( 0%) 7040k ( 2%)
expand : 0.16 ( 1%) 86M ( 29%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.01 ( 0%) 0 ( 0%)
`- out of ssa : 0.03 ( 0%) 952 ( 0%)
`- expand vars : 0.01 ( 0%) 7040k ( 2%)
`- post expand cleanups : 0.01 ( 0%) 96 ( 0%)
post expand cleanups : 0.01 ( 0%) 3712 ( 0%)
`- rebuild jump labels : 0.01 ( 0%) 168 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
jump : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.03 ( 0%) 0 ( 0%)
`- trivially dead code : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.02 ( 0%) 0 ( 0%)
loop init : 0.01 ( 0%) 3040 ( 0%)
mode switching : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
integrated RA : 0.49 ( 3%) 12M ( 4%)
`- register information : 0.01 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
`- df reg dead/unused notes : 0.03 ( 0%) 2628k ( 1%)
`- alias analysis : 0.01 ( 0%) 1024k ( 0%)
`- trivially dead code : 0.01 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
`- df live regs : 0.01 ( 0%) 0 ( 0%)
LRA non-specific : 0.22 ( 1%) 352 ( 0%)
`- LRA hard reg assignment : 0.01 ( 0%) 0 ( 0%)
`- LRA virtuals elimination : 0.17 ( 1%) 23M ( 8%)
`- LRA create live ranges : 0.01 ( 0%) 0 ( 0%)
LRA virtuals elimination : 0.17 ( 1%) 23M ( 8%)
LRA create live ranges : 0.01 ( 0%) 0 ( 0%)
LRA hard reg assignment : 0.01 ( 0%) 0 ( 0%)
reload : 0.00 ( 0%) 48 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
`- integrated RA : 0.05 ( 0%) 96 ( 0%)
`- LRA non-specific : 0.06 ( 0%) 24 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
thread pro- & epilogue : 10.45 ( 64%) 4104 ( 0%)
`- CFG verifier : 0.03 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
`- df live regs : 0.01 ( 0%) 0 ( 0%)
machine dep reorg : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
shorten branches : 0.05 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
reg stack : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.04 ( 0%) 0 ( 0%)
`- CFG verifier : 0.02 ( 0%) 0 ( 0%)
final : 0.10 ( 1%) 3887k ( 1%)
`- verify RTL sharing : 0.04 ( 0%) 0 ( 0%)
`- symout : 0.01 ( 0%) 7192k ( 2%)
symout : 0.04 ( 0%) 18M ( 6%)
access analysis : 0.06 ( 0%) 24 ( 0%)
`- tree SSA verifier : 0.02 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.04 ( 0%) 0 ( 0%)
`- dominance computation : 0.01 ( 0%) 0 ( 0%)
early local passes : 0.00 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
rest of compilation : 0.10 ( 1%) 1754k ( 1%)
`- dominance computation : 0.01 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.24 ( 1%) 0 ( 0%)
`- garbage collection : 0.04 ( 0%) 0 ( 0%)
`- tree STMT verifier : 0.13 ( 1%) 0 ( 0%)
`- tree SSA verifier : 0.06 ( 0%) 0 ( 0%)
`- CFG verifier : 0.13 ( 1%) 0 ( 0%)
unaccounted post reload : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
unaccounted late compilation : 0.00 ( 0%) 0 ( 0%)
`- verify RTL sharing : 0.02 ( 0%) 0 ( 0%)
`- CFG verifier : 0.01 ( 0%) 0 ( 0%)
verify RTL sharing : 0.54 ( 3%) 0 ( 0%)
TOTAL : 16.32 296M
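
To pick out the heavy entries without reading the whole table, something like
the following could rank the top-level rows by wall time. It is a minimal
sketch that assumes the row layout shown above (name, colon, wall seconds,
percentage). The regex, the function name rank_passes, and the SAMPLE excerpt
are illustrative and not part of GCC.

```python
import re

# Matches rows such as
#   "thread pro- & epilogue : 10.45 ( 64%) 4104 ( 0%)"
ROW = re.compile(r"^(?P<name>.+?)\s*:\s*(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)")

def rank_passes(report_text, top=5):
    """Return the `top` slowest top-level rows as (name, wall_seconds)."""
    rows = []
    for line in report_text.splitlines():
        line = line.strip()
        # Skip nested "`- child" rows, the phase summaries, and the TOTAL row.
        if line.startswith(("`-", "phase ", "TOTAL")):
            continue
        m = ROW.match(line)
        if m:
            rows.append((m.group("name").strip(), float(m.group("wall"))))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:top]

# A few rows excerpted from the report above.
SAMPLE = """\
phase opt and generate : 14.95 ( 92%) 259M ( 88%)
callgraph ipa passes : 1.14 ( 7%) 35M ( 12%)
parser (global) : 1.33 ( 8%) 36M ( 12%)
`- tree STMT verifier : 0.02 ( 0%) 0 ( 0%)
tree STMT verifier : 0.88 ( 5%) 0 ( 0%)
thread pro- & epilogue : 10.45 ( 64%) 4104 ( 0%)
TOTAL : 16.32 296M
"""

for name, wall in rank_passes(SAMPLE, top=3):
    print(f"{wall:6.2f}s  {name}")
```

Running the same function over reports for several repeat counts would show
which rows grow faster than the input does.
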
>
>
> >
> > I agree that the conditionals are a bit convoluted, and Jim and I are
> > working to do COMPUTE in a much improved way. But I still feel a
> > need to understand what's going on!
> >
> > Thanks so much for any insight into what might be happening here.
> >
> > _intermediate__stack501_503.0.0.data = &_stack501_data_517.0;
> > _intermediate__stack501_503.0.0.capacity = 16;
> > _intermediate__stack501_503.0.0.allocated = 16;
> > _intermediate__stack501_503.0.0.offset = 0;
> > _intermediate__stack501_503.0.0.name = &"_stack501"[0];
> > _intermediate__stack501_503.0.0.picture = &""[0];
> > _intermediate__stack501_503.0.0.initial = 0B;
> > _intermediate__stack501_503.0.0.parent = 0B;
> > _intermediate__stack501_503.0.0.occurs_lower = 0;
> > _intermediate__stack501_503.0.0.occurs_upper = 0;
> > _intermediate__stack501_503.0.0.attr = 4160;
> > _intermediate__stack501_503.0.0.type = 6;
> > _intermediate__stack501_503.0.0.level = 0;
> > _intermediate__stack501_503.0.0.digits = 37;
> > _intermediate__stack501_503.0.0.rdigits = 0;
> > _intermediate__stack501_503.0.0.encoding = 1;
> > _intermediate__stack501_503.0.0.alphabet = 0;
> > D.2812 = 0;
> > D.2813 = 0;
> > _1013 = D.2812 & 18;
> > if (_1013 != 0) goto <D.9353>; else goto <D.9354>;
> > <D.9353>:
> > goto <D.9355>;
> > <D.9354>:
> > D.2814 = 0;
> > ..pa_erf.1519_1014 = ..pa_erf;
> > D.2813 = D.2813 | ..pa_erf.1519_1014;
> > if (D.2813 != 0) goto <D.9357>; else goto <D.9358>;
> > <D.9357>:
> > goto <D.9359>;
> > <D.9358>:
> > <D.9359>:
> > <D.9355>:
> >
> >