On Sat, Mar 21, 2026 at 10:25 AM Jakub Jelinek <[email protected]> wrote:
>
> On Fri, Mar 20, 2026 at 10:17:50PM -0500, Robert Dubner wrote:
> > 10,000 repeats of that code in the C++ program compiles in 1.36 seconds.
> > 20,000 repeats 3.18 seconds.
> > 40,000 repeats 7.92 seconds.
> >
> > 10,000 repeats in the COBOL program 16.76 seconds.
> > 20,000 repeats in the COBOL program 97.40 seconds.
> > 40,000 repeats in the COBOL program 551.56 seconds.
>
> Perhaps also look at -fdump-tree-ssa-vops dump differences too, that will
> make it clearer if there aren't differences in what is TREE_ADDRESSABLE and
> what is not, or what is a global var and what could have been rewritten into
> SSA form.

Looking at the -gimple dump, there's a single scope block with try/finally
and all clobbers in the finally block.  I suppose, given that cobol emits a
single function only, we could elide the end-of-function clobbers for it.
The D.nnn are registers, not memory AFAICS.
What's a bit odd is that there seem to be global variables
called __gg__treeplet_{1,2,3}{f,o,s} into which we store the addresses
of aaa, etc.:
__gg__treeplet_1f.58_103 = __gg__treeplet_1f;
_104 = 0;
_105 = __gg__treeplet_1f.58_103 + _104;
*_105 = &aaa.73.0;
and the actual computation happens in
__gg__add_fixed_phase1 (2, 2, 0, 0,
__gg__arithmetic_rounds.64_121, 0, &D.321);
which I assume indicates that arguments get passed via global variables
rather than formal arguments.  That might in practice help code
generation at -O0, though.
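
To make the two calling styles concrete, here is a minimal C sketch.  It is
only an illustration of what the dump seems to suggest, not how libgcobol is
actually written: treeplet_1f, add_via_globals and add_via_params are
hypothetical names, and plain ints stand in for the COBOL field descriptors.

```c
#include <stddef.h>

/* Hypothetical stand-in for a global operand slot such as
   __gg__treeplet_1f: an array of operand addresses. */
static const int *treeplet_1f[2];

/* Style A: the caller stores operand addresses into the global array
   first; the callee only receives the operand count. */
static int add_via_globals(size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += *treeplet_1f[i];
    return sum;
}

/* Style B: the operands arrive as ordinary formal arguments. */
static int add_via_params(const int *a, const int *b)
{
    return *a + *b;
}

/* Caller side of style A: two stores to global memory, then the call. */
static int demo_globals(int aaa, int bbb)
{
    treeplet_1f[0] = &aaa;
    treeplet_1f[1] = &bbb;
    return add_via_globals(2);
}

/* Caller side of style B: the addresses are passed directly. */
static int demo_params(int aaa, int bbb)
{
    return add_via_params(&aaa, &bbb);
}
```

In both styles the operands are address-taken; the difference is only whether
the addresses travel through global memory or through the argument list.
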
I suspect it's simply very many live variables (the D.nnn) that need stack
space and make the x86 stack var analysis code slow.
I do wonder about
_intermediate__stack327_329.0.0.data = &_stack327_data_330.0;
_intermediate__stack327_329.0.0.capacity = 16;
_intermediate__stack327_329.0.0.allocated = 16;
_intermediate__stack327_329.0.0.offset = 0;
_intermediate__stack327_329.0.0.name = &"_stack327"[0];
_intermediate__stack327_329.0.0.picture = &""[0];
_intermediate__stack327_329.0.0.initial = 0B;
_intermediate__stack327_329.0.0.parent = 0B;
_intermediate__stack327_329.0.0.occurs_lower = 0;
_intermediate__stack327_329.0.0.occurs_upper = 0;
_intermediate__stack327_329.0.0.attr = 4160;
_intermediate__stack327_329.0.0.type = 6;
_intermediate__stack327_329.0.0.level = 0;
_intermediate__stack327_329.0.0.digits = 37;
_intermediate__stack327_329.0.0.rdigits = 0;
_intermediate__stack327_329.0.0.encoding = 1;
_intermediate__stack327_329.0.0.alphabet = 0;
__gg__initialize_variable_clean (&_intermediate__stack327_329.0.0, 32768);
so why do we have inline initialization of the variable but also a call
that apparently does something similar?  That seems to be an abstraction
that is at least oddly designed.
With all of the above it would help if those temporary variables like
_intermediate__stack327_329.0.0 or the D.nnn were wrapped
in their own scope block so as to limit their lifetime, given that at least
the stack ones are address-taken.  So in C terms, have
{
compute ddd = aaa + bbb
}
{
compute ddd = aaa + bbb
}
...
so the frontend emits those temporaries into their own scope wrapping
each compute to limit their lifetime.  Scopes in GENERIC are BLOCKs
and in the IL you'd have BIND_EXPRs.
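
Spelled out as compilable C, the idea looks roughly like the sketch below.
The names are hypothetical (add_fixed is a stand-in for a runtime helper such
as __gg__add_fixed_phase1, and a plain int stands in for the intermediate
descriptor); the point is only that each temporary is declared inside the
block of the statement that needs it.

```c
/* Hypothetical stand-in for the runtime helper.  It receives the address
   of the temporary, which is what makes the temporary address-taken. */
static void add_fixed(const int *a, const int *b, int *result)
{
    *result = *a + *b;
}

int compute_twice(int aaa, int bbb)
{
    int ddd = 0;

    {   /* scope for the first "compute ddd = aaa + bbb" */
        int tmp;                    /* lifetime ends at the closing brace */
        add_fixed(&aaa, &bbb, &tmp);
        ddd = tmp;
    }
    {   /* scope for the second "compute ddd = aaa + bbb" */
        int tmp;                    /* distinct variable, non-overlapping lifetime */
        add_fixed(&aaa, &bbb, &tmp);
        ddd = tmp;
    }
    return ddd;
}
```

Because the two tmp variables are never live at the same time, the compiler is
free to give them the same stack slot; with all temporaries in one
function-wide scope it has to assume their lifetimes may overlap.
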
Richard.
>
> Jakub
>