https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #6 from Erich Keane <erich.keane at intel dot com> ---
(In reply to Jonathan Wakely from comment #5)
> (In reply to Erich Keane from comment #3)
> > As you know, "extern template" is a hint to the compiler that we don't need
> > to emit the template as a way to save on compile time.
> > 
> > Both GCC and clang will NOT instantiate these templates in O0 mode. 
> > However, in O1+ modes, both will actually still instantiate the templates in
> > the frontend, BUT only for 'inline' functions.  Basically, we're using
> > 'inline' as a heuristic that there is benefit in sending these functions to
> > the optimizer (basically, sacrificing the compile time gained by 'extern
> > template' in exchange for a better inlining experience).
> 
> Hmm, I've seen different behaviours for clang and g++ in this respect, with
> clang inlining a lot more of std::string's members. So I'm surprised they
> use the same heuristic.
> 
> Do they both instantiate the function templates marked 'inline' even at -O1?
> Presumably not at -O0.

My understanding of Clang is based on a brief debugging session. My
understanding of GCC's behavior is based on a brief amount of time messing
around on godbolt. I could very well be incorrect about either.
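
For anyone who wants to check this themselves, a reduced test case could look
roughly like the sketch below. The names Box, get, get_slow and use are
hypothetical and not taken from libstdc++ or libc++. If the heuristic is as I
described, I would expect get() to be a candidate for inlining at -O1 and above,
while get_slow() stays an out-of-line call, but that is an assumption to verify
against the compiler version at hand, not a guarantee.

```cpp
// Hypothetical reduced example; none of these names come from a real library.
template <typename T>
struct Box {
    T value;

    // Defined in the class body, so it is implicitly 'inline'.
    T get() const { return value; }

    // Declared here and defined out of class without 'inline'.
    T get_slow() const;
};

template <typename T>
T Box<T>::get_slow() const { return value; }

// Explicit instantiation declaration ("extern template"): promises that
// Box<int> is explicitly instantiated somewhere in the program.
extern template struct Box<int>;

int use(const Box<int>& b) {
    // Calls one implicitly-inline member and one non-inline member.
    return b.get() + b.get_slow();
}

// Explicit instantiation definition. In a real project this would live in a
// separate translation unit; it is placed here only so the sketch links on
// its own. Remove it (and compile with -c) when inspecting the generated code.
template struct Box<int>;
```

Comparing the assembly for use() at -O0 and -O2, with the final line removed,
should show whether each member was instantiated and inlined.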


> 
> > In the submitter's case, the std::string constructor calls "_M_construct". 
> > The constructor is inlined, but _M_construct is not, since it never gets to
> > the optimizer.
> > 
> > libc++ uses an __init function to do the same thing as _M_construct, however
> > IT is marked inline, and thus doesn't have the problem.
> > 
> > I believe the submitter wants to have you mark more of the functions in
> > extern-templated classes 'inline' so that it matches the heuristic better.
> 
> And that's what I don't want to do. I think it's wrong for the human to say
> "inline this!" because humans are stupid (well, I am anyway). And I don't
> want to have to examine the GIMPLE/asm again for every new GCC release to
> decide whether 'inline' is still in the right places (and whether the answer
> should be different for every different version of Clang or ICC!)
> 
> And when I say "I don't want to" I mean "I am never ever going to".
> 
> > I don't think that there is a good way to change the compiler itself without
> > making 'extern template' absolutely meaningless.
> 
> I absolutely disagree.
> 
> It would still give a reduction in object file size for cases where the
> compiler decides not to inline, and still make compilation much faster for
> -O0 and -O1.

That is fair; I guess the smaller object files would also slightly reduce link
time. I doubt people would be willing to put up with the STL compiling that
much slower, though (the STL seems to be the major user of this feature, in my
experience).

> One property of -O2 and -O3 is that we try to optimize aggressively even if
> that takes a long time to compile. So we could instantiate things that have
> an explicit instantiation declaration (thus doing "redundant" work) to see
> if inlining them would be beneficial. That would take longer to compile, but
> might produce faster code. If the heuristics decide the instantiation ends
> up too big to inline, it could just discard it (because we know there's a
> definition elsewhere).

That is essentially what the frontends DO, except only with the 'inline'
functions.  If the inliner chooses to not inline it, it gets thrown out (since
we've marked it 'available externally').

> If the only way to get that is to mark every function as 'inline' (and then
> "trick" the compiler into doing all that extra work even at -O1?) then we
> might as well add 'inline' to every single function template in <string> and
> <istream>, <ostream>, <streambuf> etc. so they're all potential candidates
> for inlining.
> 
> And if we have to mark every single function as 'inline' then maybe the
> compiler shouldn't be using it as a hint.

I don't think the idea is to mark EVERY function 'inline', simply ones that are
pretty tiny and really good candidates for inlining.
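
To make that concrete, here is a minimal sketch of the pattern under
discussion: a constructor that forwards to a small helper, with only that helper
marked 'inline'. SmallBuf, init and make_and_measure are made-up names for
illustration; this is not the actual libstdc++ or libc++ code, only the general
shape of a constructor calling something like _M_construct or __init.

```cpp
#include <cstddef>

// Hypothetical class template standing in for an extern-templated library type.
template <typename CharT>
class SmallBuf {
    CharT data_[16];
    std::size_t size_ = 0;

    // Small helper, analogous in role to _M_construct / __init.
    void init(const CharT* s, std::size_t n);

public:
    // Defined in the class body, so implicitly 'inline'.
    SmallBuf(const CharT* s, std::size_t n) { init(s, n); }

    std::size_t size() const { return size_; }
};

// Marked 'inline' explicitly: under the heuristic described in this thread,
// this is what would make the frontend still instantiate it for the optimizer.
template <typename CharT>
inline void SmallBuf<CharT>::init(const CharT* s, std::size_t n) {
    size_ = n < 16 ? n : 16;  // truncate to the fixed capacity
    for (std::size_t i = 0; i < size_; ++i)
        data_[i] = s[i];
}

// Explicit instantiation declaration, as a library header would provide.
extern template class SmallBuf<char>;

std::size_t make_and_measure(const char* s, std::size_t n) {
    return SmallBuf<char>(s, n).size();
}

// Explicit instantiation definition; normally in the library's own source
// file, included here only so the sketch links standalone.
template class SmallBuf<char>;
```

Dropping the 'inline' keyword from init() gives the other variant to compare
against when looking at the generated code.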
