https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Chip Kerchner from comment #12)
> > having always_inline across a deep call stack can exponentially increase 
> > compile-time
> 
> Do you think it would be worth requesting a feature to reduce the
> compilation times in situations like this?  Ideally exponentially is not a
> good thing.

Well, suppose you have

static __attribute__((always_inline)) inline void large_leaf () { /* large */ }

static __attribute__((always_inline)) inline void inter1 () { large_leaf (); }

static __attribute__((always_inline)) inline void inter2 () { inter1 (); inter1 (); }

static __attribute__((always_inline)) inline void inter3 () { inter2 (); inter2 (); }

void final () { inter3 (); inter3 (); }

then of course you end up with 8 copies of large_leaf in 'final' (you asked
for it).  Now, implementation-wise it gets worse because we also fully
materialize the intermediate inter1, inter2 and inter3 with one, two
and four copies respectively.  That's "only" double the work, but for
single-call chains the relative overhead is larger.
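
To make the arithmetic concrete, here is a rough sketch that counts the
large_leaf copies per level for a chain like the one above.  It is only a
model of the counting, not GCC code: count_leaf_copies and its parameters
are hypothetical names made up for illustration, and it assumes every
level calls only the level directly below it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.
   calls[i] is how many times level i calls the level below it
   (level 0 calls large_leaf directly); the last entry is the
   non-inline root, e.g. 'final'.
   Stores the number of large_leaf copies in each level's body in
   copies[i] and returns the sum over the intermediate levels only,
   i.e. the copies that are materialized but not needed in the end.  */
unsigned long
count_leaf_copies (const unsigned *calls, size_t levels,
                   unsigned long *copies)
{
  assert (levels > 0);

  unsigned long intermediate = 0;
  unsigned long below = 1;  /* large_leaf itself is one body */

  for (size_t i = 0; i < levels; i++)
    {
      copies[i] = below * calls[i];
      if (i + 1 < levels)   /* everything but the root is intermediate */
        intermediate += copies[i];
      below = copies[i];
    }
  return intermediate;
}
```

With calls = {1, 2, 2, 2} (inter1, inter2, inter3, final) this gives
1, 2, 4 and 8 copies, so 7 intermediate copies on top of the 8 that were
asked for.  With a single-call chain, calls = {1, 1, 1, 1}, only one copy
is needed in the root but three more are materialized along the way, so
the relative overhead grows with the depth of the chain.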

There are specific cases where we could do better and IIRC some intermediate
updating of the costs blows up here as well (we build a "fat" callgraph
with inlined edges and inlined node clones).

In the end it requires somebody to sit down and see where to improve things
algorithmically - possibly eschewing the simple topological processing
of all inline candidates in favor of first resolving always-inlines in
the most efficient way, taking advantage of the fact that in principle
we do not need their bodies anymore.

I wasn't able to find an existing bug tracking this very specific issue, so
I have opened PR109509 for it.
