Russel Winder:
I am getting hints that using --inline in some small (well
trivial really) compute intensive codes actually makes
performance worse by a few percent. Nothing experimental /
statistically significant, just some anecdotal observations. Is
this as it might be expected or is it something that needs more
investigation?

Inlining (like most other things in life) is a matter of
tradeoffs. Inlining removes the cost of function calls, enables
some further small local optimizations, and can improve code
locality in the instruction cache, but it also increases code
size, and this sometimes increases cache misses. Finding the
right amount of inlining is hard. The inliner works heuristically
(and often only on the basis of information known statically,
unless you are using a JIT or profile-driven optimization) to
balance such tradeoffs, trying to find something good enough on
average. But sometimes its choice is suboptimal, so a slowdown of
a few percent on a small kernel is not surprising.
GCC offers user-written compiler hints (the noinline and
always_inline function attributes) to forbid the inlining of a
specific function, or to almost-force the inlining of the
function.
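
For reference, a minimal C sketch of those GCC hints, assuming GCC (or a compiler that accepts GCC-style attributes). The function names here are hypothetical and only for illustration. Both loops compute the same sum, so the only difference is whether the small helper is called or inlined:

```c
/* Hypothetical helper: GCC is told never to inline this function,
   so every use pays for a real call. */
__attribute__((noinline))
static long square_noinline(long x)
{
    return x * x;
}

/* Hypothetical helper: GCC is told to inline this function even
   where its heuristics would otherwise decline to. */
static inline __attribute__((always_inline))
long square_forced(long x)
{
    return x * x;
}

/* Sums i*i for i in [0, n) through the non-inlined helper. */
long sum_squares_call(long n)
{
    long total = 0;
    for (long i = 0; i < n; i++)
        total += square_noinline(i);
    return total;
}

/* Same computation through the force-inlined helper. */
long sum_squares_inlined(long n)
{
    long total = 0;
    for (long i = 0; i < n; i++)
        total += square_forced(i);
    return total;
}
```

Timing the two variants many times on the real workload, rather than relying on a handful of runs, is probably the only way to tell whether a difference of a few percent is real or noise.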
Bye,
bearophile