On 2/27/06, Dave Airlie <[EMAIL PROTECTED]> wrote:
>
> > That's odd. The dispatch routines are 16-byte aligned and the inlining
> > doesn't grow the size of the routine above 16-bytes. Did actual .text size
> > change, or just the library on-disk size?
> >
>
> My impression is that, as caching is all that matters, the overhead of
> branching to a cache-hot function vs. the overhead of inlining the work is
> probably going to be very, very small. This sort of discussion has been
> going on in kernel land for ages. Computers don't work like they did 10-15
> years ago, and inlining isn't the magic it once was; smaller code size is
> much more important.
>

But the code size doesn't (or at least, shouldn't) change -- we're
just replacing some of those NOP padding bytes that are already there
with actual instructions.
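
To put rough numbers on the padding argument, here is a minimal sketch in C.
The helper name padded_size and the byte counts used with it are made up for
illustration (I have not measured the real stubs); only the 16-byte alignment
comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes a routine occupies in .text once it is padded up to the next
 * multiple of `align`. `align` must be a non-zero power of two.
 */
static size_t padded_size(size_t code_bytes, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (code_bytes + align - 1) & ~(align - 1);
}
```

With hypothetical sizes, padded_size(9, 16) and padded_size(14, 16) both give
16, so a stub that grows from 9 to 14 bytes costs nothing in .text. Only
crossing the boundary, as in padded_size(17, 16) giving 32, would grow it.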

i.e. the current dispatch requires two cache lines for every dispatch
function (except for any near enough to the beginning of the table to
share a cache line with _x86_64_get_dispatch -- no more than NewList,
EndList, CallList, and CallLists), while inlining it requires only a
single cache line.
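
The cache line counting can be sketched in C as below. This assumes 64-byte
cache lines and a made-up layout (the helper in the first 16 bytes of the
table, stubs in the following 16-byte slots); the function names and
addresses are hypothetical, not taken from the real library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is common on x86-64 but not universal. */
#define CACHE_LINE 64u

/* Index of the cache line that contains a given byte address. */
static uint64_t line_of(uint64_t addr)
{
    return addr / CACHE_LINE;
}

/*
 * Distinct cache lines fetched for one dispatch call: the lines covering
 * the stub, plus any lines covering the out-of-line helper that the stub
 * does not already cover. Pass helper_size == 0 to model the helper being
 * inlined into the stub.
 */
static unsigned lines_touched(uint64_t stub, uint64_t stub_size,
                              uint64_t helper, uint64_t helper_size)
{
    assert(stub_size > 0);
    uint64_t s_first = line_of(stub);
    uint64_t s_last = line_of(stub + stub_size - 1);
    unsigned n = (unsigned)(s_last - s_first + 1);

    if (helper_size == 0)
        return n; /* inlined: nothing extra to fetch */

    uint64_t h_last = line_of(helper + helper_size - 1);
    for (uint64_t l = line_of(helper); l <= h_last; ++l)
        if (l < s_first || l > s_last)
            ++n;
    return n;
}
```

Under those assumptions, a stub far from the helper, e.g.
lines_touched(1024, 16, 0, 16), touches 2 lines, while the inlined form
lines_touched(1024, 16, 0, 0) touches 1. A stub right next to the helper,
lines_touched(16, 16, 0, 16), also touches only 1, which is the exception
for the first few entries in the table.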

While that may mean that the predecode info and/or trace cache data
for _x86_64_get_dispatch isn't shared by all the dispatch functions,
_x86_64_get_dispatch is only two instructions, and I'm pretty sure that
functions of that size will always get inlined by a good compiler.
(The AMD manual suggests that anything less than 25 instructions be
inlined. The Intel manual is, of course, useless and impossible to find.)
