On Fri, Jun 14, 2013 at 3:14 AM, Nathan Kurz <[email protected]> wrote:

> Smaller code is a definite win on modern processors.  It's not just
> the size of the L1 instruction cache, but loops can be executed out of
> an even smaller op code cache.  If you can fit a tight loop within
> this, it can be a large win.   Skipping the indirection should help
> also.   A correctly predicted jump should only add 1 cycle, whereas a
> read from L1 takes 4.

Assuming for a moment that thunks don't work out...

It's always bugged me that our global OFFSET variables take up so much space.
I wonder if we could alias multiple offset vars to the same memory location,
similar to how we succeeded in aliasing multiple function names to the
same thunk in the LUCY-256-thunk-hack1 branch.

Actually, let's consider another question for a moment: What's the best we can
possibly achieve given our constraints of position-independent code and
dynamic vtable offsets?

We can't use a JIT compiler to generate executable code, and we can't fix up
the .text section of the shared object.  But let's assume that we can run an
arbitrary initialization routine at DSO load time and that we have the ability
to assign aliases.

I think it would be:

*   For each unique class/method-name that is invoked somewhere in the DSO,
    maintain a local copy of its offset variable in the .bss section of the
    DSO (.bss = non-constant static variables, implicitly initialized to 0).
*   During the initialization routine, assign the proper values to all local
    OFFSET var copies.
*   When invoking the method, access the local OFFSET var using a pc-relative
    load.
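
A rough C sketch of those three steps, in case it makes the idea more concrete. Everything named here (Obj, VTable, Obj_Get_Value_OFFSET, bootstrap_parcel, init_local_offsets, sum_values) is a made-up stand-in rather than actual Lucy or Clownfish code, and the "global" offset is defined in the same file only so the example compiles on its own; in the real scenario it would live in another DSO.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an object and its vtable. */
typedef struct Obj Obj;
typedef int (*Obj_Get_Value_t)(Obj *self);

typedef struct VTable {
    size_t          header;     /* placeholder for per-class metadata */
    Obj_Get_Value_t get_value;  /* method slot, located via the offset */
} VTable;

struct Obj {
    VTable *vtable;
    int     value;
};

/* The global offset var.  Assumption: in a real build this would be
 * exported by another parcel's DSO and reached through the GOT. */
size_t Obj_Get_Value_OFFSET;

/* Local copy: static and zero-initialized, so it lands in .bss.  With
 * -fPIC on x86-64 a static like this is typically reached with a
 * pc-relative load rather than a GOT lookup. */
static size_t local_Obj_Get_Value_OFFSET;

static int Obj_Get_Value_IMP(Obj *self) {
    return self->value;
}

static VTable OBJ_VTABLE;

/* Simulates the defining parcel assigning the dynamic offset. */
static void bootstrap_parcel(void) {
    OBJ_VTABLE.get_value = Obj_Get_Value_IMP;
    Obj_Get_Value_OFFSET = offsetof(VTable, get_value);
}

/* The initialization routine: copy each global offset into its local
 * copy.  Must run after the defining parcel has set the global. */
static void init_local_offsets(void) {
    assert(Obj_Get_Value_OFFSET != 0);
    local_Obj_Get_Value_OFFSET = Obj_Get_Value_OFFSET;
}

/* Method invocation: load the local offset, fetch the function pointer
 * from the vtable at that offset, then make the indirect call. */
static int Obj_Get_Value(Obj *self) {
    char *slot = (char*)self->vtable + local_Obj_Get_Value_OFFSET;
    Obj_Get_Value_t method = *(Obj_Get_Value_t*)slot;
    return method(self);
}

/* A tight loop invoking the same method on the same object. */
static int sum_values(Obj *self, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += Obj_Get_Value(self);
    }
    return total;
}
```

Here the init routine is called by hand after bootstrap_parcel().  Running it automatically at DSO load time would need some platform-specific mechanism, and the ordering relative to the defining parcel's own bootstrap would have to be sorted out.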

I think we manage something like that right now for methods defined within the
current parcel, right Nick?  But for methods defined in another parcel, the
offset var would be a global accessed via the GOT.

I don't think that scheme saves space, but it seems like the theoretical
minimum amount of indirection.  Plus, the jump from the call site ought to
have decent branch prediction, particularly for a loop which invokes the same
method on the same object over and over.

Marvin Humphrey
