On Fri, Jun 14, 2013 at 3:14 AM, Nathan Kurz <[email protected]> wrote:
> OK. And the part of "real function" we care about here is that the
> function exists as an entry in the symbol table?
Yes. Thanks for helping to clarify.
> Likely Global, but the exact type will be determined by how we can make
> things work with each object file format.
The method invocation functions would definitely need entries in the PLT.
Even if the methods they represent are parcel-scoped (i.e. not visible outside
the DSO), the offset at which those methods can be found in the vtable can
only be determined at runtime.
Does that mean they can only be global, e.g. STV_DEFAULT visibility under ELF?
If so, that's an undesirable side effect -- we'd rather not expose
parcel-scoped methods via the DSO export mechanism.
>> If I've thought this through correctly, the technique should work because
>> all those method invocation wrappers would have compiled down to exactly the
>> same tail-call assembly anyway. The work of setting up the argument list is
>> already done at the invocation site and it depends on the signature, not the
>> wrapper code.
>
> Exactly, although as Nick points out below you would get better speed
> with an individual thunk per method.
Even if we give each method a dedicated thunk, the jump target is not a sure
thing. Consider a class Foo which defines Do_Stuff() and a class FooJr which
inherits from Foo and overrides Do_Stuff(). The following loop would result
in the same thunk dispatching to two different locations.
    Foo *foo1 = Foo_new();
    Foo *foo2 = (Foo*)FooJr_new();
    for (int i = 0; i < 0x100000; i++) {
        Foo_Do_Stuff(foo1);
        Foo_Do_Stuff(foo2);
    }
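
To make that concrete, here is a minimal, self-contained sketch of the same situation. It is not Clownfish's real object layout: the struct, the `_IMP` functions, the vtable arrays, `DO_STUFF_SLOT`, and `dispatch_demo()` are all hypothetical names made up for illustration, and the slot is a compile-time constant only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Foo and FooJr. */
typedef struct Foo Foo;
typedef int (*Do_Stuff_t)(Foo *self);

struct Foo {
    const Do_Stuff_t *vtable;  /* per-class table of method pointers */
};

/* Assumed slot for Do_Stuff; really this would be fixed at load time. */
enum { DO_STUFF_SLOT = 0 };

static int Foo_Do_Stuff_IMP(Foo *self)   { (void)self; return 1; }
static int FooJr_Do_Stuff_IMP(Foo *self) { (void)self; return 2; }

static const Do_Stuff_t FOO_VTABLE[]   = { Foo_Do_Stuff_IMP };
static const Do_Stuff_t FOOJR_VTABLE[] = { FooJr_Do_Stuff_IMP };

/* The one wrapper shared by every call site: an indirect call whose
 * target depends on the class of the object passed in. */
static int Foo_Do_Stuff(Foo *self) {
    assert(self != NULL && self->vtable != NULL);
    return self->vtable[DO_STUFF_SLOT](self);
}

/* Alternate between a Foo and a FooJr, as in the loop above, and return
 * the sum of the results so the two targets can be told apart. */
int dispatch_demo(int iterations) {
    Foo foo1 = { FOO_VTABLE };
    Foo foo2 = { FOOJR_VTABLE };
    int total = 0;
    for (int i = 0; i < iterations; i++) {
        total += Foo_Do_Stuff(&foo1);  /* lands in Foo_Do_Stuff_IMP */
        total += Foo_Do_Stuff(&foo2);  /* lands in FooJr_Do_Stuff_IMP */
    }
    return total;
}
```

Each pass through the loop sends the same indirect call to two different implementations, which is the case a per-method thunk can't turn into a fixed jump target.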
Additionally, since individual methods cannot be mapped to specific thunks
until runtime, giving each method a dedicated thunk would mean generating
multiple thunks per method implementation to ensure that each one was unique.
In theory we'd need a full complement of thunks for each method, which would
be a huge waste.
(I'm not explaining that very well, so let me know if it would help to
elaborate.)
> On Thu, Jun 13, 2013 at 9:13 PM, Marvin Humphrey <[email protected]>
> wrote:
>> On Tue, May 28, 2013 at 12:39 PM, Nick Wellnhofer <[email protected]>
>> wrote:
>
>
>> Hmm... And what we would really need to do may be even harder: we need to
>> assign aliases *at runtime*, when the DSO loads.
>
> Does it _need_ to do that, or would it be useful to start by aiming
> for just compile time?
It's a requirement. We need to be able to configure dynamic method dispatch
at runtime in order to support distributed development a la CPAN and avoid
imposing fragile ABI constraints on our users a la C++.
>> Maybe it's impossible. But it sure is fun to mess around with such wacky
>> low-level hacks!
>
> I think it's definitely possible, it's just a question of how ugly and
> low level it will have to be. My guess would be it can be quite
> clean, but will require something different for each platform. Worst
> case it involves dynamically creating, writing, and loading an object
> file.
Wow, that worst case of yours is impressively ugly!  On some level I'd
love to make it work just so we could brag about it. :)
> I resorted to cutting-and-pasting URL parts, but I think this is the
> diff from master.
> https://git-wip-us.apache.org/repos/asf?p=lucy.git;a=commitdiff;h=a665f5aec756e3973e3633770897b10b26cad875;hp=5890545507e185f3201966dfac733210ae88343a
> Is there an easier way to diff arbitrary commits via the web interface?
I would typically use the command line like so:
    git fetch
    git checkout LUCY-256-thunk-hack1
    git diff master
    # or alternately...
    git diff HEAD~4   # `4` is the number of commits to go back.
I also recommend this to make the diffs colored:
    git config --global color.ui true
I don't know about the web interface.
>> I think we could evaluate that using cachegrind on c/t/test_lucy.
>
> I don't think you can get reliable branch prediction information from
> cachegrind. It uses an ultra simplistic simulated prediction
> algorithm that is not representative. But 'perf' on Linux makes
> getting real branch prediction statistics very simple.
Thank you, that's helpful.
> Marvin, could you tell me more about the goal?
The goal of the thunk technique is just to minimize the amount of "hot" code.
Because it's possible to alias thunks, we'd need fewer of them than we need
OFFSET variables under the current scheme.
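
Here's a rough portable-C approximation of what I mean by aliasing. Treat it as a sketch under loud assumptions: the real technique would rely on assembly-level tail calls and symbol aliases rather than function pointers, every method here shares one signature only to keep the C legal, offsets are plain slot indexes rather than byte offsets, and all of the names (`Obj`, `THUNKS`, `bind()`, the `Bar` class, the demo functions) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj Obj;
typedef int (*Method_t)(Obj *self);
struct Obj { const Method_t *vtable; };

/* Current scheme: one OFFSET variable per method, read at each call. */
static size_t Foo_Get_Size_OFFSET;
static size_t Bar_Get_Count_OFFSET;

static int Foo_Get_Size_via_OFFSET(Obj *self) {
    return self->vtable[Foo_Get_Size_OFFSET](self);
}

/* Thunk scheme: one thunk per vtable slot, whatever the method is named. */
static int thunk_slot0(Obj *self) { return self->vtable[0](self); }
static int thunk_slot1(Obj *self) { return self->vtable[1](self); }
static const Method_t THUNKS[] = { thunk_slot0, thunk_slot1 };
#define NUM_THUNKS (sizeof(THUNKS) / sizeof(THUNKS[0]))

/* Method names become aliases for thunks, bound when the DSO loads. */
static Method_t Foo_Get_Size;
static Method_t Bar_Get_Count;

/* Record a method's slot and hand back the thunk for that slot. */
static Method_t bind(size_t *offset_var, size_t slot) {
    assert(slot < NUM_THUNKS);
    *offset_var = slot;
    return THUNKS[slot];
}

static int Obj_Destroy_IMP(Obj *self)   { (void)self; return 0; }
static int Foo_Get_Size_IMP(Obj *self)  { (void)self; return 42; }
static int Bar_Get_Count_IMP(Obj *self) { (void)self; return 7; }
static const Method_t FOO_VTABLE[] = { Obj_Destroy_IMP, Foo_Get_Size_IMP };
static const Method_t BAR_VTABLE[] = { Obj_Destroy_IMP, Bar_Get_Count_IMP };

/* Bind both methods to slot 1; returns 1 if they share a single thunk. */
int thunks_are_aliased(void) {
    Foo_Get_Size  = bind(&Foo_Get_Size_OFFSET, 1);
    Bar_Get_Count = bind(&Bar_Get_Count_OFFSET, 1);
    return Foo_Get_Size == Bar_Get_Count;
}

/* Dispatch through the shared thunk on objects of two unrelated classes. */
int aliased_dispatch_sum(void) {
    Obj foo = { FOO_VTABLE };
    Obj bar = { BAR_VTABLE };
    (void)thunks_are_aliased();  /* make sure the aliases are bound */
    assert(Foo_Get_Size_via_OFFSET(&foo) == Foo_Get_Size(&foo));
    return Foo_Get_Size(&foo) + Bar_Get_Count(&bar);
}
```

Two unrelated methods that land in slot 1 need two OFFSET variables under the current scheme, but only one thunk stays hot under the aliasing scheme.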
> In particular, can one assume that all the DSO's are loaded at the same
> time, or would you like to handle the case of a late load changing the
> hierarchy of earlier loaded modules?
DSO loads will have to be ordered, because the offsets in a child class will
depend on the offsets in ancestor classes.
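
The ordering constraint can be sketched like this. It's a toy model, not our actual bootstrap code: `ClassSpec`, its fields, `ClassSpec_assign_offsets()`, and the two demo functions are hypothetical, and offsets are counted in slots rather than bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-class bookkeeping; not Clownfish's real structures. */
typedef struct ClassSpec {
    const struct ClassSpec *parent;  /* NULL for a root class */
    size_t num_novel;                /* methods this class introduces */
    size_t first_novel_slot;         /* set at load time */
    size_t vtable_size;              /* set at load time */
    int    loaded;
} ClassSpec;

/* Assign slots for a class's novel methods.  Returns 0 on success, or -1
 * if the parent has not been loaded yet. */
int ClassSpec_assign_offsets(ClassSpec *klass) {
    size_t base = 0;
    assert(klass != NULL);
    if (klass->parent != NULL) {
        if (!klass->parent->loaded) { return -1; }
        base = klass->parent->vtable_size;  /* novel methods go after inherited ones */
    }
    klass->first_novel_slot = base;
    klass->vtable_size      = base + klass->num_novel;
    klass->loaded           = 1;
    return 0;
}

/* Load parent, then child; return the child's first novel slot, or -1. */
long child_first_slot(size_t parent_methods, size_t child_methods) {
    ClassSpec foo    = { NULL, parent_methods, 0, 0, 0 };
    ClassSpec foo_jr = { &foo, child_methods, 0, 0, 0 };
    if (ClassSpec_assign_offsets(&foo) != 0)    { return -1; }
    if (ClassSpec_assign_offsets(&foo_jr) != 0) { return -1; }
    return (long)foo_jr.first_novel_slot;
}

/* Try to load the child before its parent; returns the status code. */
int child_before_parent_status(void) {
    ClassSpec foo    = { NULL, 3, 0, 0, 0 };
    ClassSpec foo_jr = { &foo, 2, 0, 0, 0 };
    return ClassSpec_assign_offsets(&foo_jr);
}
```

The child's first novel slot is whatever the parent's vtable size turned out to be, so it can't be computed until the parent's DSO has been processed.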
Marvin Humphrey