On Tue, May 28, 2013 at 12:39 PM, Nick Wellnhofer <[email protected]> wrote:
> Here's an interesting blog post discussing this topic:
>
>     http://blog.omega-prime.co.uk/?p=121

Hmm...  And what we would really need to do may be even harder: we need to
assign aliases *at runtime*, when the DSO loads.

Maybe it's impossible.  But it sure is fun to mess around with such wacky
low-level hacks!

> The program below works for me on OS X when compiled with
>
>     cc -O2 -Wl,-alias,_thunk1,_Obj_Add,-alias,_thunk2,_Obj_Sub alias.c -o alias

Thank you for the example.  It was enough to get past the hump and create a
proof of concept branch which works on OS X 10.8 (and probably only that
specific OS version).  I've committed it as `LUCY-256-thunk-hack1`.

> But the thunks should actually be written in assembler, so we don't have to 
> rely on compiler optimizations.

Here's what gdb prints out for the assembler.  I don't understand the `push`
and `pop` instructions.

Dump of assembler code for function cfish_thunk112:
0x0000000100075120 <cfish_thunk112+0>: push   %rbp
0x0000000100075121 <cfish_thunk112+1>: mov    %rsp,%rbp
0x0000000100075124 <cfish_thunk112+4>: mov    0x8(%rdi),%rax
0x0000000100075128 <cfish_thunk112+8>: pop    %rbp
0x0000000100075129 <cfish_thunk112+9>: jmpq   *0x70(%rax)
0x000000010007512c <cfish_thunk112+12>: nopl   0x0(%rax)
End of assembler dump.
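
For readers following along, here is a rough C reading of that dump, as a sketch only. The `push %rbp` / `mov %rsp,%rbp` / `pop %rbp` triple is the usual frame-pointer prologue and epilogue, which cancels out before the jump. The two instructions doing real work load a pointer from offset 8 of the first argument (`%rdi`) and then tail-jump through the function pointer at offset 0x70 (decimal 112, matching the `thunk112` name) of whatever that pointer refers to. The struct layouts and the names `VTable`, `Obj_Add_IMP` and `thunk_demo` below are hypothetical, chosen only so the offsets line up on an LP64 platform; they are not taken from the branch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj Obj;
typedef int (*method_t)(Obj *self, int a, int b);

/* Hypothetical vtable: 14 pointer-sized slots (14 * 8 = 0x70 bytes on LP64)
 * of header data, followed by the method pointer the thunk jumps through. */
typedef struct VTable {
    void    *header[14];
    method_t add;
} VTable;

/* Hypothetical object: one word at offset 0, vtable pointer at offset 8. */
struct Obj {
    size_t  refcount;
    VTable *vtable;
};

/* Stand-in for a real method implementation. */
int Obj_Add_IMP(Obj *self, int a, int b) { (void)self; return a + b; }

/* The thunk: load the vtable from the object, then call through the slot.
 * With optimization on, a compiler may turn this into the load-and-jump
 * pair seen in the dump, but that is not guaranteed. */
int Obj_Add(Obj *self, int a, int b) {
    return self->vtable->add(self, a, b);
}

/* Small self-check: dispatch through the thunk and return the result. */
int thunk_demo(void) {
    VTable vt  = { {0}, Obj_Add_IMP };
    Obj    obj = { 1, &vt };
    int    sum = Obj_Add(&obj, 2, 3);
    assert(sum == 5);
    return sum;
}
```

Callers only ever see `Obj_Add`; which implementation runs depends entirely on the vtable the object carries at call time.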

> A single thunk per offset might be bad for branch prediction. But this could
> be worked around by providing separate thunks for each method.

I think we could evaluate that using cachegrind on c/t/test_lucy.
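
To make the two layouts being compared concrete, here is a minimal sketch. It assumes a simplified object that holds nothing but a pointer to an array of methods; `Foo`, `Bar`, `thunk_slot0`, `DEFINE_THUNK` and `thunks_agree` are made-up names for illustration. With one thunk per offset, every method living in a given slot shares a single indirect branch. With one thunk per method, each method gets its own branch site, provided the compiler and linker keep the identical bodies as distinct functions.

```c
#include <assert.h>

typedef struct Obj Obj;
typedef int (*method_t)(Obj *self, int arg);

/* Hypothetical object layout: just a pointer to a table of methods. */
struct Obj { const method_t *vtable; };

/* Two unrelated classes whose "Run" method happens to sit in slot 0. */
int Foo_Run_IMP(Obj *self, int arg) { (void)self; return arg * 2; }
int Bar_Run_IMP(Obj *self, int arg) { (void)self; return arg + 100; }
const method_t FOO_VTABLE[] = { Foo_Run_IMP };
const method_t BAR_VTABLE[] = { Bar_Run_IMP };

/* One thunk per offset: all slot-0 calls funnel through this one branch. */
int thunk_slot0(Obj *self, int arg) { return self->vtable[0](self, arg); }

/* One thunk per method: same body, but a separate function (and so a
 * separate indirect branch) for each method name. */
#define DEFINE_THUNK(name, slot) \
    int name(Obj *self, int arg) { return self->vtable[slot](self, arg); }

DEFINE_THUNK(Foo_Run, 0)
DEFINE_THUNK(Bar_Run, 0)

/* Both schemes must dispatch to the same implementation. */
int thunks_agree(void) {
    Obj foo = { FOO_VTABLE };
    Obj bar = { BAR_VTABLE };
    assert(Foo_Run(&foo, 4) == thunk_slot0(&foo, 4));
    assert(Bar_Run(&bar, 4) == thunk_slot0(&bar, 4));
    return 1;
}
```

The results are identical either way; the only difference is how many distinct branch sites the predictor gets to track, which is what a profiling run would have to measure.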

Marvin Humphrey
