Hey Nathan,

Absolutely - this is effectively what we do for calls, where the bulk of the
work of the slow case is performed by a shared routine.  We experimented with
using this technique more broadly in the early stages of developing the JIT,
and back then there was a measurable performance degradation from the call
overhead - but the code has changed a lot since then, and the tradeoffs and
requirements may be different now (particularly across the varying hardware
platforms the JIT has now been ported to).
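
For a rough sense of what sharing buys in code size, here is a
back-of-the-envelope sketch.  The 128-byte slow case is the figure from
Nathan's message below; the per-site stub size and the number of sites are
made-up numbers purely for illustration, not measurements from our JIT.

```python
def inline_size(sites, slow_case_bytes):
    """Total bytes when every site carries its own full slow case."""
    return sites * slow_case_bytes


def shared_size(sites, stub_bytes, shared_routine_bytes):
    """Total bytes when each site keeps a small stub that calls one shared routine."""
    return sites * stub_bytes + shared_routine_bytes


SLOW_CASE_BYTES = 128  # figure quoted in Nathan's message
STUB_BYTES = 16        # hypothetical: argument setup plus a call
SITES = 1000           # hypothetical number of sites using this slow case

inline_total = inline_size(SITES, SLOW_CASE_BYTES)
shared_total = shared_size(SITES, STUB_BYTES, SLOW_CASE_BYTES)
print(inline_total, shared_total)  # 128000 16128
```

The saving is paid for with an extra call and return on every slow-path
execution, which is the overhead mentioned above.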

However, returning to an offset from the return address is probably not a good
plan.  Upon executing a call, processors commonly cache the return address of
the call in a circular buffer (a return address stack) used to predict return
destinations.  When the processor reaches the return instruction it pops a
value from this stack to predict the destination of the return.  If you change
the address you're going to get a mispredict and probably a pipeline flush.
(We modify the return address in our exception handling path, but we don't
expect exceptions to be high performance.)  Also bear in mind that there is no
conditional call instruction on x86, so to eliminate the slow path altogether
you'd have to litter the hot path with inverted branches over the calls out to
the trampolines.  I'd suggest you'd be more likely to find success by keeping
the hot path branching out to slow cases, and experimenting with moving the
bulk of the work of larger slow cases out into shared routines (called from
the slow case).
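
To illustrate the return prediction point, here is a toy model of a return
address stack.  It is only a sketch: the class name, the depth of 16 and the
addresses are invented for the example, and real predictors differ between
processors.

```python
class ReturnAddressStack:
    """Toy model of a fixed-size circular return-address predictor."""

    def __init__(self, depth=16):  # depth is an arbitrary illustrative value
        self.entries = [None] * depth
        self.top = 0          # index of the next slot to write
        self.mispredicts = 0

    def call(self, return_address):
        # On a call, record where the matching return is expected to land.
        self.entries[self.top % len(self.entries)] = return_address
        self.top += 1

    def ret(self, actual_target):
        # On a return, pop the most recent entry and use it as the prediction.
        self.top -= 1
        predicted = self.entries[self.top % len(self.entries)]
        if predicted != actual_target:
            self.mispredicts += 1
        return predicted


ras = ReturnAddressStack()

ras.call(0x1000)
ras.ret(0x1000)          # ordinary return: prediction matches

ras.call(0x2000)
ras.ret(0x2000 + 0x20)   # return to an offset from the return address

print(ras.mispredicts)   # 1
```

In the model the second return is counted as a mispredict because the popped
value no longer matches where control actually goes, which is the case that
would cost a pipeline flush on real hardware.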

This is certainly an interesting area to investigate.

cheers,
G.


On Jun 25, 2010, at 4:20 PM, Nathan Lawrence wrote:

> The size of our JIT generated code is a known memory issue.  According to 
> Oliver the slow cases for some of our operations are on the order of 128 
> bytes.  It occurred to me that we could reduce the size of the JITed code by only 
> compiling the slow case once and having all of the subsequent generated code 
> jump to that specific slow case.  The issue with this is our slow cases jump 
> back to specific locations in the hot path, with potentially different values 
> on the stack, as opposed to a normal function which returns back to a very 
> specific state.  We can circumvent this issue by hand writing the assembly to 
> return to an offset of the return address with the required state.
> 
> What do people think?
> -- Nathan
> _______________________________________________
> squirrelfish-dev mailing list
> [email protected]
> http://lists.webkit.org/mailman/listinfo.cgi/squirrelfish-dev

