I'm not currently working on dynamic compilation, but I went to
Vasanth Bala's talks at the Dynamic Compilation Workshop before POPL
and later at MIT. I, too, was intrigued by the possibility of doing
this sort of dynamic compilation for functional languages. I've given
the technical points some thought, and I'd like to share what I've
come up with.
First, Dynamo, Crusoe, and their kin focus on optimizing run-time
program traces. It was quite clear from Bala's talk that traces
should not be assumed to match function boundaries; we may compile
some trace through a function (the recursive case, for example) while
interpreting the rest (e.g. the base case). The efficiency of the
technique depends on compiling _only_ the code that is being
frequently executed, as we have a limited cache to hold that
compiled code.
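
To make the trace-selection machinery concrete, here is a minimal
sketch in Haskell of Dynamo-style hot-spot detection. All the names
here (Addr, hotThreshold, bumpCounter) are mine, not Dynamo's, and the
threshold is only of the right order; the tech report has the real
policy:

    import qualified Data.Map as Map

    type Addr = Int                 -- a bytecode (or machine) address

    hotThreshold :: Int
    hotThreshold = 50               -- fire after ~50 visits

    -- Count arrivals at candidate trace heads (e.g. targets of
    -- backward branches); once a head runs hot, we would start
    -- recording the instructions executed from there as a new trace.
    bumpCounter :: Map.Map Addr Int -> Addr -> (Map.Map Addr Int, Bool)
    bumpCounter counts pc =
      let n = Map.findWithDefault 0 pc counts + 1
      in  (Map.insert pc n counts, n >= hotThreshold)
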
The biggest promise of this approach is that it eliminates most
unknown control flow. In C code this means things like function
returns (a trace doesn't necessarily end when a function returns),
calls to dynamically-linked functions, and calls through function
pointers or to virtual functions. All of these are far more common in
Haskell than in C; this makes the technique look very attractive
indeed.
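
To see why this matters so much for Haskell, consider how much of
even trivial code is indirect (my example, not Bala's):

    -- Each comment marks a control transfer whose target a static
    -- compiler cannot know:
    compose :: (b -> c) -> (a -> b) -> a -> c
    compose f g x = f (g x)
      -- the calls through f and g are "function pointer" calls, and
      -- forcing the thunk built for (g x) is yet another indirect
      -- jump through a code pointer stored in the heap
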
On the other hand, Dynamo at least does not always succeed. Very
unstructured symbolic code, such as GCC and Go, slows down and causes
Dynamo to give up. This is less promising; symbolic applications seem
like just the sort of thing we want to re-code in Haskell, as they
involve the complex data structure manipulation that is so much easier
with algebraic types.
A good step toward trying this out, though, would be to devise a
trace-compilable bytecode for Haskell. I'm not sure that e.g. G-code
really fits the bill---we want a bytecode that we can unroll
dynamically without having to keep track of a lot of context
information (such as what's on the stack). We would also want "usual
case" bytecodes that verify unknown control flow against a guess at
the actual control flow.
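
Here is the sort of thing I have in mind---purely a sketch, with every
constructor name invented for illustration. The Guess* forms carry
both a predicted destination and a bail-out address:

    type Addr = Int

    data Bytecode
      = Push Int                 -- ordinary stack manipulation ...
      | Apply
      | Enter                    -- unknown control flow: force the
                                 -- closure on top of the stack
      | Return                   -- unknown control flow: pop a
                                 -- return address
      | GuessEnter Addr Addr     -- "usual case": expect the closure's
                                 -- code at the first Addr; if the
                                 -- guard fails, bail out of the trace
                                 -- to the interpreter at the second
      | GuessReturn Addr Addr    -- likewise for function return
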
I suspect that traces of pure bytecode which unroll control flow with
guesses ought to yield pretty good speedups over current bytecode
interpreters (Hugs, nhc); I'd be very interested to see whether this
approach can compete with machine code that doesn't do any guessing.
In any case, this seems like a good proof of concept---if we can't
speed bytecode up this way, I wouldn't expect trace compilation to do
any better than any other kind of JIT compilation.
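
And a sketch of how such a trace might run, reusing the hypothetical
Bytecode and Addr types above; interp, exec, and the State fields are
stand-ins for the real interpreter, not a proposal for one:

    data State = State { pc      :: Addr   -- interpreter resume point
                       , topAddr :: Addr   -- code addr of top closure
                       , retTop  :: Addr   -- top of the return stack
                       }

    interp :: Addr -> State -> State       -- the plain interpreter
    interp a st = st { pc = a }            -- (placeholder body)

    exec :: Bytecode -> State -> State     -- one ordinary step
    exec _ st = st                         -- (placeholder body)

    -- A compiled trace is straight-line code; the guards are its only
    -- branches, and a failed guard just drops back to interpretation.
    runTrace :: [Bytecode] -> State -> State
    runTrace (GuessEnter want bail : rest) st
      | topAddr st == want = runTrace rest st   -- guess held
      | otherwise          = interp bail st     -- guess failed: bail
    runTrace (GuessReturn want bail : rest) st
      | retTop st == want  = runTrace rest st
      | otherwise          = interp bail st
    runTrace (op : rest) st = runTrace rest (exec op st)
    runTrace [] st          = interp (pc st) st -- trace over; resume
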
I'd love to get the chance to do this experiment some day.
-Jan-Willem Maessen
[EMAIL PROTECTED]
Here are some references, btw:
Dynamo: http://www.hpl.hp.com/cambridge/projects/Dynamo
Read the tech report from HP, as it contains a fair level of detail
and gives a pretty good feel for the strengths and weaknesses of the
approach. As I say, it really wins when there's lots of unknown
control flow (jumps to addresses in registers) that turns out to be
fairly predictable in practice.
The best chance I've seen at matching this statically was the GRIN
project, which Thomas Johnsson was pursuing a while back at Chalmers.
He used an exhaustive control-flow analysis to reduce each instance of
unknown control flow (e.g. the forcing of a thunk) to a small set of
known destinations (e.g. the possible thunks that actually reach that
program point); a toy sketch of this idea appears after the
references. Combine that with interprocedural register allocation
(courtesy of Urban Boquist) and you get the fastest Haskell code I've
seen. Any comments on this from the Chalmers folks? Alas, I don't
think this ever made it past the prototyping stage, and of course it
required exhaustive whole-program analysis:
T. Johnsson, Analysing Heap Contents in a Graph Reduction Intermediate
Language, in Proc. Glasgow FP Workshop, Ullapool 1990.
How to resolve the control flow...
U. Boquist & T. Johnsson, The GRIN Project: A Highly Optimising Back
End for Lazy Functional Languages.
I found this on the web ages and ages ago. It gives some
results for small programs, and details the compilation
techniques.
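
Finally, to give a flavour of the GRIN trick: here is a toy rendering
in Haskell (my notation, not Boquist's GRIN syntax, and the node and
function names are invented). The heap-contents analysis discovers
that only a few node forms can reach a given eval, so the generic,
indirect eval becomes a direct case dispatch:

    -- Suppose the analysis shows that at some program point a heap
    -- node is either an evaluated Int or one of two suspended calls:
    data Node = CInt Int          -- evaluated constructor
              | Fupto Int Int     -- suspension of: upto n m
              | Fsum  Node        -- suspension of: sumL xs

    upto :: Int -> Int -> Node    -- placeholder bodies; only the
    upto n _ = CInt n             -- shape of the dispatch matters

    sumL :: Node -> Node
    sumL _ = CInt 0

    -- The generic "eval p" (an indirect jump through the node's code
    -- pointer) specialises at this point to a case over the known
    -- alternatives, each of which is a direct, even inlinable, call:
    evalHere :: Node -> Node
    evalHere n = case n of
      CInt i    -> CInt i         -- already a value
      Fupto a b -> upto a b       -- known call
      Fsum xs   -> sumL xs        -- known call
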