(Having missed last night's dinner and all the fun there...)

When Rémi started talking about this interpreter today, I thought it'd be
tailcall related.

In college, my introductory course on computer systems was based on the
book "Introduction to Computing Systems: From Bits & Gates to C & Beyond",
which introduces a small educational 16-bit architecture called the LC-2.
We were given an LC-2 simulator to run our "assembly" programs.
Later on, when I had learned enough Java to be dangerous, I wanted to write
a more powerful simulator (interpreter) of the LC-2 ISA in Java.

My first version was the most straightforward switch-threaded one. Not
much to be said about that.
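(For the record, a switch-threaded loop over a toy two-opcode ISA looks
something like this. This is a minimal, hypothetical sketch, not the actual
LC-2 simulator; the opcodes and names are made up for illustration.)

```java
public class SwitchInterp {
    static final int HALT = 0, INC = 1, DEC = 2;

    // Runs a tiny program; returns the final accumulator value.
    // One big switch per instruction: "switch threading".
    static int run(int[] code) {
        int acc = 0;
        int pc = 0;
        loop:
        while (true) {
            switch (code[pc++]) {
                case INC:  acc++; break;
                case DEC:  acc--; break;
                case HALT: break loop;
                default:   throw new IllegalStateException("bad opcode");
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] { INC, INC, DEC, INC, HALT }));
    }
}
```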

The next version, wanting to mimic an indirect-threaded style, was
structured in very much the same way Rémi presented in the link, but
MethodHandle was still years from available so I didn't have the luxury of
using it. Instead, I modeled the opcode handlers as "single-method
objects", and did a virtual call at the end of each handler.
(So, instead of Rémi's MH calls, I'm just using a table lookup + virtual
call.)

public class NopHandler extends AbstractHandler { // or implements Handler
  public void execute(State s) {
    HANDLERS[s.code[s.pc++]].execute(s); // dispatch to the next handler
  }
}

That didn't work out very well. The performance was okay-ish with small
program inputs. But HotSpot's JIT compilers don't implement tail call
optimization (well, recursive inlining sort of counts...), so with a
long-enough program the stack blows up. And it didn't take that long an
LC-2 assembly program to make it blow up.
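(To illustrate the stack growth, here is a stripped-down reconstruction of
the pattern, with invented names and a toy two-opcode ISA rather than the
original LC-2 code. Every executed instruction adds a frame, because each
handler's tail "call" is an ordinary virtual call the JVM will not
eliminate.)

```java
public class TailDispatch {
    interface Handler { void execute(State s); }

    static class State {
        int[] code; int pc; int acc;
        State(int[] code) { this.code = code; }
    }

    static final Handler[] HANDLERS = {
        s -> { /* 0 = HALT: do nothing, the chain ends here */ },
        s -> { s.acc++; dispatch(s); },  // 1 = INC, then "tail-call" the next handler
    };

    // The stack grows by one dispatch+execute pair per instruction,
    // since the JVM performs no tail call optimization.
    static void dispatch(State s) {
        HANDLERS[s.code[s.pc++]].execute(s);
    }

    public static void main(String[] args) {
        State s = new State(new int[] { 1, 1, 1, 0 });
        dispatch(s);
        System.out.println(s.acc);
    }
}
```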

I realized that I had to have proper tail calls to make it work. So I added
a central dispatch loop to serve as a trampoline, e.g.

Handler next = getNextHandler();
while (next != null) {
  next = next.execute(s);
}

where my Handler.execute() would still perform the lookup, but not the
actual invocation. As expected, the performance was nowhere near as good as
directly making the virtual call at the tail position in each handler. This
central dispatch loop makes the unpredictability of the Handler.execute()
call worse, by funneling every dispatch through a single, megamorphic call
site.
Need a good trampoline. Need proper tail calls.

- Kris

On Wed, Aug 3, 2016 at 3:14 PM, John Rose <john.r.r...@oracle.com> wrote:

> On Aug 3, 2016, at 1:22 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
> >
> > Charles ask if we can make a fast register interpreter in Java,
> > here is my take on a threaded-like interpreter
> >
> > https://gist.github.com/forax/f38b533e089217cfc4d0ae3c6e2de9c9
> Nicely done.  We were talking last night at dinner about making
> bytecode interpreters go fast, and this is helpful.
> I think there may be a combinator here, for bytecode dispatch loops.
> (This feels like the PIC combinator, both fundamental and tricky to get
> right.)
> A "bytecode" dispatch combinator might provide a pattern like the
> following:
> MethodHandle init, pred, step, fini, index;
> @NonNull MethodHandle[] tab;
> R dispatch(A… a) {
>    V v = init(a…);  // one here; there might be many v…
>    while (pred(a…)) {
>       v = step(v, a…);
>       // execute one "instruction":
>       int i = index(v, a…);
>       tab[i].invokeExact(v, a…);
>    }
>    return fini(v, a…);
> }
> The explicit table would be recopied internally to an @Stable array for
> constant folding.
> The various index values could be tracked and turned into traces (or some
> other profile
> driven structure) which would be compiled together, in specialized
> segments of the unrolled
> interpreter loop.
> Or maybe just the table lookup part alone makes a good combinator, to be
> paired
> with the loop combinators?
> Comments welcome…
> — John
> P.S. Another new RFE:
> experimental hook for creating heisenboxes
> https://bugs.openjdk.java.net/browse/JDK-8163133
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev