well, my last comment may have been a little harsh on register VMs.
so, on each side there are costs and gains.
for this, I will focus on interpreters, since this is the main area where
the debate is relevant.
stack VM, pros:
typically very simple to target;
instruction stream is usually fairly dense and easy to standardize;
..
cons:
more difficult to eliminate common subexpressions or apply other more
aggressive optimizations;
typically, many of the operations only shuffle values on and off the stack
and do no real work.
register VM, pros:
interpreter can be faster than for a stack machine;
more highly optimized instruction streams can be produced;
a less complicated JIT process may be needed;
..
cons:
some overhead may result if there is a mismatch between the interpreter's
register model and the underlying arch;
for the added complexity in targeting the thing, one is not likely to get
that much "real" advantage in terms of performance;
a register machine will often require a more complex instruction set than a
stack machine;
..
so, for general purpose bytecode designs, I would still opt for a
stack-machine based interpreter.
where raw performance matters most, a register machine may do better.
we can also note that most major stack-based VMs place rigid restrictions on
the behavior of the opcode stream, such as requiring that for each execution
of a given opcode, the local stack frame have exactly the same layout, ...
possible solution:
the canonical bytecode is not directly interpreted, but is always
translated (like in .NET);
the bytecode may then be JITed to native machine code, or trans-compiled to
an internal interpreted form (likely a register machine).
potentially, this translation process could perform optimizations (such as
restructuring the instruction stream and eliminating common subexpressions),
or it could be a simple and direct mapping (say, directly mapping stack
elements to registers).
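as a rough sketch of the "simple and direct mapping" case: since the stack layout at each opcode is fixed, the stack depth is known at translation time, so stack slot k can simply become register (nlocals + k). everything named here (StackOp, RegOp, translate, the S_* and R_* opcodes) is made up for illustration, and branches are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stack opcodes (a tiny JBC-like subset) */
enum { S_ILOAD, S_IADD, S_IRETURN };
typedef struct { int op; int arg; } StackOp;

/* hypothetical 3-address register opcodes */
enum { R_MOV_I, R_ADD_I, R_RET_I };
typedef struct { int op; int dest, left, right; } RegOp;

/* direct mapping: locals keep their numbers, stack slot k becomes
   register (nlocals + k).
   returns the number of register ops emitted, or -1 on malformed input
   (stack underflow, unknown opcode, or output buffer too small). */
int translate(const StackOp *in, size_t n, int nlocals,
              RegOp *out, size_t max_out)
{
    int sp = 0;      /* simulated stack depth, known statically */
    size_t k = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (k >= max_out)
            return -1;
        switch (in[i].op) {
        case S_ILOAD:    /* push local: mov into the next stack register */
            out[k++] = (RegOp){ R_MOV_I, nlocals + sp, in[i].arg, 0 };
            sp++;
            break;
        case S_IADD:     /* pop 2, push 1: result reuses the lower slot */
            if (sp < 2)
                return -1;
            out[k++] = (RegOp){ R_ADD_I, nlocals + sp - 2,
                                nlocals + sp - 2, nlocals + sp - 1 };
            sp--;
            break;
        case S_IRETURN:  /* pop 1 and return it */
            if (sp < 1)
                return -1;
            out[k++] = (RegOp){ R_RET_I, nlocals + sp - 1, 0, 0 };
            sp--;
            break;
        default:
            return -1;
        }
    }
    return (int)k;
}
```

with 3 locals, "iload 1; iload 2; iadd; ireturn" comes out as "mov_i 3, 1; mov_i 4, 2; add_i 3, 3, 4; ret_i 3". the moves are still there, so this buys nothing by itself; an optimizing pass would be what folds them away.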
in my project, since I will likely be interpreting several different
bytecode formats (JBC-canonical, JBC-modified, AVM2, ...), it could make
sense to use an internal regularized interpreter, and I could modify my
existing plans to use a register VM here. I had initially considered using a
modified form of JBC, but a register VM may make more sense.
this could also simplify the task of blurring the line between the
native-compiled and interpreted parts of the project, given there is no real
obstacle to making a register-based interpreter act much the same as the
processor...
hypothetical design:
registers are allocated in frames (like the locals and stack-frames in the
JVM);
opcodes are 32-bit, and typically follow a generic 3-address form;
there will be several groups of registers differentiated by type;
"linking" may be used in place of the highly indirect structure of the JVM;
..
opcodes:
32-bit, native endianness, with the opcode in the low 8 bits, and args in
higher bits.
A op dest, left, right
B op dest, immed16
C op immed24
..
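a possible packing for the three forms, as a sketch only. the opcode in the low 8 bits is from the description above; the exact field positions (dest in bits 8-15, left in 16-23, right in 24-31), treating immediates as signed, and all the function names are my assumptions here.

```c
#include <assert.h>
#include <stdint.h>

/* form A: op dest, left, right (four 8-bit fields) */
static inline uint32_t enc_a(unsigned op, unsigned dest,
                             unsigned left, unsigned right)
{
    assert(op < 256 && dest < 256 && left < 256 && right < 256);
    return (uint32_t)op | ((uint32_t)dest << 8) |
           ((uint32_t)left << 16) | ((uint32_t)right << 24);
}

/* form B: op dest, immed16 (immediate assumed signed) */
static inline uint32_t enc_b(unsigned op, unsigned dest, int32_t imm16)
{
    assert(op < 256 && dest < 256 && imm16 >= -32768 && imm16 <= 32767);
    return (uint32_t)op | ((uint32_t)dest << 8) |
           (((uint32_t)imm16 & 0xFFFFu) << 16);
}

/* form C: op immed24 (immediate assumed signed) */
static inline uint32_t enc_c(unsigned op, int32_t imm24)
{
    assert(op < 256 && imm24 >= -(1 << 23) && imm24 < (1 << 23));
    return (uint32_t)op | (((uint32_t)imm24 & 0xFFFFFFu) << 8);
}

/* field extraction */
static inline unsigned dec_op(uint32_t w)    { return w & 0xFF; }
static inline unsigned dec_dest(uint32_t w)  { return (w >> 8) & 0xFF; }
static inline unsigned dec_left(uint32_t w)  { return (w >> 16) & 0xFF; }
static inline unsigned dec_right(uint32_t w) { return (w >> 24) & 0xFF; }

/* sign-extend via xor/subtract, which avoids implementation-defined casts */
static inline int32_t dec_imm16(uint32_t w)
{
    return (int32_t)((w >> 16) ^ 0x8000u) - 0x8000;
}

static inline int32_t dec_imm24(uint32_t w)
{
    return (int32_t)((w >> 8) ^ 0x800000u) - 0x800000;
}
```

since the instruction is handled as a whole uint32_t, it is in native byte order for free; byte swapping would only come up when loading an image produced on a machine of the other endianness.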
register groups:
I integer (32 bit)
L long (64 bit)
P pointer / reference
F float
D double
X 128-bit (int128, float128, vectors, ...)
Y wider variable-sized registers
one can allocate some number of each, and the groups are non-overlapping.
opcodes will thus be type specialized.
add_i 3, 7, 9 //add I7 and I9 putting result into I3
load_ip 4, 1 //load int from P1 and store in I4
..
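to make the "non-overlapping groups, type-specialized opcodes" idea concrete, a minimal interpreter sketch. the Frame layout, NREGS, the opcode numbers, the ins() helper and the halt opcode are all invented for the example; the X and Y groups are left out, and the field positions are the same assumed packing (dest in bits 8-15, left in 16-23, right in 24-31).

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 16 };  /* per-group register count, arbitrary for the sketch */

/* one array per register group; the groups do not overlap */
typedef struct {
    int32_t  I[NREGS];
    int64_t  L[NREGS];
    void    *P[NREGS];
    float    F[NREGS];
    double   D[NREGS];
} Frame;

/* hypothetical opcode numbers */
enum { OP_ADD_I = 1, OP_ADD_L, OP_ADD_D, OP_LOAD_IP, OP_HALT };

/* pack a form-A instruction: opcode in the low 8 bits, args above it */
static inline uint32_t ins(unsigned op, unsigned d, unsigned l, unsigned r)
{
    return (uint32_t)op | ((uint32_t)d << 8) |
           ((uint32_t)l << 16) | ((uint32_t)r << 24);
}

/* the opcode alone picks the register group, so no type tags are
   checked at run time */
void run(Frame *f, const uint32_t *code)
{
    for (;;) {
        uint32_t w = *code++;
        unsigned d = (w >> 8) & 0xFF;
        unsigned l = (w >> 16) & 0xFF;
        unsigned r = (w >> 24) & 0xFF;

        assert(d < NREGS && l < NREGS && r < NREGS);
        switch (w & 0xFF) {
        case OP_ADD_I:   /* add via unsigned to get wraparound without UB */
            f->I[d] = (int32_t)((uint32_t)f->I[l] + (uint32_t)f->I[r]);
            break;
        case OP_ADD_L:
            f->L[d] = (int64_t)((uint64_t)f->L[l] + (uint64_t)f->L[r]);
            break;
        case OP_ADD_D:
            f->D[d] = f->D[l] + f->D[r];
            break;
        case OP_LOAD_IP: /* load int through pointer register P[l] */
            f->I[d] = *(const int32_t *)f->P[l];
            break;
        case OP_HALT:
            return;
        default:
            assert(!"unknown opcode");
            return;
        }
    }
}
```

here "add_i 3, 7, 9" would be ins(OP_ADD_I, 3, 7, 9) and "load_ip 4, 1" would be ins(OP_LOAD_IP, 4, 1, 0), with the unused field left as zero.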
issue:
how to "best" transfer function arguments?...
the current leading idea is to use the lower registers for receiving args,
and higher registers for composing arguments (this saves having to use a
stack for composing arguments...).
for example:
int foo(int x, int y)
{
    int z;
    z=bar(x, y, x+y);
    return(z);
}
regs:
I: - x y, z, x y (x+y) //I0 unused, I1-I2 incoming args, I3 local, I4-I6 outgoing args
mov_i 4, 1
mov_i 5, 2
add_i 6, 1, 2
call_i 3, "Baz/bar(iii)i"
ret_i 3
from:
iload 1
iload 2
iload 1
iload 2
iadd
invokevirtual Baz/bar(III)I
ireturn
or such...
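one way the call itself might work, sketched under some assumptions that are mine and not settled: the caller's outgoing args are the topmost registers of its frame, they get copied into I1..In of a fresh callee frame, and I0 is left alone (matching the "-" in the layout above). Frame, Func, call_i and the body of bar are all hypothetical, and the callee bodies are written natively just to keep it short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_I = 16 };

typedef struct {
    int32_t I[MAX_I];
    int     n_i;      /* number of I registers this frame allocated */
} Frame;

/* a function body; a real interpreter would run bytecode here instead */
typedef int32_t (*Body)(Frame *f);

typedef struct { Body body; int nargs; int n_i; } Func;

/* copy the caller's top nargs registers into the callee's I1..In,
   run the callee, and hand back its result */
int32_t call_i(Frame *caller, const Func *fn)
{
    Frame callee;

    assert(fn->nargs <= caller->n_i);
    assert(fn->nargs + 1 <= fn->n_i && fn->n_i <= MAX_I);
    memset(&callee, 0, sizeof callee);
    callee.n_i = fn->n_i;
    memcpy(&callee.I[1], &caller->I[caller->n_i - fn->nargs],
           (size_t)fn->nargs * sizeof(int32_t));
    return fn->body(&callee);
}

/* hypothetical bar(a, b, c), chosen so each arg is visible in the result */
static int32_t bar_body(Frame *f)
{
    return f->I[1] * 100 + f->I[2] * 10 + f->I[3];
}

/* foo, hand-translated from the register listing:
   I1=x, I2=y, I3=z, I4..I6=outgoing args */
static int32_t foo_body(Frame *f)
{
    static const Func bar = { bar_body, 3, 4 };

    f->I[4] = f->I[1];              /* mov_i 4, 1    */
    f->I[5] = f->I[2];              /* mov_i 5, 2    */
    f->I[6] = f->I[1] + f->I[2];    /* add_i 6, 1, 2 */
    f->I[3] = call_i(f, &bar);      /* call_i 3, bar */
    return f->I[3];                 /* ret_i 3       */
}
```

the copy could probably be avoided by overlapping the frames (the callee's low registers aliasing the caller's high ones), but that ties frame layout to call order, so I have not assumed it here.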
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc