well, my last comment may have been a little harsh on register VMs.

so, on each side there are costs and gains.
for this, I will focus on interpreters, since this is the main area where the debate is relevant.


stack VM, pros:
typically very simple to target;
instruction stream is usually fairly dense and easy to standardize;
..

cons:
more difficult to eliminate common subexpressions or apply other more aggressive optimizations;
typically, many of the executed operations (loads, stores, and stack shuffling) are unneeded.
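to make the costs a bit more concrete, here is a minimal sketch of what a stack interpreter's inner loop tends to look like (the opcode names and byte-oriented encoding here are made up for the example):

#include <stdint.h>

/* hypothetical opcode numbers, just for the sketch */
enum { OP_PUSHI, OP_LOAD, OP_STORE, OP_ADD, OP_RET };

int32_t stack_run(uint8_t *ip, int32_t *locals)
{
    int32_t stack[64];
    int sp = 0;

    for(;;)
    {
        switch(*ip++)
        {
        case OP_PUSHI: stack[sp++] = (int8_t)(*ip++);  break;  /* push small immediate */
        case OP_LOAD:  stack[sp++] = locals[*ip++];    break;  /* push a local */
        case OP_STORE: locals[*ip++] = stack[--sp];    break;  /* pop into a local */
        case OP_ADD:   sp--; stack[sp-1] += stack[sp]; break;  /* pop two, push sum */
        case OP_RET:   return stack[--sp];
        }
    }
}

even a trivial z=x+y is four dispatches (two loads, an add, and a store) plus the stack traffic itself, which is most of what the "unneeded operations" point above is about.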


register VM, pros:
the interpreter can be faster than for a stack machine;
more highly optimized instruction streams can be produced;
a less complicated JIT process may suffice, since the code is already close to a 3-address form;
..

cons:
some overhead may be incurred if there is a mismatch between the interpreter and the underlying arch;
for the added complexity of targeting the thing, one is not likely to get that much "real" advantage in terms of performance;
a register machine will often require a more complex instruction set than a stack machine;
..
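for comparison, a minimal three-address inner loop, assuming 32-bit words with the opcode in the low 8 bits and three 8-bit operands (roughly the layout sketched further down; opcode names again made up):

#include <stdint.h>

/* hypothetical opcode numbers */
enum { ROP_MOV, ROP_ADD, ROP_RET };

int32_t reg_run(uint32_t *ip, int32_t *r)    /* r = this frame's integer registers */
{
    for(;;)
    {
        uint32_t w = *ip++;
        uint8_t op = w & 0xFF;
        uint8_t d = (w >> 8) & 0xFF, a = (w >> 16) & 0xFF, b = (w >> 24) & 0xFF;

        switch(op)
        {
        case ROP_MOV: r[d] = r[a];        break;
        case ROP_ADD: r[d] = r[a] + r[b]; break;   /* one dispatch = load+load+add+store */
        case ROP_RET: return r[d];
        }
    }
}

one dispatch now covers what took a load/load/add/store sequence above, though the "registers" are still plain memory on the host, which is part of the interpreter/architecture mismatch noted in the cons.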


so, for general purpose bytecode designs, I would still opt for a stack-machine based interpreter.
if raw performance is the goal, though, a register machine may have the edge.

we can also note that most major stack-based VMs place rigid restrictions on the behavior of the opcode stream, such as requiring that every execution of a given opcode see the operand stack with exactly the same depth and types, ...


possible solution:
the canonical bytecode is not directly interpreted, but is always translated (as in .NET); the bytecode may then be JITed to native machine code, or trans-compiled to an internal interpreted form (likely a register machine).

potentially, this translation process could perform optimizations (such as restructuring the instruction stream and eliminating common subexpressions), or it could be a simple and direct mapping (say, directly mapping stack elements to registers).
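as a sketch of the "simple and direct mapping" option: the translator can run a virtual stack of register numbers over the stack code, so each stack depth gets a fixed register above the locals. the opcode names here are hypothetical, and this ignores branches (which is where the stack-layout restrictions mentioned earlier actually help, since the depths agree at merge points):

#include <stdint.h>
#include <stdio.h>

/* hypothetical source (stack) and target (register) opcodes */
enum { S_ILOAD, S_IADD, S_IRET };
enum { R_MOVI, R_ADDI, R_RETI };

/* stand-in for appending one 32-bit instruction word to the output */
static void emit(int op, int dest, int left, int right)
{
    printf("op%d %d, %d, %d\n", op, dest, left, right);
}

/* locals map to registers 0..nlocals-1; stack slots map to the registers above them */
void translate(uint8_t *in, int len, int nlocals)
{
    int vstack[64], sp = 0;           /* virtual stack holding register numbers */
    int next = nlocals;               /* first register used for stack slots */

    for(int i = 0; i < len; )
    {
        switch(in[i++])
        {
        case S_ILOAD: {               /* push local: becomes a register-register move */
            int d = next + sp;
            emit(R_MOVI, d, in[i++], 0);
            vstack[sp++] = d;
            break; }
        case S_IADD: {                /* pop two, push one: becomes a 3-address add */
            int b = vstack[--sp];     /* right operand (top of stack) */
            int a = vstack[--sp];     /* left operand */
            int d = next + sp;        /* result reuses the left operand's slot */
            emit(R_ADDI, d, a, b);
            vstack[sp++] = d;
            break; }
        case S_IRET:
            emit(R_RETI, vstack[--sp], 0, 0);
            return;
        }
    }
}

a slightly smarter version could notice that a load feeding directly into an add just names a local, and feed the local's register straight in, killing most of the moves.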

given that in my project I will likely be interpreting several different bytecode formats (JBC-canonical, JBC-modified, AVM2, ...), it could make sense to translate them all to a single regularized internal form, and I could modify my existing plans to use a register VM here. I had initially considered using a modified form of JBC, but a register VM may make more sense.

this could also simplify the task of blurring the line between the native-compiled and interpreted parts of the project, given there is no real obstacle to making a register-based interpreter act much the same as the processor...


hypothetical design:
registers are allocated in frames (like the locals and stack-frames in the JVM);
opcodes are 32 bit, and typically follow a generic 3-address form;
there will be several groups of registers differentiated by type;
"linking" may be used in place of the highly indirect structure of the JVM;
..


opcodes:
32-bit, native endianness, with the opcode in the low 8 bits, and args in the higher bits.
A    op dest, left, right
B    op dest, immed16
C    op immed24
..
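one plausible C reading of these three forms, assuming 8-bit register numbers for form A and signed immediates (none of this is pinned down above, so treat it as an assumption):

#include <stdint.h>

/* form A: op dest, left, right  (8 bits each) */
static inline uint32_t enc_a(uint8_t op, uint8_t d, uint8_t l, uint8_t r)
    { return op | (d << 8) | (l << 16) | ((uint32_t)r << 24); }

/* form B: op dest, immed16 */
static inline uint32_t enc_b(uint8_t op, uint8_t d, uint16_t imm)
    { return op | (d << 8) | ((uint32_t)imm << 16); }

/* form C: op immed24 (caller passes only the low 24 bits) */
static inline uint32_t enc_c(uint8_t op, uint32_t imm24)
    { return op | (imm24 << 8); }

/* decoding: the opcode always lives in the low 8 bits */
#define OPC(w)   ((w) & 0xFF)
#define DST(w)   (((w) >> 8) & 0xFF)
#define LEFT(w)  (((w) >> 16) & 0xFF)
#define RIGHT(w) (((w) >> 24) & 0xFF)
#define IMM16(w) ((int16_t)((w) >> 16))
#define IMM24(w) ((int32_t)(w) >> 8)       /* sign-extended 24-bit immediate */

since the words are read and written as native uint32_t, the "native endianness" part falls out for free.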

register groups:
I    integer (32 bit)
L    long (64 bit)
P    pointer / reference
F    float
D    double
X    128-bit (int128, float128, vectors, ...)
Y    wider variable-sized registers

one can allocate some number of each, and the groups are non-overlapping.
opcodes will thus be type specialized.

add_i 3, 7, 9    //add I7 and I9 putting result into I3
load_ip 4, 1    //load int from P1 and store in I4
..
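one way the frames and register groups might look on the interpreter side, with a couple of type-specialized handlers matching the examples above (the struct layout, the separate arrays per group, and the 8-bit operand decoding are all placeholder assumptions):

#include <stdint.h>

typedef struct Frame Frame;
struct Frame
{
    int32_t *i;     /* I: integer (32-bit) */
    int64_t *l;     /* L: long (64-bit) */
    void   **p;     /* P: pointer / reference */
    float   *f;     /* F: float */
    double  *d;     /* D: double */
    /* X/Y (128-bit and wider) left out of the sketch */
    Frame   *prev;  /* caller's frame */
};

/* add_i dest, left, right:  I[dest] = I[left] + I[right] */
static void op_add_i(Frame *fr, uint32_t w)
{
    uint8_t d = (w >> 8) & 0xFF, a = (w >> 16) & 0xFF, b = (w >> 24) & 0xFF;
    fr->i[d] = fr->i[a] + fr->i[b];
}

/* load_ip dest, src:  I[dest] = *(int32_t *)P[src] */
static void op_load_ip(Frame *fr, uint32_t w)
{
    uint8_t d = (w >> 8) & 0xFF, a = (w >> 16) & 0xFF;
    fr->i[d] = *(int32_t *)fr->p[a];
}

keeping the groups non-overlapping means each handler touches exactly one (or two) typed arrays, so there is no tagging or type-checking at runtime.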


issue:
how to "best" transfer function arguments?...

the current leading idea is to use the lower registers for receiving args, and the higher registers for composing outgoing arguments (this saves having to use a stack for composing arguments...).

for example:
int foo(int x, int y)
{
   int z;
   z=bar(x, y, x+y);
   return(z);
}

regs:
I: -, x, y, z, x, y, (x+y)    //I0 unused here; I1..I2 incoming args; I3=z; I4..I6 compose the outgoing args

mov_i 4, 1                  //I4=x (compose arg 1)
mov_i 5, 2                  //I5=y (compose arg 2)
add_i 6, 1, 2               //I6=x+y (compose arg 3)
call_i 3, "Baz/bar(iii)i"   //I3=z=bar(I4, I5, I6)
ret_i 3                     //return z

from:
iload 1
iload 2
iload 1
iload 2
iadd
invokestatic Baz/bar(III)I
ireturn


or such...
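and one possible mechanical reading of the argument-transfer idea (this is an assumption about how call_i might behave, not something pinned down above): the caller composes arguments in its higher registers, and the call handler copies them into the low registers of a fresh frame for the callee, with the argument base and count coming from the linked signature:

#include <stdint.h>
#include <string.h>

typedef struct { int32_t i[256]; } IFrame;   /* only the integer bank shown */

/* 'abase' and 'nargs' would come from the resolved signature (e.g. "(iii)i" -> 3),
   and 'callee' stands in for whatever the linked method resolves to. */
static void do_call_i(IFrame *caller, int dest, int abase, int nargs,
                      int32_t (*callee)(IFrame *))
{
    IFrame fr = {0};

    /* following the example above: incoming args land at I1..In, I0 is left alone */
    memcpy(&fr.i[1], &caller->i[abase], nargs * sizeof(int32_t));
    caller->i[dest] = callee(&fr);
}

a further refinement would be to overlap the callee's low registers with the caller's composing registers (register-window style), which would kill the copy entirely, at the cost of needing frame sizes known up front.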


