Bytecode interpreters can offer unsurpassed code compactness, ...[This box has] only 2MiB of L2 cache, and presumably something like
16-64KiB of L1 cache.  Thrashing the cache is soundly punished.

One problem, as you point out earlier, is that the question is not size of native code loop vs. size of bytecoded loop, but size of native loop vs. size of (bytecoded loop + its working path through the bytecode interpreter).

I haven't used MSVC in ages, but their compiler used to have an option to compile to bytecode -- still, I think this was only ever useful for space, not for speed. (For speed, one must arrange the functionality so that only the outermost loops are interpreted.)

I know of no other reason to implement an interpreter using bytecode.

So I'm surprised it's such a popular thing to do!  I think the reason
is probably that code space and compilation time used to be quite
precious resources (not to mention portability), and programmers just
haven't adjusted to the new realities.

Architecture Neutral Debugging Format.

Generating native code is not that difficult -- at least it doesn't take much to generate code that will beat an interpreter. Debugging native code, however, can be a real pain: one either has to understand the debugging facilities of both the processor and the OS[0], or one has to understand the debugger's favorite symbol format and provide a decent map to it. (For bass, I was able to provide some simple gdb macros as long as the only register used was a TOS cache, but that broke immediately upon adding register allocation.)

Bytecode allows the debugger to control execution[1], and while bytecode constructs still have to be mapped back to source, at least the language developer (having already done the mapping in the other direction) has a single, easier job. Going all the way down to the iron means repeating a more difficult job for each platform.

-Dave

:: :: ::

[0] it is somewhat instructive to look at DEBUG.COM (an artifact of 1980s-era PCs, still shipped with Redmond OSes to this day) for an example of how spartan an IDE can be. Usually one wants an assembler to remember labels and a debugger to remember breakpoints. DEBUG.COM will turn mnemonics into machine code, but you're on your own for labels (leading to a style in which one tries to minimize entry points, turning code into a collection of "loops with tails" and requiring utilities similar to BASIC's RENUM). Similarly, DEBUG.COM will handle all the machine-level hassle of setting breakpoints in code, taking the interrupt, and then restoring the original code -- but you retype the list of breakpoints at each step. Once upon a time, people actually developed apps with these bear skins and stone knives.

[1] bytecode also makes "eval" almost as trivial as in assembly. Too much dynamism can get one in trouble, though. Consider the "house of cards" cartoon on p.112 of the Smalltalk green book: "here, let me modify Array At:"
http://www.iam.unibe.ch/~ducasse/FreeBooks/BitsOfHistory/BitsOfHistory.pdf
In that situation, bytecode can be the difference between hosing a process and triple-faulting a box.

The OCaml version is 24 instructions, 8 of which have immediate
constants.  I don't know very much about PowerPC assembly, but let's
suppose that every instruction is 32 bits, including any immediate
constants; that means the whole function weighs 96 bytes.

Using gas or "objdump -D -b binary" would help if you want more accurate numbers. If I remember properly, you're correct that every instruction is 32 bits, but not necessarily about immediates: they have to fit inside the fixed-width instruction word, so a full 32-bit constant takes two instructions (lis then ori) rather than one.

[the MuP21] executes a stream of 5-bit zero-operand two-stack
operations packed into 20-bit words.

For more along these lines, see http://www.jwdt.com/~paysan/b16.html

But this sort of game is better played in hardware than in software -- hardware is great at parallel things (instruction decoding) and pretty weak at serial things, while software is the opposite. (This was a traditional reason for bytecode: to minimize the work of dispatch.) FSMs seem to be where the two meet: to add behavior to hardware (think microcode) or to add parallel pattern matching to software (think parsers), one tends to wind up synthesizing state machines.
