Bytecode interpreters can offer unsurpassed code compactness, ...[This
box has] only 2MiB of L2 cache, and presumably something like
16-64KiB of L1 cache. Thrashing the cache is soundly punished.
One problem, as you point out earlier, is that the question is not size of
native code loop vs. size of bytecoded loop, but size of native loop vs.
size of (bytecoded loop + its working path through the bytecode
interpreter).
I haven't used MSVC in ages, but their compiler used to have the option to
compile to a bytecode -- still, I think this was only ever useful for
space, not for speed. (for speed not to suffer, one must arrange the
functionality so that only the outermost loops are being interpreted)
I know of no other reason to implement an interpreter using bytecode.
So I'm surprised it's such a popular thing to do! I think the reason
is probably that code space and compilation time used to be quite
precious resources (not to mention portability), and programmers just
haven't adjusted to the new realities.
Architecture Neutral Debugging Format.
Generating native code is not that difficult -- at least it doesn't take
much to generate code that will beat an interpreter. Debugging native
code, however, can be a real pain (one either has to understand the
debugging facilities of both the processor and the OS[0], or one has to
understand the debugger's favorite symbol format and provide a decent map
to it -- for bass, I was able to provide some simple gdb macros as long as
the only register used was a TOS cache, but that broke as soon as I added
register allocation)
Bytecode allows the debugger to control execution[1], and while bytecode
constructs still have to be mapped back to source, at least the language
developer (having done the mapping in the other direction) has a single,
easier job than when going all the way down to the iron, where a more
difficult job must be repeated for each platform.
-Dave
:: :: ::
[0] it is somewhat instructive to look at DEBUG.COM (an artifact of 1980s-era
PCs still shipped with Redmond OS's to this day) for an example of how
spartan an IDE can be. Usually one wants an assembler to remember labels,
and a debugger to remember breakpoints. DEBUG.COM will assemble mnemonics
into machine code, but you're on your own for labels (leading to a style in
which one tries to minimize entry points, turning code into a collection
of "loops with tails" and requiring utilities similar to BASIC's RENUM)
Similarly, DEBUG.COM will handle all the machine level hassle of setting
breakpoints in code, taking the interrupt, and then restoring the original
code -- but you type in the list of breakpoints at each step. Once upon a
time, people actually developed apps with these bearskins and stone
knives.
[1] bytecode also makes "eval" almost as trivial as in assembly. Too much
dynamism can get one in trouble, though. Consider the "house of cards"
cartoon on p.112 of the Smalltalk green book: "here, let me modify Array
at:"
http://www.iam.unibe.ch/~ducasse/FreeBooks/BitsOfHistory/BitsOfHistory.pdf
In that situation, bytecode can be the difference between hosing a process
and triple-faulting a box.
The OCaml version is 24 instructions, 8 of which have immediate
constants. I don't know very much about PowerPC assembly, but let's
suppose that every instruction is 32 bits, including any immediate
constants; that means the whole function weighs 96 bytes.
Using gas or "objdump -D -b binary" would help if you want more accurate
numbers. If I remember properly, you're correct about instructions (PowerPC
instructions are a fixed 32 bits), but not necessarily immediates: a
constant too large for an instruction's 16-bit immediate field costs an
extra instruction to load.
[the MuP21] executes a stream of 5-bit zero-operand two-stack
operations packed into 20-bit words.
For more along these lines, see http://www.jwdt.com/~paysan/b16.html
But this sort of game is better played in hardware than in software --
hardware is great at parallel things (instruction decoding) and pretty
weak at serial things, while software is the opposite. (this was a
traditional reason for bytecode -- to minimize work for dispatch) FSMs
seem to be where the two meet: to add behavior to hardware (think
microcode) or to add parallel-pattern-matching to software (think
parsers), one tends to wind up synthesizing state machines.