On 7/26/2011 12:29 AM, David Barbour wrote:


On Mon, Jul 25, 2011 at 11:16 PM, BGB <cr88...@gmail.com> wrote:

    well, there are pros and cons.

    pros:
    more compact;
    better at hiding one's source code (decompilers are necessary);
    can be executed directly if using an interpreter (no parser/...
    needed);
    ...


Counters:
* We can minify source, or zip it. There are tools that do this for JavaScript.
* Hiding code is usually a bad thing. A pretense of security is always a bad thing. But, if someone were to insist on an equal capability, I'd point them at an 'obfuscator' (such as http://javascriptobfuscator.com/default.aspx). A tool dedicated to befuddling your users will do a much better job in this role than a simple bytecode compiler.
* We rarely execute bytecode directly; there is a lot of non-trivial setup for linking and making sure we call the right codes.


typically, deflate+bytecode works a little better than deflate+source.
either way, yes, code usually compresses down fairly well.
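
One way to check the deflate+bytecode versus deflate+source comparison for a particular VM is simply to measure it. The sketch below does this for CPython only, using `marshal` as a stand-in for a bytecode container; the sample program and the `compressed_sizes` helper are made up for illustration, and the numbers for CPython (whose code objects also carry names and line tables) need not match what a purpose-built bytecode format would give.

```python
import marshal
import zlib

# Hypothetical sample program; any source text would do.
SOURCE = """
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
"""

def compressed_sizes(source: str) -> dict:
    """Return raw and deflate-compressed sizes of source text and its compiled form."""
    raw_source = source.encode("utf-8")
    code = compile(source, "<sample>", "exec")
    # marshal output is CPython's serialized code object; it is version-specific.
    raw_bytecode = marshal.dumps(code)
    return {
        "source": len(raw_source),
        "source_deflated": len(zlib.compress(raw_source, 9)),
        "bytecode": len(raw_bytecode),
        "bytecode_deflated": len(zlib.compress(raw_bytecode, 9)),
    }

sizes = compressed_sizes(SOURCE)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Running this over a real code base, rather than a six-line sample, would give a fairer picture, since fixed per-file overhead dominates for very small inputs.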

however, whether or not compiling to bytecode is itself an actually effective security measure, it is the commonly expected security measure.

also, many people just expect to distribute their programs as precompiled binaries (rather than, say, as a glorified ZIP of source-files or similar).

a "compiler" may be expected (as part of "the process") even if it could be technically more correctly called an archiver or similar (people may not let go of the established process easily).


Besides, the real performance benefits come from compiling the code - even bytecode is typically JIT'd. Higher-level source can allow more effective optimizations, especially across library boundaries. We'll want to cache and reuse the compiled code, in a format suitable for immediate execution. JavaScript did a poor job here due to its lack of a module system (to prevent name shadowing and such), but they're fixing that for ES.next.

it depends on if/when the JIT is done.

for example, if a given bytecode block needs to be executed, say, 1000 or 10000 times before the JIT is triggered on it (otherwise, an interpreter is used), then generally JIT cost is less of an issue, as there may be much code (in libraries, ...) which never gets run through the JIT (or potentially is never even executed).

granted, yes, there are different ways to approach JIT (whether or not to inline things, blocks vs traces, ...).
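
The threshold idea above can be sketched roughly as follows. This is a toy, not how any particular VM does it: the stack bytecode, the `JIT_THRESHOLD` value, and the `Block` class are all hypothetical, the counter here counts entries into a block rather than individual instructions, and "compiling" just means building a Python lambda once instead of emitting machine code.

```python
from typing import Callable, List, Optional, Tuple

Instr = Tuple[str, int]  # (opcode, operand) in a made-up stack bytecode

JIT_THRESHOLD = 1000  # hypothetical number of runs before compiling


def interpret(code: List[Instr], arg: int) -> int:
    """Slow path: decode and dispatch every instruction on every call."""
    stack = [arg]
    for op, operand in code:
        if op == "push":
            stack.append(operand)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def compile_block(code: List[Instr]) -> Callable[[int], int]:
    """Stand-in for a JIT: translate the block once into a Python function."""
    stack = ["x"]  # symbolic stack of expression strings
    for op, operand in code:
        if op == "push":
            stack.append(str(operand))
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {'+' if op == 'add' else '*'} {b})")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return eval(f"lambda x: {stack.pop()}")


class Block:
    """A bytecode block that is interpreted until it becomes hot."""

    def __init__(self, code: List[Instr]) -> None:
        self.code = code
        self.count = 0
        self.compiled: Optional[Callable[[int], int]] = None

    def run(self, arg: int) -> int:
        if self.compiled is not None:
            return self.compiled(arg)
        self.count += 1
        if self.count >= JIT_THRESHOLD:
            self.compiled = compile_block(self.code)
        return interpret(self.code, arg)


block = Block([("push", 2), ("mul", 0), ("push", 1), ("add", 0)])  # x * 2 + 1
```

A block that is run only a handful of times never pays the compile cost at all, which is the point being made about rarely used library code.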



    the main merit of a bytecode format is that it could shorten the
    path in getting to native code, potentially allowing it to be faster.


Well, it is true that one might save a few cycles for a straightforward conversion.


well, it may matter, depending somewhat on the amount of work done by the JIT.

for example, a fixed-form single-pass or 2-pass translator (generally using direct procedural logic to emit machine code) may be comparatively faster than one which uses more elaborate transformations (such as internally converting to SSA form or using a multi-stage conversion ...).
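
As a rough illustration of what a fixed-form single-pass translator looks like, the sketch below walks a made-up stack bytecode once and emits a fixed template of x86-64-style instructions per opcode. The bytecode, the `translate` function, and the choice of registers are assumptions for the example; the output is text, not assembled machine code, and no register allocation or optimization is attempted.

```python
from typing import List, Tuple

Instr = Tuple[str, int]  # (opcode, operand) in a made-up stack bytecode


def translate(code: List[Instr]) -> List[str]:
    """Single pass: each opcode expands to a fixed instruction template."""
    out: List[str] = []
    for op, operand in code:
        if op == "push":
            out.append(f"push {operand}")
        elif op in ("add", "mul"):
            out.append("pop rbx")   # right operand
            out.append("pop rax")   # left operand
            out.append("add rax, rbx" if op == "add" else "imul rax, rbx")
            out.append("push rax")  # result goes back on the stack
        else:
            raise ValueError(f"unknown opcode: {op}")
    return out


asm = translate([("push", 2), ("push", 3), ("add", 0)])
print("\n".join(asm))
```

The translation cost is linear in the number of instructions with a small constant, at the price of fairly naive output (every intermediate value goes through the stack). An SSA-based pipeline would spend more time up front to avoid that.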


also, depending on language it may matter:
for example, in my C compiler, the majority of the running time was actually used up by the preprocessor and parser (mostly the fault of headers), and so a bytecode-based format would make a good deal more sense.

for ECMAScript-family languages (my own BGBScript language could be included here) it is less of an issue, since there are no headers and the syntax is relatively straightforward to parse quickly (even without micro-optimizing the parser).


The use of a private IL by a compiler isn't the same. You aren't forced to stabilize a private IL the way you need to stabilize the JVM ops.


yes, fair enough...

but, I guess the question that can be asked is whether or not the bytecode is intended to be a stable distribution format (of the same sort as JBC or CIL), or intended more as a transient format which may depend somewhat on the currently running VM (and may change from one version to the next).

there may be room for a VM to partly expose an unstable IL, potentially with APIs and conventions in place to reduce the risk of code breaking due to internal changes to the IL.
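
One possible convention for a transient format, sketched under assumptions: tag each cached blob with the IL version that produced it, and fall back to recompiling from source whenever the tag does not match the running VM. The `MAGIC` tag, the `VM_IL_VERSION` number, and the function names below are hypothetical and not taken from any existing VM.

```python
import struct
from typing import Callable, Optional

MAGIC = b"BCIL"     # hypothetical file tag
VM_IL_VERSION = 7   # hypothetical; bumped whenever the internal IL changes


def pack_cached(il: bytes) -> bytes:
    """Prefix the IL payload with the tag and the current IL version."""
    return MAGIC + struct.pack("<I", VM_IL_VERSION) + il


def unpack_cached(blob: bytes) -> Optional[bytes]:
    """Return the IL payload, or None if the blob is foreign or stale."""
    if len(blob) < 8 or blob[:4] != MAGIC:
        return None
    (version,) = struct.unpack("<I", blob[4:8])
    if version != VM_IL_VERSION:
        return None
    return blob[8:]


def load(source: str, cached: Optional[bytes],
         compile_fn: Callable[[str], bytes]) -> bytes:
    """Use the cached IL when it matches this VM, else recompile from source."""
    il = unpack_cached(cached) if cached is not None else None
    return il if il is not None else compile_fn(source)
```

With this scheme the source remains the stable distribution format, and the IL is only a cache that can be thrown away when the VM changes.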


_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
