On 7/26/2011 9:05 AM, David Barbour wrote:
On Tue, Jul 26, 2011 at 1:50 AM, BGB <cr88...@gmail.com> wrote:
whether or not compiling to bytecode is actually an effective
security measure, it is the commonly expected security measure.
Is it? I've not heard anyone speak that way in many, many years. I
think people are getting used to JavaScript.
for web-apps, maybe, but it will likely be a long time before it is
adopted by commercial application software (where the source code is
commonly regarded as a trade secret).
a "compiler" may be expected (as part of "the process") even if it
could more accurately be called an archiver or similar (people may
not let go of the established process easily).
We can benefit from 'compilers' even if we distribute source. For
example, JavaScript->JavaScript compilers can optimize code, eliminate
dead code, provide static warnings, and so on. We should also be able
to compile other languages into the distribution language. I don't
mind having a compiler be part of 'the process'. The issue regards the
distribution language, not how you reach it.
yes, but why do we need an HLL distribution language, rather than, say,
a low-level distribution language, such as bytecode or a VM-level
ASM-like format, or something resembling Forth or PostScript?...
That said, it would often be preferable to distribute source, and use
a library or module to parse and compile it. This would allow us to
change our implementation without redistributing our intention. A
language with good support for 'staging' would be nice.
potentially.
granted, yes, there are different ways to approach JIT (whether or
not to inline things, blocks vs traces, ...).
Hotspot, too. It is possible to mix interpretation with compilation.
yeah. my present strategy assumes mixed compilation and
interpretation.
back with my C compiler, I tried to migrate to a pure-compilation
strategy (there was no interpreter, only the JIT). this ultimately
created far more problems than it solved.
the alternative was its direct ancestor, a prior version of my BGBScript
VM, which at the time used a combined interpreter+JIT strategy
(sadly, in later versions the JIT broke, as the VM has been too
much in flux and I haven't kept it working).
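the mixed strategy could be sketched as counter-based tier-up: every
function starts in the interpreter and is promoted to a compiled entry
point once it gets hot. (a toy illustration, not code from the BGBScript
VM; all names like vm_func and JIT_THRESHOLD are invented, and the "JIT"
is stubbed out with a plain C function standing in for generated code.)

```c
#include <stddef.h>

#define JIT_THRESHOLD 10

typedef int (*native_fn)(int);

typedef struct vm_func {
    int       call_count;  /* hotness counter */
    native_fn compiled;    /* NULL until tier-up */
    /* bytecode would live here in a real VM */
} vm_func;

/* stand-in for the interpreter loop: computes x*2, slowly */
static int interpret(vm_func *f, int x) {
    (void)f;
    int r = 0;
    for (int i = 0; i < x; i++) r += 2;
    return r;
}

/* stand-in for JIT output: same semantics, "fast" */
static int jitted_double(int x) { return x * 2; }

/* stand-in for the JIT compiler itself */
static native_fn jit_compile(vm_func *f) {
    (void)f;
    return jitted_double;
}

int vm_call(vm_func *f, int x) {
    if (f->compiled)
        return f->compiled(x);         /* fast path: compiled code */
    if (++f->call_count >= JIT_THRESHOLD)
        f->compiled = jit_compile(f);  /* tier up for future calls */
    return interpret(f, x);            /* slow path: interpreter */
}
```

the nice property is that the two tiers share one calling convention
(vm_call), so the rest of the VM never needs to care which tier a
function is currently in.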
also, depending on the language, it may matter:
Agreed. We certainly should *design* the distribution language with an
eye on distribution, not just pick an arbitrary language.
yeah. such a language should be capable of expressing a wide range of
languages and semantics.
a basic model which has been working acceptably in my case can be
described roughly as: sort of like PostScript, but with labels and
conditional jumps.
pretty much the entire program representation can be in terms of blocks
and a stack machine.
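a minimal sketch of that kind of stack machine (opcodes and encoding
invented for illustration, not taken from any actual VM): operands live
on a stack, "labels" are just instruction indices, and JNZ provides a
conditional jump.

```c
enum { OP_PUSH, OP_ADD, OP_DUP, OP_SWAP, OP_JNZ, OP_HALT };

typedef struct { int op; int arg; } insn;

int run(const insn *code) {
    int stack[64], sp = 0, pc = 0;
    for (;;) {
        insn i = code[pc++];
        switch (i.op) {
        case OP_PUSH: stack[sp++] = i.arg; break;
        case OP_ADD:  sp--; stack[sp-1] += stack[sp]; break;
        case OP_DUP:  stack[sp] = stack[sp-1]; sp++; break;
        case OP_SWAP: { int t = stack[sp-1];
                        stack[sp-1] = stack[sp-2];
                        stack[sp-2] = t; } break;
        case OP_JNZ:  sp--; if (stack[sp]) pc = i.arg; break;
        case OP_HALT: return stack[sp-1];
        }
    }
}

/* example: loop n=5 times, adding 2 each pass (computes 2*n);
 * instruction indices 2 and 6 serve as the "labels" */
const insn prog[] = {
    {OP_PUSH, 0}, {OP_PUSH, 5},               /* acc=0, n=5       */
    /* 2: */ {OP_DUP, 0}, {OP_JNZ, 6},        /* while (n != 0)   */
    {OP_SWAP, 0}, {OP_HALT, 0},               /* done: return acc */
    /* 6: */ {OP_PUSH, -1}, {OP_ADD, 0},      /* n -= 1           */
    {OP_SWAP, 0}, {OP_PUSH, 2}, {OP_ADD, 0},  /* acc += 2         */
    {OP_SWAP, 0}, {OP_PUSH, 1}, {OP_JNZ, 2},  /* goto 2           */
};
```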
but, I guess the question that can be made is whether or not the
bytecode is intended to be a stable distribution format (of the
same sort as JBC or CIL), or intended more as a transient format
which may depend somewhat on the currently running VM (and may
change from one version to the next).
We should not tie our users to a particular distribution of the VM. If
you distribute bytecode, or any language, it really should be stable,
so that other people can compete with the implementation.
what I meant may have been misinterpreted.
it could be restated as: should the bytecode even be used for
program distribution?
if not, then it can be used more internally to the VM and the languages
running on it, such as for implementing lightweight eval mechanisms
for other languages, ...
hence "currently running VM", basically meaning in this sense "which VM
are we running on right now?". if done well, a program, such as a
language compiler, can target the underlying VM without getting too
tied into how the VM's IL works, allowing both some level of portability
for the program and reasonably high performance, as well as flexibility
for the VM to change its IL around as needed (or potentially bypass the
IL and emit native code directly).
most likely though, the above would largely boil down to emitting code
via an API.
granted, yes, there are good and bad points to API-driven code generation.
an analogy would be something "sort of like OpenGL, but for compilers".
(side note:
actually, at the moment the thought of an OpenGL-like codegen interface
seems interesting, though I am thinking of it more as a means of
emitting native code. sadly, most of my prior attempts at separating
the codegen from the higher-level IL mechanics, ... have not gone well;
ultimately some "structure" may be necessary.)
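to make the OpenGL analogy concrete, here is a toy sketch of what such
an API-driven codegen might look like: the client makes begin/emit/finish
calls against an opaque context and never sees the IL, so the backend is
free to change its internal encoding (or go straight to native code).
all names here (cg_ctx, cg_push, ...) are invented for illustration.

```c
typedef struct cg_ctx {
    int ops[128];   /* internal encoding; clients never see this */
    int args[128];
    int n;
} cg_ctx;

enum { CG_PUSH, CG_ADD, CG_MUL };

void cg_begin(cg_ctx *c)        { c->n = 0; }
void cg_push (cg_ctx *c, int v) { c->ops[c->n] = CG_PUSH; c->args[c->n++] = v; }
void cg_add  (cg_ctx *c)        { c->ops[c->n++] = CG_ADD; }
void cg_mul  (cg_ctx *c)        { c->ops[c->n++] = CG_MUL; }

/* cg_finish runs the recorded program here; a real backend might
 * instead JIT the recorded operations to native code at this point,
 * without any client-visible change to the API */
int cg_finish(cg_ctx *c) {
    int stack[64], sp = 0;
    for (int i = 0; i < c->n; i++) {
        switch (c->ops[i]) {
        case CG_PUSH: stack[sp++] = c->args[i]; break;
        case CG_ADD:  sp--; stack[sp-1] += stack[sp]; break;
        case CG_MUL:  sp--; stack[sp-1] *= stack[sp]; break;
        }
    }
    return stack[sp-1];
}
```

a client computing (2+3)*4 would just call cg_begin, then cg_push(2),
cg_push(3), cg_add, cg_push(4), cg_mul, then cg_finish; the IL encoding
never crosses the API boundary.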
as for the alternative case, note ARM:
ARM and Thumb machine code are often used as distribution formats.
however, ARM is also fairly model-specific, so code intended for one
processor model may not work on another, and code for an earlier
processor may not work on a later one, ...
yet, in general, it has been doing fairly well market-wise.
(I am left with somewhat mixed feelings about ARM in general; although
it does a few things well, I would personally still rather live in a
world based on x86...).
or such...
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc