On 7/26/2011 8:34 PM, David Barbour wrote:
On Tue, Jul 26, 2011 at 3:28 PM, BGB <cr88...@gmail.com> wrote:
why do we need an HLL distribution language, rather than, say, a
low-level distribution language, such as bytecode or a VM-level
ASM-like format, or something resembling Forth or PostScript?...
Because:
(1) Code will often adapt to relatively 'static' conditions indicated
by the host (such as user-agent, screen size, or access to a 'tilt'
sensor). In these cases, higher-level code is often much easier to
specialize and garbage-collect than an opaque block of bytecode.
one can support ifdef blocks in the IL, no real problem there.
my own language does something like this with a form like:
$[ifdef(FOO)] {
...
}
but a language could be designed to allow it with a more traditional
syntax, say:
#ifdef FOO
...
#endif
probably with the '...' code being folded off into its own code block.
technically, this requires properly balanced nesting and disallows goto into/out-of
the block, but this seems like a good enough strategy.
potentially, I could upgrade ifdef to a first-class syntactic form, say:
ifdef(FOO) { ... }
but, there are other issues along this road.
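to make the ifdef-in-the-IL idea a bit more concrete, here is a minimal sketch in Python. the IL shape (plain opcode strings plus nested `("ifdef", FLAG, body)` tuples), the `specialize` function, and the flag names are all hypothetical, made up for illustration; this is not the format my own VM uses. it only shows that, once each conditional region is its own block with no jumps crossing its boundary, a host can drop unused blocks with a simple recursive walk.

```python
from typing import List, Set, Tuple, Union

# Hypothetical IL: an instruction is either a plain opcode string or an
# ("ifdef", FLAG, body) tuple whose body is a nested instruction list.
Instr = Union[str, Tuple[str, str, list]]


def specialize(code: List[Instr], host_flags: Set[str]) -> List[str]:
    """Flatten the IL, keeping only ifdef blocks whose flag the host defines."""
    out: List[str] = []
    for instr in code:
        if isinstance(instr, tuple) and instr[0] == "ifdef":
            _, flag, body = instr
            if flag in host_flags:
                # Block is live: splice in its recursively specialized body.
                out.extend(specialize(body, host_flags))
            # Otherwise the whole block is dropped. Because nothing may jump
            # into or out of it, no other code needs to be fixed up.
        else:
            out.append(instr)
    return out


# Example program with one nested conditional block (names are invented).
program = [
    "load_ui",
    ("ifdef", "TILT_SENSOR",
     ["open_tilt", ("ifdef", "HI_RES", ["calibrate_fine"])]),
    "main_loop",
]

if __name__ == "__main__":
    print(specialize(program, {"TILT_SENSOR"}))
```

the balanced-nesting and no-goto restrictions are what keep the walk this simple: a dropped block never leaves a dangling jump target behind.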
(2) Unnecessarily powerful languages, such as Forth or Postscript, are
difficult to reason about, difficult to audit. We are forced to stick
them in 'sandboxes', and the extra memory barriers and copying
overheads slow them down, and consume more resources. If we design our
HLL for secure composition and provable (or semantically controllable)
resource consumption, we can optimize it in-place, as though it were a
native part of our application. One valid fear from code distribution
- even secure code - is that it will consume too many resources (CPU,
memory, bandwidth). We can accommodate that concern in the sandbox
with a low-level language, but not /gracefully/ - i.e. if we want
graceful failures, we need semantics that support it, such as clear
disruption semantics.
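As a rough illustration of the sandbox side of this point, the sketch below runs a toy stack program under a fixed step budget. The opcodes (`push`, `add`, `jmp`), the `run` function, and the `OutOfFuel` exception are assumptions invented for this example, not any real VM. It shows the kind of resource control a low-level sandbox can impose from outside: the host can cut the guest off, but the guest program itself has no say in how it is disrupted.

```python
class OutOfFuel(Exception):
    """Raised when a guest program exceeds its step budget."""


def run(code, fuel):
    """Run a tiny stack program under a step budget.

    Returns (status, stack), where status is "ok" or "out_of_fuel".
    """
    stack, pc = [], 0
    try:
        while pc < len(code):
            if fuel <= 0:
                raise OutOfFuel
            fuel -= 1  # every executed instruction costs one unit
            op, arg = code[pc]
            if op == "push":
                stack.append(arg)
            elif op == "add":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == "jmp":
                pc = arg  # absolute jump; skip the pc increment below
                continue
            else:
                raise ValueError(f"unknown op {op!r}")
            pc += 1
    except OutOfFuel:
        # The host stops the guest and keeps whatever state it had so far.
        return "out_of_fuel", stack
    return "ok", stack


if __name__ == "__main__":
    print(run([("push", 1), ("push", 2), ("add", None)], fuel=10))
    print(run([("jmp", 0)], fuel=5))
```

The cutoff here is abrupt: the program is simply halted mid-computation, which is the non-graceful failure mode the paragraph above contrasts with language-level disruption semantics.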
Forth and PostScript are not "unnecessarily powerful".
probably, if the HLL is somewhere along C or C++ lines (with pointers
and OOP and so on), then the IL being powerful is not the issue. one
will still need validation logic to make sure the pointers/... are not
used in an unsafe manner (probably inserting runtime checks whenever
operations can't be determined to be safe).
even rather gimped languages, like Java or C#, still have some of these
issues.
a language much weaker than these is probably too weak to be really
usable in any non-trivial context.
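a minimal sketch of the sort of validation pass I mean, in Python. the instruction format (`("load", array, index)`), the `check_bounds` opcode, and the `validate` function are hypothetical, invented for this example. the assumption is that array sizes are known to the validator: constant indices are checked once at load time, and anything that can't be determined to be safe gets a runtime check inserted in front of it.

```python
def validate(code, array_sizes):
    """Return a copy of `code` with runtime bounds checks inserted as needed.

    `code` is a list of ("load", array_name, index) tuples, where index is
    either an int constant or a variable name (str) unknown until runtime.
    """
    out = []
    for op, array, index in code:
        if op != "load":
            raise ValueError(f"unsupported op {op!r}")
        size = array_sizes[array]
        if isinstance(index, int):
            if not 0 <= index < size:
                # Provably unsafe: reject the program outright.
                raise ValueError(f"{array}[{index}] is always out of bounds")
            # Provably safe: no runtime check needed.
            out.append((op, array, index))
        else:
            # Can't be determined statically: guard it at runtime.
            out.append(("check_bounds", array, index))
            out.append((op, array, index))
    return out


if __name__ == "__main__":
    checked = validate([("load", "buf", 2), ("load", "buf", "i")], {"buf": 4})
    for instr in checked:
        print(instr)
```

note that nothing here depends on the IL being "weak": the IL can stay fairly general, and safety comes from the validator plus the inserted checks.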
(3) Most interesting code is not 'algorithmic'. Dataflow models can be
optimized considerably with some access to the relevant local
structure; decisions on caching, propagation, and parallelism, for
example, should wait until the code is in place. Thus, libraries and
modules for dataflow languages must provide a higher-level structure
to the compiler, and compilation should happen /after/ linking. In
open systems, linking is dynamic, and so must be compilation.
dataflow languages currently lack mainstream acceptance as
application-development languages.
Procedural + OOP is probably a much better bet.
(4) The web model today is /extremely/ constrained: code distribution
is single-origin, server-to-client. We want to scale deeper, wider.
These things will happen: Clients will compose code from multiple
services. Services will receive agents and ambassadors from clients.
Services will broker clients, who will then speak directly. Services,
themselves, are clients to their dependencies and will thus repeat all
these processes. HLL code offers two critical features: (a) secure
composition, and (b) the ability to agglomerate resulting
compositions, identify relationships between dependencies, then
shatter and distribute shards closer to the appropriate resources.
Static decisions about code can easily introduce orders of magnitude
in bandwidth and latency inefficiency.
can't really make sense of the above.
(5) Higher level code is much more accessible to humans. For
developers, it is easier to debug at the site where the problem
occurred (no painful shipping of 'stack dumps' around). It is easier to
extend or transform the code, e.g. using GreaseMonkey scripts or
Chrome extensions. It is easier to learn to code. Children can peek
behind the curtain - begin to understand and manipulate the sea of
computation in which they live.
doesn't matter much for applications, as you don't generally want the
user to know how it works.
normally, the program is expected to be a sort of sealed black-box for
the creators' eyes only.
granted, this is not to say that FOSS people/... can't distribute in
source form, but not everyone should *have* to distribute in source form.
it is like, the children peek behind the curtain, start messing with
things, ..., and find themselves in the world of copyright infringement
and IP law, and/or find themselves or their parents being faced with a
lawsuit as a result.
keeping proprietary code hidden away thus also serves the purpose of
helping to prevent these "innocent little children" from unintentionally
committing criminal acts, ...
From every viewpoint I subscribe to - performance, security,
scalability, flexibility, user rights - *bytecode is a BAD idea*
as a distribution language. Other low-level languages are similarly bad.
Any good distribution language /will be/ high level, though not
/because/ it's high level. There are quite a few desiderata for a web
language.
however, a lower-level language will be further removed from the
high-level source, and more opaque to prying eyes (likely more
important for commercial developers, who want the code to be as difficult
to get at as can reasonably be done).
hence, for example, if one gets an EXE file, most of what is going on in
there is fairly well hidden. one can try to disassemble it, and even this
will often fail to work correctly. thus one can keep their various
algorithms, ... hidden.
CIL and JBC (Java bytecode) are a bit less opaque (they decompile more
readily), but theoretically still work ok, and
there are 3rd party obfuscator tools which are commonly used in
commercial situations.
the main merit of bytecode over native code in these cases is that
bytecode tends to be more portable, but there is often a drawback that
bytecode based distribution formats may hinder what one can do in the
languages, or the types of languages which can be used.
for example, JVM / JBC isn't really suitable for use with C or C++
programs (the bytecode is just too limited).
.NET / CIL allows C and C++, but at the cost of it only working on MS targets.
also, VM dependencies are an awkward issue (as one finds they need a
pile of assorted VMs, none of which really plays well with the others).
granted, a lot depends on what one wants out of a VM.
an example would be if one wanted, say, a more free and open platform
along similar lines to Adobe Flash or Microsoft Silverlight, rather than
seeing a VM as a way to promote/enforce certain development and
distribution methodologies, or for that matter, certain languages.
granted, if the VM will also support loading programs directly from
source code, then this is good as well, as then one has options.
or such...
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc