On 7/26/2011 8:34 PM, David Barbour wrote:
On Tue, Jul 26, 2011 at 3:28 PM, BGB <cr88...@gmail.com> wrote:

    why do we need an HLL distribution language, rather than, say, a
    low-level distribution language, such as bytecode or a VM-level
    ASM-like format, or something resembling Forth or PostScript?...


Because:

(1) Code will often adapt to relatively 'static' conditions indicated by the host (such as user-agent, screen size, or access to a 'tilt' sensor). In these cases, higher-level code is often much easier to specialize and garbage-collect than an opaque block of bytecode.


one can support ifdef blocks in the IL, no real problem there.

my own language does something like this with a form like:
$[ifdef(FOO)] {
...
}

but a language could be designed to allow it with a more traditional syntax, say:
#ifdef FOO
...
#endif

probably with the '...' code being folded off into its own code block. technically, this requires properly balanced nesting and disallows goto into/out of the block, but this seems like a good enough strategy.
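As a rough illustration of the "folded off into its own code block" idea, here is a minimal sketch of an IL loader resolving ifdef regions for one host. The toy IL, the `Op` and `IfDef` types, the `specialize` function, and the flag names are all hypothetical and are not taken from any real VM or from the language mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Op:
    """A single instruction in a hypothetical toy IL."""
    name: str


@dataclass
class IfDef:
    """An ifdef region whose body has been folded into its own block."""
    flag: str
    body: List["Node"] = field(default_factory=list)


Node = Union[Op, IfDef]


def specialize(block: List[Node], defined: Set[str]) -> List[Op]:
    """Flatten a block for one host, dropping bodies whose flag is undefined."""
    out: List[Op] = []
    for node in block:
        if isinstance(node, IfDef):
            if node.flag in defined:
                # Splice the nested block in place; nested ifdefs recurse.
                out.extend(specialize(node.body, defined))
        else:
            out.append(node)
    return out


program = [
    Op("load_ui"),
    IfDef("HAS_TILT", [Op("enable_tilt"), IfDef("HIGH_RES", [Op("smooth_tilt")])]),
    Op("run"),
]

print([op.name for op in specialize(program, {"HAS_TILT"})])
# prints ['load_ui', 'enable_tilt', 'run']
```

Because each body is a separate nested block, balanced nesting holds by construction, and a jump into or out of the region has nothing to target in this representation.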


potentially, I could upgrade ifdef to a first-class syntactic form, say:
ifdef(FOO) { ... }

but, there are other issues along this road.


(2) Unnecessarily powerful languages, such as Forth or PostScript, are difficult to reason about, difficult to audit. We are forced to stick them in 'sandboxes', and the extra memory barriers and copying overheads slow them down, and consume more resources. If we design our HLL for secure composition and provable (or semantically controllable) resource consumption, we can optimize it in-place, as though it were a native part of our application. One valid fear from code distribution - even secure code - is that it will consume too many resources (CPU, memory, bandwidth). We can accommodate that concern in the sandbox with a low-level language, but not /gracefully/ - i.e. if we want graceful failures, we need semantics that support them, such as clear disruption semantics.
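To make the sandbox case concrete, here is a minimal sketch of resource control bolted onto a low-level language: a toy stack interpreter with a fixed step budget. The instruction set, the `run` function, and the `OutOfFuel` exception are assumptions made up for this illustration; no real VM is being described.

```python
from typing import List, Tuple


class OutOfFuel(Exception):
    """Raised when the guest program exceeds its step budget."""


def run(code: List[Tuple], fuel: int) -> List[int]:
    """Interpret a toy stack program, spending one unit of fuel per step."""
    stack: List[int] = []
    pc = 0
    while pc < len(code):
        if fuel == 0:
            # The host can only abort here; the guest gets no chance to react.
            raise OutOfFuel(f"budget exhausted at pc={pc}")
        fuel -= 1
        instr = code[pc]
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = instr[1]
            continue
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return stack


print(run([("push", 1), ("push", 2), ("add",)], fuel=10))  # prints [3]

try:
    run([("jump", 0)], fuel=100)  # an infinite loop in the guest
except OutOfFuel as exc:
    print("aborted:", exc)  # prints aborted: budget exhausted at pc=0
```

The budget does cap CPU use, but the failure is abrupt: the program is cut off at an arbitrary instruction, which is the kind of non-graceful outcome the paragraph above contrasts with language-level disruption semantics.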


Forth and PostScript are not "unnecessarily powerful".

probably, if the HLL is somewhere along C or C++ lines (with pointers and OOP and so on), then the IL being powerful is not the issue. one will still need validation logic to make sure the pointers/... are not used in an unsafe manner (probably inserting runtime checks whenever operations can't be determined to be safe).
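A minimal sketch of that kind of validation logic, assuming a toy IL with a single indexed-load instruction and array lengths known to the loader. The `validate` function and the `load_index` / `check_bounds` opcodes are hypothetical names used only for illustration.

```python
from typing import Dict, List, Tuple, Union

Index = Union[int, str]          # a constant index, or the name of a runtime variable
Instr = Tuple[str, str, Index]   # (opcode, array name, index)


def validate(code: List[Instr], lengths: Dict[str, int]) -> List[Instr]:
    """Return the code with a bounds check inserted before each unproven access."""
    out: List[Instr] = []
    for op, array, index in code:
        if op == "load_index":
            if isinstance(index, int):
                if not 0 <= index < lengths[array]:
                    # Provably unsafe: reject the program at load time.
                    raise ValueError(f"{array}[{index}] is always out of bounds")
                # Provably safe: no runtime check needed.
            else:
                # Index is only known at runtime: guard the access.
                out.append(("check_bounds", array, index))
        out.append((op, array, index))
    return out


code = [("load_index", "buf", 2), ("load_index", "buf", "i")]
for instr in validate(code, {"buf": 4}):
    print(instr)
# prints:
# ('load_index', 'buf', 2)
# ('check_bounds', 'buf', 'i')
# ('load_index', 'buf', 'i')
```

Only the access with a runtime index pays for a check; the constant access is accepted as-is, and a constant access outside the array is rejected before anything runs.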

even rather gimped languages, like Java or C#, still have some of these issues.

a language much weaker than these is probably too weak to be really usable in any non-trivial context.


(3) Most interesting code is not 'algorithmic'. Dataflow models can be optimized considerably with some access to the relevant local structure; decisions on caching, propagation, and parallelism, for example, should wait until the code is in place. Thus, libraries and modules for dataflow languages must provide a higher-level structure to the compiler, and compilation should happen /after/ linking. In open systems, linking is dynamic, and so must be compilation.

dataflow languages currently lack mainstream acceptance as application-development languages.

Procedural + OOP is probably a much better bet.


(4) The web model today is /extremely/ constrained: code distribution is single-origin, server-to-client. We want to scale deeper, wider. These things will happen: Clients will compose code from multiple services. Services will receive agents and ambassadors from clients. Services will broker clients, who will then speak directly. Services, themselves, are clients to their dependencies and will thus repeat all these processes. HLL code offers two critical features: (a) secure composition, and (b) the ability to agglomerate resulting compositions, identify relationships between dependencies, then shatter and distribute shards closer to the appropriate resources. Static decisions about code can easily introduce orders of magnitude in bandwidth and latency inefficiency.


can't really make sense of the above.


(5) Higher level code is much more accessible to humans. For developers, it is easier to debug at the site where the problem occurred (no painful shipping 'stack dumps' around). It is easier to extend or transform the code, e.g. using GreaseMonkey scripts or Chrome extensions. It is easier to learn to code. Children can peek behind the curtain - begin to understand and manipulate the sea of computation in which they live.


doesn't matter much for applications, as you don't generally want the user to know how it works. normally, the program is expected to be a sort of sealed black-box for the creators' eyes only.

granted, this is not to say that FOSS people/... can't distribute in source form, but not everyone should *have* to distribute in source form.


it is like, the children peek behind the curtain, start messing with things, ..., and find themselves in the world of copyright infringement and IP law, and/or find themselves or their parents being faced with a lawsuit as a result.

keeping proprietary code hidden away thus also serves the purpose of helping to prevent these "innocent little children" from unintentionally committing criminal acts, ...



From every viewpoint I subscribe to - performance, security, scalability, flexibility, user rights - *bytecode is a BAD idea* as a distribution language. Other low-level languages are similarly bad.

Any good distribution language /will be/ high level, though not /because/ it's high level. There are quite a few desiderata for a web language.


however, a lower-level language will be more abstracted from the high-level language, and more opaque to prying eyes (likely more important for commercial developers, who want the code to be as difficult to get at as can reasonably be done).

hence, for example, if one gets an EXE file, most of what is going on in there is fairly well hidden. one can try to disassemble it, but even this will often fail to work correctly, so one can keep their various algorithms, ... hidden.

CIL and JBC (Java bytecode) are easier to decompile and so a bit less secure in this sense, but theoretically still work ok, and there are 3rd party obfuscator tools which are commonly used in commercial situations.

the main merit of bytecode over native code in these cases is that bytecode tends to be more portable, but there is often a drawback that bytecode-based distribution formats may hinder what one can do in the languages, or the types of languages which can be used.

for example, JVM / JBC isn't really suitable for use with C or C++ programs (the bytecode is just too limited).

.NET / CIL allows C and C++, but at a cost of it only working on MS targets.


also, VM dependencies are an awkward issue (as one finds they need a pile of assorted VMs, none of which really plays well with the others).


granted, a lot depends on what one wants out of a VM.

an example would be if one wanted, say, a more free and open platform along similar lines to Adobe Flash or Microsoft Silverlight, rather than seeing a VM as a way to promote/enforce certain development and distribution methodologies, or for that matter, certain languages.

granted, if the VM will also support loading programs directly from source code, then this is good as well, as then one has options.


or such...


_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
