On Wednesday, 19 December 2012 at 08:45:20 UTC, Walter Bright
wrote:
On 12/19/2012 12:19 AM, Max Samukha wrote:
Evidently you've dismissed all of my posts in this thread on
that topic :-)
As you dismissed all points in favor of bytecode.
And I gave detailed reasons why.
Such as it being a standardized AST representation for multiple
languages. CLI is all about that, which is reflected in its name.
LLVM is used almost exclusively for that purpose (clang is great).
My arguments were all based on the idea of distributing
"compiled" source code in bytecode format.
The idea of using some common intermediate format to tie
together multiple front ends and multiple back ends is
something completely different.
And, surprise (!), I've done that, too. The original C compiler
I wrote for many years was a multipass affair, that
communicated the data from one pass to the next via an
intermediate file. I was forced into such a system because DOS
just didn't have enough memory to combine the passes.
I dumped it when more memory became available, as it was the
source of major slowdowns in the compilation process.
Note that such a system need not be *bytecode* at all, it can
just hand the data structure off from one pass to the next. In
fact, an actual bytecode requires a serialization of the data
structures and then a reconstruction of them - rather pointless.
I understand that, but I cannot fully agree. The problem is that
the components of such a system are distributed and not
binary-compatible. The data structures are intended to be
transferred over a stream, so you *have* to serialize at one end
and deserialize at the other.
For example, we serialize a D host app and a C library into
portable PNaCl bitcode and transfer it to Chrome for compilation
and execution. There is no point in having C, D (or whatever
other languages people are going to invent) front-ends on the
receiving side. The same applies to JS - people "serialize" ASTs
generated from, say, a CoffeeScript source into JS, transfer that
to the browser, which "deserializes" JS into an internal AST
representation.
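
To make the serialize/deserialize point concrete, here is a
minimal sketch in Python. Everything in it is an assumption made
up for illustration: the opcodes, the tuple-based AST and the
postfix byte layout do not correspond to CLI, LLVM, PNaCl or any
other real IL. It only shows that a tree flattened into bytes on
the sending side can be rebuilt on the receiving side without any
knowledge of the source language.

```python
import struct

# Hypothetical opcodes for a toy expression language (not a real IL).
OP_CONST, OP_ADD, OP_MUL = 0, 1, 2
BINARY = {"+": OP_ADD, "*": OP_MUL}
NAMES = {code: op for op, code in BINARY.items()}


def serialize(node):
    """Flatten an AST into postfix bytes: operands first, then operator."""
    if isinstance(node, int):
        # One opcode byte followed by a little-endian 32-bit constant.
        return bytes([OP_CONST]) + struct.pack("<i", node)
    op, left, right = node
    return serialize(left) + serialize(right) + bytes([BINARY[op]])


def deserialize(data):
    """Rebuild the AST from postfix bytes using an operand stack."""
    stack, i = [], 0
    while i < len(data):
        code = data[i]
        i += 1
        if code == OP_CONST:
            (value,) = struct.unpack_from("<i", data, i)
            i += 4
            stack.append(value)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append((NAMES[code], left, right))
    (root,) = stack  # a well-formed stream leaves exactly one tree
    return root


# The "front end" side produces bytes; the "receiving" side needs
# only deserialize(), not a parser for the original language.
ast = ("+", 1, ("*", 2, 3))
wire = serialize(ast)
assert deserialize(wire) == ast
```

The receiving side here depends only on the byte format, which is
the property I am arguing for; whether that format should look
like bytecode is a separate question.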
Note that I am not arguing that bytecode is the best kind of
standard AST representation. I am arguing that there *is* a point
in such serialized representation. Hence, your claim that ILs are
*completely* useless is not quite convincing.
If we had a single God language (I wouldn't object if it were
D, but it is not yet ;)), there would be no need for
complications like ILs.
Not advocating bytecode here but you claiming it is completely
useless is so D-ish :).
I'm not without experience doing everything bytecode is
allegedly good at.
I am not doubting your experience, but that looks like an
argument from authority.
As for CLI, it is great for implementing C#. For other
languages, not so much. There turned out to be no way to
efficiently represent D slices in it, for example.
That is a limitation of CLI, not of the concept. LLVM does not
have that problem.