Maciej Stachowiak wrote:
> We're unlikely to have much interest in working on implementing the RI.
Ok. I'm sorry to hear that, but I understand.
> As for reading the RI, it seems a lot harder to understand than specs
> written in prose. As far as I can tell, only people who have coded
> significant portions understand it.
Fair enough. Comprehensibility is a good part of the measurable value of
a spec, so if the code prohibits that, we are in an undesirable state.
I wonder -- I do not mean to offend here -- if this is partly "sticker
shock" at the initial barrier, which is simply that you have to digest
SML, and you haven't read it before. It is not a terribly hard language
to learn: it consists of value bindings, function expressions,
function-application expressions, case expressions with destructuring
pattern matching, if/then/else expressions, and a very small algebraic
type system (function types, named types, records, disjoint sums, and
sugar for lists).
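For instance, every one of those forms shows up in a toy fragment like
this (illustrative only, not taken from the RI):

    datatype shape = Circle of real
                   | Rect of real * real

    val pi = 3.14159

    fun area (s:shape) : real =
        case s of
            Circle r => pi * r * r
          | Rect (w, h) => w * h

    fun describe (s:shape) : string =
        if area s > 1.0
        then "big"
        else "small"

That is a value binding, a small algebraic type, function definitions,
a case expression with destructuring pattern matching, and an
if/then/else expression; nothing more exotic than that.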
I also intended to do -- and have gradually been doing -- a conversion
to the simplest possible syntactic forms of SML I could write, unpacking
any syntactic short-hands or dense, idiomatic phrases that might have
turned up during the more intense implementation stages. This is similar
to rewriting English paragraphs for clarity, and is easy work for
someone who speaks the language. It parallelizes easily. I believe this
will help legibility significantly: for example, read
evalCondExpr and tell me if it's illegible:
    evalCondExpr (regs:Mach.REGS)
                 (cond:Ast.EXPR)
                 (thn:Ast.EXPR)
                 (els:Ast.EXPR)
        : Mach.VAL =
        let
            val v = evalExpr regs cond
            val b = toBoolean v
        in
            if b
            then evalExpr regs thn
            else evalExpr regs els
        end
One of my goals -- which I have surely not achieved yet -- is for most
of the RI to be distilled to this very pedestrian dialect.
> On the one hand, it's useful to have a reference implementation to
> validate what is being done, explore ideas, have something to test and
> compare against, etc.
>
> But yes I think it is an incredibly bad idea for the only specification
> to be a computer program. It's not approachable. I don't think I could
> quickly grok a program of this complexity even in a programming language
> I am familiar with. And by its nature it does not very cleanly partition
> separate concepts. For example, below you pointed me to 6 places in the
> code for "let" statements, and I doubt that reading those functions
> alone will be enough to understand it. So in practice, I don't think
> there is any way to understand "let" in detail without asking you or
> another expert on the RI.
These are ... points I nearly agree with, but not quite, and at the risk
of being terribly long-winded I'd like to air the discussion a bit in
public here, if we can back off from worrying that I'm saying anything
about the schedule of auxiliary-doc-generation (which I've hopefully
addressed in the other email):
First I want to point out that there is no established "right way" to
publish language specifications. Language specifications range widely in
style and in the formalisms they employ. People frequently need to study
specs, and implementations, and formal treatments in e.g. proof assistants
or reduced semantic models, *and* do impl-to-impl compatibility bakeoffs.
And it still sometimes takes many years, many revisions, to nail down
what people actually agree on or disagree on, what's "in" the language
or "out" of it. Sometimes it takes 5 or 10 years to discover a horrible
unsoundness in the language (or, gasp, that you accidentally made the
type system Turing-complete!)
No one approach is proven to "work". Not yet.
The AS3 draft spec we were looking at two winters ago had sections
containing pseudo-C++ code, as a way of describing relevant data
structures. ES3 has pseudo-assembly with typos and nonsensical
parts, in addition to requiring readers to execute goto statements in
their heads to understand the flow of a rule. R6RS, for a different
example, shipped most recently with a PLT Redex operational semantic
model to accompany and illuminate it. We considered using PLT Redex too,
and in fact rejected it in part out of the belief (perhaps mistaken!)
that "normal" programmers would find a "normal" language like SML easier
to read than one from the more academic setting of operational semantics
descriptions. Possibly in the future (as the POPLMark challenge is
hoping to establish) a standard metatheory will solidify for semantics
such that machine-checked evaluation rules are no less common than
machine-checked grammars in EBNF. But we're not there yet, so we picked
something that seemed like it might help, and in at least some senses
(see next point) it did.
Second I want to point out that while much of the value of a spec is in
informing/transmitting information from designers to implementors, a
fair portion of the value is also in agreeing/deciding what the various
spec-stakeholders wish, and mean, in their own minds and their own
efforts. I am certain, from recollection, that one of our motivations in
pursuing an RI-focused strategy at all was the fear that we were
producing incoherent ideas: that we all had ideas of what we'd like, but
writing them in english side-by-side (or arguing them across a table)
simply didn't force all the horrifying details of their semantic
incompatibility to manifest. Even if the RI turns out to be a throw-away
artifact -- not useful for the "informing" role of a spec -- I believe
it has helped quite a bit in crystallizing ideas and helping us tinker
toward agreements.
Third I should make clear that IIRC nobody on the committee ever
articulated a belief that the SML would be the "only specification" of
ES4. If anyone did it would have been me, and even I'm not *that*
deluded. We have entertained the notion -- I'll admit to promoting it --
that excerpts of the SML, or some machine-translations of those
excerpts, may wind up constituting part of the normative text, since the
"by hand" expansion of many evaluation rules reads a *lot* like the
machine translation of the simplest SML form. See evalCondExpr above as
an example.
The jury is still out on whether that may occur -- some standards bodies
apparently dislike the smell of it, which I find remarkable considering
how many other formalisms (box diagrams, equations, grammars) smell just
fine -- but we all know that no matter what happens to the SML there
will need to be plenty of less-specific accompanying narrative written
at some point. What you and I are discussing now is whether you can
*presently* (rather than "at some point") extract enough of the meaning
you require for early-implementation work from the SML. If you can't, we
need to move up the schedule on some of the accompanying narrative. Fair
enough. Maybe mbedthis got lucky, or has a higher pain tolerance :)
Finally, I think it is unfair to complain that there are 6 places "let"
affects. Programming languages are, as you know, highly integrated and
inter-related affairs. Show me a language spec in *any* formalism that
can get away with treating "one feature" (that is not syntactic sugar)
only once, and never discussing it again. Not likely.
> I asked someone who knows SML to look at it and he found the code pretty
> opaque as well, perhaps due in part to the very terse variable names and
> occasionally obscure concepts. (I still don't understand what a "runtime
> type rib" is, and searching for other references in the file does not
> elucidate, so I'm not sure reading backwards would help.)
I might as well mention what this means in passing, though I surely get
your meaning by now. Extra guidance and terminology can't hurt.
Ribs are lists of (fixture name * fixture) pairs. A rib represents the
set of fixtures that we know *will be* present in a runtime structure,
any time we build it. The instance rib of a class describes its fixed
instance variables. The rib of a function or block describes its fixed
activation variables. The definition phase of the RI lowers everything
"slot-ish" to sets of fixtures arranged into Ast.RIBs, just as the
machine model treats everything "slot-ish" as name->property maps
(Mach.PROP_BINDINGS).
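A self-contained sketch of the shapes involved (toy types with
hypothetical names, much simpler than the RI's actual declarations):

    type IDENT = string

    datatype FIXTURE = ValFixture of string    (* type annotation *)
                     | MethodFixture of string

    type RIB  = (IDENT * FIXTURE) list
    type RIBS = RIB list

    (* The instance rib of a class lists its fixed instance variables. *)
    val pointInstanceRib : RIB =
        [ ("x", ValFixture "double"),
          ("y", ValFixture "double") ]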
The type system -- in fact the entire AST -- is immutable and discusses
only the plan of execution, not the runtime artifacts. So if you want to
call a type function, you need to provide things like RIBs not things
like PROP_BINDINGS. Type rules wouldn't know what to do with runtime
artifacts like the latter, even if the type rules were being called
*from* runtime. They're *defined over* immutable compile time artifacts.
So the type normalizer -- that part of the type system that converts
type names to type definitions, applies parametric types, and shuffles
structural types around to a normal form -- requires a set of RIBs to do
anything with.
The function you were looking at is a support function for the runtime
invocation of the type normalizer. The normalizer can be and is invoked
at compile time too, but this is what happens when you invoke it at
runtime: it takes a runtime scope chain and extracts the type-relevant
ribs the scopes were built from, in order to reconstruct the appropriate
environment for the normalizer.
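In sketch form (self-contained toy types and hypothetical names, not
the RI's actual identifiers):

    datatype 'a SCOPE = NilScope
                      | Scope of { rib: 'a, parent: 'a SCOPE }

    (* Walk a runtime scope chain, collecting the ribs the scopes
       were built from, innermost first. *)
    fun ribsOfScopeChain NilScope = []
      | ribsOfScopeChain (Scope { rib, parent }) =
            rib :: ribsOfScopeChain parent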
(This particular aspect of normalization is not required unless you
implement parametric types, in which case type environments can be
captured and moved around.)
> "let" was meant to be an illustration that it's prohibitively difficult
> to get the needed info without having inside knowledge. If I were to
> generalize this approach to help someone understand another ES4 feature,
> I would probably just say "ask Graydon". Is it truly acceptable to have
> a spec where that's the easiest way to understand it?
Of course not, and insofar as we may have reached that point (I hope we
have not) it would be unsatisfactory to me as well.
-Graydon
_______________________________________________
Es4-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es4-discuss