On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote:
> - there is no rule-of-thumb understanding (programmers
> have to look up or memorize all restricted productions,
Here is a quibble: there is a rule, or set of rules enumerated by restricted
productions.
So indeed, viewed production by production, there are too many rules to
memorize, whether concrete or abstracted only a little bit (e.g., *either*
break or continue with label is restricted to have [no LineTerminator here]
between the keyword and the label).
However, one can abstract further by thinking about all the goto-like forms
being restricted (continue, break, return, and for unclear reasons since the
expression is not optional, throw).
This does not cover postfix ++/--, so two rules. Not quite as bad as you wrote.
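To make the two rules concrete, here is a small sketch of how the restricted productions behave under current ASI. The function names are made up for illustration; only the parsing behavior comes from the spec.

```javascript
// return: the line terminator after `return` ends the statement.
function restrictedReturn() {
  return
  42; // separate expression statement, never returned
}

// break: the line terminator after `break` makes it an unlabeled break.
function restrictedBreak() {
  var log = [];
  outer: for (var i = 0; i < 2; i++) {
    for (var j = 0; j < 2; j++) {
      log.push(i + "," + j);
      break
      outer; // expression statement, never evaluated
    }
  }
  return log;
}

// throw: a line terminator after `throw` is a syntax error, not an insertion.
function throwNeedsSameLine() {
  try {
    new Function("throw\nnew Error('x');");
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// postfix ++: no line terminator may precede it, so ++ binds to `b`.
function restrictedPostfix() {
  var a = 1, b = 1;
  a
  ++
  b
  return [a, b];
}
```

Here restrictedReturn() yields undefined, restrictedBreak() yields ["0,0", "1,0"] rather than just ["0,0"], throwNeedsSameLine() yields true, and restrictedPostfix() yields [1, 2].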
The bigger problem is not the rule-space but the mixed significance and
insignificance of line terminators.
What we observe is that programmers come to expect ASI where there is no ASI.
So, e.g.,
    foo
    (bar, baz);
involves no ASI (no error to correct, no restricted production), but it looks
like two statements. The line terminator having selective meaning due to ASI as
an error correction procedure, and of course in restricted productions, creates
an expectation that line terminators matter in general.
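A runnable version of that case, as a sketch (demoNoAsi and the recording of calls are only there to make the behavior observable):

```javascript
function demoNoAsi() {
  var calls = [];
  function foo(a, b) { calls.push([a, b]); }
  var bar = 1, baz = 2;

  // Two lines, one statement: the grammar accepts `(` after `foo`,
  // so there is no error for ASI to correct.
  foo
  (bar, baz);

  return calls;
}
```

demoNoAsi() returns [[1, 2]]: foo was called once with both arguments, even though the source looks like a reference to foo followed by a parenthesized comma expression.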
However, going down that road leads to CoffeeScript, or (somewhat more
conservatively due to the use of : at end of head forms, and a lot older)
Python. It's a steep slippery slope.
> For comparison, if we take Haskell's semicolon insertion (HSI),
> and throw out all the special cases not needed for Javascript,
> the rules are simple, predictable, and fully under programmer
> control (rather than grammar author control):
>
> 1 semicolon insertion happens for syntax involving blocks
> (always preceded by some keyword):
> <keyword> { .. ; .. ; .. }
>
> 2 if the opening brace following such a <keyword> is
> omitted, the start-column of the next token establishes
> a baseline for automatic semicolon/end brace insertion
>
> 3 following lines beginning with a non-white token that is
> - indented more: continue the preceding statement
This is going to enrage CoffeeScripters and Pythonistas, and with good reason.
They'll want a new block (no leading keyword required), not continuation of the
preceding line. Especially with let, const, and block-local functions.
I'm not saying this is "bad" or "good" on balance. Indeed, it beats having to
\-escape a newline to continue an overlong statement in Python.
But it is different yet again from nearby or layered languages, and of course
it's not ASI as JS has had since forever.
> - indented equally: start a new statement in the block
This satisfies the expectation created by the selective meaning of line
terminators.
> - indented less: end the block
This is great if you buy the rest.
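For readers who want to poke at rule 3, here is a minimal sketch of the three cases, not a proposal. It assumes the opening brace was omitted, takes the first line's indentation as the baseline, counts only leading spaces (tabs are ignored), and works on whole lines rather than tokens. The names indentOf and splitBlock are hypothetical.

```javascript
// Number of leading spaces on a line.
function indentOf(line) {
  return line.match(/^ */)[0].length;
}

// Split the lines of a brace-less block into statements using the
// baseline established by the first line.
function splitBlock(lines) {
  var baseline = indentOf(lines[0]);
  var statements = [];
  var i;
  for (i = 0; i < lines.length; i++) {
    var text = lines[i].trim();
    if (text === "") continue;            // blank lines carry no token
    var indent = indentOf(lines[i]);
    if (indent < baseline) break;         // indented less: end the block
    if (indent > baseline && statements.length > 0) {
      // indented more: continue the preceding statement
      statements[statements.length - 1] += " " + text;
    } else {
      statements.push(text);              // indented equally: new statement
    }
  }
  return { statements: statements, rest: lines.slice(i) };
}
```

For example, splitBlock(["  return", "    x + 1", "  y()", "z()"]) gives the statements ["return x + 1", "y()"] and leaves ["z()"] for the enclosing block. Note that the more-indented line is folded into the return, which is exactly where this differs from ASI as JS has it today.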
> Predictability (both in reading and in writing code) and the
> reduction of syntax noise soon won me over. Still, it is useful
> to have the option of no HSI interference, if one generates
> code with a simple tool, if one wants to make all inserted {;}
> explicit, or when whitespace is messed with (emails).
> Also, explicit and implicit style can be combined.
Good points. The keystroke tax with ; and {} in JS is an issue. It is a tax at
the margin on all effort creating and maintaining source, though, so I expect
over time lower-tax syntaxes to win. The trick is migrating JS code into a new
standardized edition without creating new runtime errors by failing to catch
migration errors.
> Also, some of the pessimism surrounding ASI reform [3]
> stems from the limitations of current spec tools, such as
> restricted productions, so looking at other ways to insert
> semicolons might help.
Reformulating the spec could be done but we would need to keep the "ES5" or
"classic" ASI spec around. Spec complexity and the opportunity cost of the work
to increase it in this area will hurt -- probably a lot.
Anyway, we'd need a more complete strawman spec to evaluate, to get further.
> It is interesting that even the ES5 spec has no convincing
> ASI examples, only clarifying examples (7.9.2).
That is the same old language from ES1, IIRC.
> And blog
> posts seem to be more about trouble with ASI than about
> usefulness of ASI [2,4,5].
Beware negativity and confirmation biases here.
ASI is relied on by tons of content, without complaints (or praise). It goes
without notice when it works, which is often when a ; was left off where the
formal grammar requires it.
> So ASI as it stands in Javascript
> now does not only make life harder for programmers but
> for spec writers (and readers), too.
The spec didn't change, so we are riding on that sunk cost.
It's easy to exaggerate here, but it seems to me the big deal is not ASI costs
already sunk. Rather it is how to lighten the syntax and make ASI more usable,
in a new edition.
> It would be useful to know examples of ASI working well
> for someone. Then one could check whether the benefits
> could be achieved by alternate rules, while reducing the
> danger that programmer and compiler have different
> interpretations of the same code.
We know ASI is used: you can instrument SpiderMonkey to log where it kicks in. I
haven't done this lately and I'm not going to attempt any kind of "meaningful"
web JS survey, but we don't need to, in my view. What we could use is some
validated alternative that has no runtime migration error gotchas.
> HSI would need some tweaking to be suitable for Javascript
> coding styles (though some tweaks could be copied from
> Haskell, I just omitted them to bring out the core ideas).
I think your effort developing a draft spec would be great. Just citing Haskell
or talking about the issues more generally is not going to move the mountain
that needs to move here.
> Still, such a variant might work better and be easier to
> understand than the current ASI. Equally important, a
> transition might be doable as incremental improvements
> rather than a radically different system.
While new Harmony proposals will be prototyped and even shipped before the next
edition, it takes years to do a new edition. So we do not have the luxury of
many *standardized* incremental improvements.
Further, standardizing increments along an uncertain path creates bad path
dependence, as content comes to depend on each increment in turn, until you may
well be painted into a corner.
> For instance, one
> could weaken the no-line-break-here token to consider
> line-break plus indentation.
We tried this back at the July 2008 "Harmony" (Oslo) meeting. The issue was
    function foo(x) {
      if (x)
        return
          "a very long string here that did not fit on the previous line";
      return "short";
    }
Any code of this form today is probably a bug: the programmer forgot about
return's production being restricted, and created a useless string literal
expression statement that is never returned (it is evaluated only when x is
falsy, to no effect).
However, we did not want to require the kind of analysis that would be needed
to distinguish that from this:
    function foo(x) {
      if (x)
        return
          some_long_and_complex(expressions(), with(effects()));
      return simple();
    }
In this case it is *not* safe to assume (per HSI) that the line after the
return is a continuation of the return statement.
There is mis-indented JS on the web, including of this form, and what such code
means (whatever was intended) is now a compatibility constraint.
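To see what is at stake, here is a sketch of the second form with observable effects. The effects array and the effect helper are hypothetical stand-ins for the long expression; the behavior described is that of current ASI.

```javascript
var effects = [];

// Hypothetical stand-in for an expression with side effects.
function effect(name) {
  effects.push(name);
  return name;
}

function foo(x) {
  if (x)
    return
      effect("long");
  return effect("short");
}
```

Today foo(true) returns undefined and runs no effects, while foo(false) runs "long" and then "short" and returns "short". If the indented line were instead treated as a continuation of the return, foo(true) would run and return "long", and foo(false) would skip it. Any deployed code that depends on the current behavior would change meaning silently, with no early error to flag it.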
> One might drop the
> error-correction bits if indentation provides alternative
> control.
The formal grammar currently requires ; as statement terminator, so ASI is an
error correction procedure (ignoring restricted productions). That is a
pragmatic decision (in my code in 1995, and in the spec's choice of
formalisms). It could be revisited, but the devil is in the details, and it is
costly work.
> In the spirit of refactoring languages in small steps,
> improving ASI might be more manageable than removing
> it (and if ASI is worth doing, it is worth doing it well).
ASI is not going to be removed. I don't know why you think it could be.
Again, we don't get standardized small steps. We need something more than
user-tested single-source new parser code (a la CoffeeScript, which I admire --
just saying we can't standardize anything like its lexer/disambiguator/parser
code). We need at least a "HSI for JS" spec with more details than in your
message, and (especially) careful analysis of how migration would work.
I'm still skeptical this is (a) doable with only early errors when migrating;
(b) worth the up-front and ongoing costs, since we will need to keep ASI
spec'ed forever (for web compatibility of not-opted-into-Harmony JS).
But since you wrote a nice post and seem motivated, I do want to encourage you
to work out more details. We don't need more motivation or (dubious IMHO)
attitudinizing about ASI :-/.
/be