Re: Automatic Semicolon Insertion: value vs cost; predictability and control; alternatives

Brendan Eich Sun, 17 Apr 2011 04:14:21 -0700

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote:

> - there is no rule-of-thumb understanding (programmers
>  have to look up or memorize all restricted productions,


Here is a quibble: there is a rule, or set of rules enumerated by restricted 
productions.

So indeed, viewed production by production, there are too many rules to 
memorize, whether concrete or abstracted only a little bit (e.g., *either* 
break or continue with label is restricted to have [no LineTerminator here] 
between the keyword and the label).

However, one can abstract further by thinking about all the goto-like forms 
being restricted (continue, break, return, and for unclear reasons since the 
expression is not optional, throw).

This does not cover postfix ++/--, so two rules. Not quite as bad as you wrote.
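
To make the two rules concrete, here is a minimal sketch of how a line terminator changes the parse in a restricted production. The function names are made up for illustration; the behavior is what ES5 ASI specifies.

```javascript
// Rule 1 (goto-like forms): a line terminator right after `return`
// ends the statement, so the function returns undefined.
function restrictedReturn() {
  return
  42; // parsed as a separate (useless) expression statement
}

// Rule 2 (postfix ++/--): a line terminator before `++` means it cannot
// be a postfix operator on `a`; it becomes a prefix operator on `b`.
function restrictedPostfix() {
  let a = 1, b = 1;
  a
  ++
  b;
  return [a, b]; // a is unchanged, b was incremented
}

console.log(restrictedReturn());  // undefined
console.log(restrictedPostfix()); // [ 1, 2 ]
```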

The bigger problem is not the rule-space but the mixed significance and 
insignificance of line terminators.

What we observe is that programmers come to expect ASI where there is no ASI. 
So, e.g.,

 foo
 (bar, baz);

involves no ASI (no error to correct, no restricted production), but it looks 
like two statements. The line terminator having selective meaning due to ASI as 
an error correction procedure, and of course in restricted productions, creates 
an expectation that line terminators matter in general.
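
A runnable version of that example, as a sketch (foo, bar and baz are stand-ins defined here only so the snippet executes): the two lines parse as one call expression, not as two statements.

```javascript
const calls = [];
const foo = function (...args) {
  calls.push(args); // record every invocation
  return "called";
};
const bar = 1, baz = 2;

// No ASI here: the next line starts with "(", which is a valid
// continuation, so this is foo(bar, baz).
const result = foo
(bar, baz);

console.log(result);       // "called"
console.log(calls.length); // 1
```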

However, going down that road leads to CoffeeScript, or (somewhat more 
conservatively due to the use of : at end of head forms, and a lot older) 
Python. It's a steep slippery slope.


> For comparison, if we take Haskell's semicolon insertion (HSI),
> and throw out all the special cases not needed for Javascript,
> the rules are simple, predictable, and fully under programmer
> control (rather than grammar author control):
> 
>  1 semicolon insertion happens for syntax involving blocks
>      (always preceded by some keyword):
>      <keyword> { .. ; .. ; .. }
> 
>  2 if the opening brace following such a <keyword> is
>      omitted, the start-column of the next token establishes
>      a baseline for automatic semicolon/end brace insertion
> 
>  3 following lines beginning with a non-white token that is
>      - indented more: continue the preceding statement

This is going to enrage CoffeeScripters and Pythonistas, and with good reason. 
They'll want a new block (no leading keyword required), not continuation of the 
preceding line. Especially with let, const, and block-local functions.

I'm not saying this is "bad" or "good" on balance. Indeed, it beats having to 
\-escape a newline to continue an overlong statement in Python.

But it is different yet again from nearby or layered languages, and of course 
it's not ASI as JS has had since forever.


>      - indented equally: start a new statement in the block

This satisfies the expectation created by the selective meaning of line 
terminators.


>      - indented less: end the block

This is great if you buy the rest.
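
The three indentation cases can be sketched as a toy layout pass. This is only an illustration under loud assumptions: indentOf and layoutBlock are hypothetical helpers, not part of any proposal or spec; input is the lines of one brace-less block, indented with spaces only, with no blank lines, comments, or nested blocks.

```javascript
// Hypothetical helper: column of the first non-white character.
function indentOf(line) {
  return line.length - line.trimStart().length;
}

// Hypothetical helper: the first line sets the baseline column.
function layoutBlock(lines) {
  const baseline = indentOf(lines[0]);
  const statements = [];
  let consumed = 0;
  for (const line of lines) {
    const column = indentOf(line);
    if (column < baseline) break; // indented less: end the block
    if (column === baseline) {
      statements.push(line.trim()); // indented equally: new statement
    } else {
      // indented more: continue the preceding statement
      statements[statements.length - 1] += " " + line.trim();
    }
    consumed += 1;
  }
  return { text: "{ " + statements.join("; ") + " }", consumed };
}

const laidOut = layoutBlock(["  a = f", "    (x)", "  b = 2", "c = 3"]);
console.log(laidOut.text);     // { a = f (x); b = 2 }
console.log(laidOut.consumed); // 3
```

Note that the more-indented "(x)" line is glued onto the preceding statement, which is exactly the case that differs from what CoffeeScript or Python users would expect.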


> Predictability (both in reading and in writing code) and the
> reduction of syntax noise soon won me over. Still, it is useful
> to have the option of no HSI interference, if one generates
> code with a simple tool, if one wants to make all inserted {;}
> explicit, or when whitespace is messed with (emails).
> Also, explicit and implicit style can be combined.

Good points. The keystroke tax with ; and {} in JS is an issue. It is a tax at 
the margin on all effort creating and maintaining source, though, so I expect 
over time lower-tax syntaxes to win. The trick is migrating JS code into a new 
standardized edition without creating new runtime errors by failing to catch 
migration errors.


> Also, some of the pessimism surrounding ASI reform [3]
> stems from the limitations of current spec tools, such as
> restricted productions, so looking at other ways to insert
> semicolons might help.

Reformulating the spec could be done but we would need to keep the "ES5" or 
"classic" ASI spec around. Spec complexity and the opportunity cost of the work 
to increase it in this area will hurt -- probably a lot.

Anyway, we'd need a more complete strawman spec to evaluate, to get further.


> It is interesting that even the ES5 spec has no convincing
> ASI examples, only clarifying examples (7.9.2).

That is the same old language from ES1, IIRC.


> And blog
> posts seem to be more about trouble with ASI than about
> usefulness of ASI [2,4,5].

Beware negativity and confirmation biases here.

ASI is relied on by tons of content, without complaints (or praise). It goes 
without notice when it works, which is often when a ; was left off where the 
formal grammar requires it.


> So ASI as it stands in Javascript
> now does not only make life harder for programmers but
> for spec writers (and readers), too.

The spec didn't change, so we are riding on that sunk cost.

It's easy to exaggerate here, but it seems to me the big deal is not ASI costs 
already sunk. Rather it is how to lighten the syntax and make ASI more usable, 
in a new edition.


> It would be useful to know examples of ASI working well
> for someone. Then one could check whether the benefits
> could be achieved by alternate rules, while reducing the
> danger that programmer and compiler have different
> interpretations of the same code.

We know ASI is used; you can log SpiderMonkey code to see where it kicks in. I 
haven't done this lately and I'm not going to attempt any kind of "meaningful" 
web JS survey, but we don't need to, in my view. What we could use is some 
validated alternative that has no runtime migration error gotchas.


> HSI would need some tweaking to be suitable for Javascript
> coding styles (though some tweaks could be copied from
> Haskell, I just omitted them to bring out the core ideas).

I think your effort developing a draft spec would be great. Just citing Haskell 
or talking about the issues more generally is not going to move the mountain 
that needs to move here.


> Still, such a variant might work better and be easier to
> understand than the current ASI. Equally important, a
> transition might be doable as incremental improvements
> rather than a radically different system.

While new Harmony proposals will be prototyped and even shipped before the next 
edition, it takes years to do a new edition. So we do not have the luxury of 
many *standardized* incremental improvements.

Further, standardizing increments along an uncertain path creates bad path 
dependence, as content comes to depend on each increment in turn, until you may 
well be painted into a corner.


> For instance, one
> could weaken the no-line-break-here token to consider
> line-break plus indentation.

We tried this back at the July 2008 "Harmony" (Oslo) meeting. The issue was

 function foo(x) {
   if (x)
     return
       "a very long string here that did not fit on the previous line";
   return "short";
 }

Any code of this form today is probably a bug: the programmer forgot about 
return's production being restricted, and created a dead (unreachable) and 
useless string literal expression statement.

However, we did not want to require the kind of analysis that would be needed 
to distinguish that from this:

 function foo(x) {
   if (x)
     return
       some_long_and_complex(expressions(), with(effects()));
   return simple();
 }

In this case it is *not* safe to assume (per HSI) that the line after the 
return is a continuation of the return statement.

There is mis-indented JS on the web, including of this form, and what such code 
means (whatever was intended) is now a compatibility constraint.
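
A small sketch of what such code does today, which is the behavior that would have to be preserved. sideEffect and the effects counter are hypothetical stand-ins for some_long_and_complex(...) above.

```javascript
let effects = 0;
function sideEffect() {
  effects += 1; // observable effect, so skipping or running the call matters
  return "value";
}

function foo(x) {
  if (x)
    return
      sideEffect(); // under ASI this is its own statement, after the if
  return "short";
}

// Today: `return` is terminated at the line break.
console.log(foo(true), effects);  // undefined 0
console.log(foo(false), effects); // "short" 1
```

If the indented line were instead treated as a continuation of the return, foo(true) would call sideEffect() and return its value, and foo(false) would no longer call it at all: a silent change in runtime behavior, with no early error to flag it.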


> One might drop the
> error-correction bits if indentation provides alternative
> control.

The formal grammar currently requires ; as statement terminator, so ASI is an 
error correction procedure (ignoring restricted productions). That is a 
pragmatic decision (in my code in 1995, and in the spec's choice of 
formalisms). It could be revisited, but the devil is in the details, and it is 
costly work.


> In the spirit of refactoring languages in small steps,
> improving ASI might be more manageable than removing
> it (and if ASI is worth doing, it is worth doing it well).

ASI is not going to be removed. I don't know why you think it could be.

Again, we don't get standardized small steps. We need something more than 
user-tested single-source new parser code (a la CoffeeScript, which I admire -- 
just saying we can't standardize anything like its lexer/disambiguator/parser 
code). We need at least a "HSI for JS" spec with more details than in your 
message, and (especially) careful analysis of how migration would work.

I'm still skeptical this is (a) doable with only early errors when migrating; 
(b) worth the up-front and ongoing costs, since we will need to keep ASI 
spec'ed forever (for web compatibility of not-opted-into-Harmony JS).

But since you wrote a nice post and seem motivated, I do want to encourage you 
to work out more details. We don't need more motivation or (dubious IMHO) 
attitudinizing about ASI :-/.

/be