Automatic Semicolon Insertion: value vs cost; predictability and control; alternatives

Claus Reinke Sun, 17 Apr 2011 02:51:06 -0700

The idea of ASI seems to be to reduce syntactic clutter, possibly
making programs more readable, which is a laudable goal. But
if the reduction in symbol noise comes at the cost of a rise in
complexity of error-prone interpretation, that actually reduces
readability. And few things frustrate programmers more than
not being able to predict or control what happens to their code
(predictability in the subjective sense: it should ideally not
require consulting a detailed spec).


That is by no means a new problem [2,3], but a solution seems
hard to come by, as summarized by Brendan in [3]:

   Just the emotion around ASI makes me want to reach for
   greater clarity and (if possible) improvements down the line.
   But yeah, it's low priority and the risk for reward looks high.

Javascript is not the only language with some form of semicolon
insertion, and programmer satisfaction with this feature seems
to vary widely across languages, suggesting that implementations
of ASI differ as well.

These differences are important: experiences with related but
different features in other languages should inform, not bias,
discussion of ASI in Javascript, and different tools that achieve
similar goals might offer additional design options.

So, I thought it might be helpful to contrast Javascript's approach
with one that does not seem to stir up so much negative emotion.

// Javascript semicolon insertion (ASI)

There are several aspects of Javascript ASI that I find worrying:

- ASI is triggered by linebreaks
- ASI depends on error correction
- ASI depends on restricted productions

Taken together, this reduces both predictability of and control
over semicolon insertion:

- there is no rule-of-thumb understanding (programmers
   have to look up or memorize all restricted productions,
   and notice all errors in their code, to the extent that both
   of these control ASI; if they miss an error that triggers
   ASI, the code they are looking at is not the same code
   that the JS implementation sees, hindering debugging)

- there is little programmer control (programmers can add
   linebreaks, but that alone isn't sufficient; only combining
   linebreaks with restricted productions or correctable
   errors will result in ASI; programmers sometimes add
   linebreaks, not knowing that or when this will invoke
   ASI, assuming that linebreaks are just whitespace;
   programmers sometimes omit semicolons, erroneously
   assuming that ASI will fix it)
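
To make the interplay of linebreaks, error correction and
restricted productions concrete, here is a small sketch. The
helper name `parses` is made up for this illustration; it only
uses the standard Function constructor as a parser.

```javascript
// `parses` is a hypothetical helper: it uses the Function constructor as a
// parser and reports whether the given function body is syntactically valid.
function parses(body) {
  try {
    new Function(body);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Error correction plus line break: `var` cannot follow `1`, and it is
// preceded by a line break, so a semicolon is inserted.
const withBreak = parses("var a = 1\nvar b = 2");

// Same tokens, no line break: the error is not corrected.
const withoutBreak = parses("var a = 1 var b = 2");

// Restricted production: postfix `++` may not be preceded by a line break,
// so this is read as `a; ++b;` and only b is incremented.
const counters = new Function("var a = 1, b = 1\na\n++\nb\nreturn [a, b]")();

console.log(withBreak, withoutBreak, counters);
```

Here withBreak is true, withoutBreak is false, and counters is
[1, 2]: the linebreak matters only where it meets an error or a
restricted production.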

[Btw, it would be great if one could write something like
/*OPTIONS: warn-ASI */ and get parser warnings whenever
ASI kicks in (during development). It might be useful for ES
to standardize the idea of such pragmas (passing hints and
options to tools, in source comments), since they are already
in use (eg, jslint).]
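
As a rough sketch of how a tool might pick up such a pragma:
the comment syntax is the one suggested above, while the
function name `readOptions` and the splitting of options on
whitespace or commas are assumptions made for illustration,
not an existing convention.

```javascript
// Hypothetical: collect option names from /*OPTIONS: ... */ comments.
function readOptions(source) {
  const pragma = /\/\*\s*OPTIONS:\s*([^*]*?)\s*\*\//g;
  const options = [];
  for (const match of source.matchAll(pragma)) {
    // Assumed format: option names separated by whitespace or commas.
    for (const name of match[1].split(/[\s,]+/)) {
      if (name !== "") options.push(name);
    }
  }
  return options;
}

const found = readOptions("/*OPTIONS: warn-ASI */\nvar a = 1\nvar b = 2");
console.log(found.includes("warn-ASI"));
```

A parser could consult the resulting list and emit a warning
each time it inserts a semicolon.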

// Haskell semicolon insertion (HSI)

For comparison, if we take Haskell's semicolon insertion (HSI,
described as the layout rule in [1]) and throw out all the
special cases not needed for Javascript,
the rules are simple, predictable, and fully under programmer
control (rather than grammar author control):

   1 semicolon insertion happens for syntax involving blocks
       (always preceded by some keyword):
       <keyword> { .. ; .. ; .. }

   2 if the opening brace following such a <keyword> is
       omitted, the start-column of the next token establishes
       a baseline for automatic semicolon/end brace insertion

   3 following lines beginning with a non-white token that is
       - indented more: continue the preceding statement
       - indented equally: start a new statement in the block
       - indented less: end the block

That is it. HSI has the following useful properties:

   - if programmers use a construct that uses braces and
       semicolons, they don't have to look in the grammar
       for details, they know that HSI will be possible (1)

   - if programmers do not want HSI, all they need to do
       is make their braces and semicolons explicit (2);
       (HSI will not insert additional semicolons if all
           braces are explicit)

   - programmers can use linebreaks to clean up their code;
       the _combination_ of linebreak and relative indentation
       (more/equal/less) of the next line controls HSI (3)

   - only omitted braces and indentation control HSI (2,3);
       in particular, semicolons are not inserted to correct
       errors, and HSI behaves uniformly for all blocks and
       all kinds of statement (again, no need to consult the
       grammar for restricted productions)

   - since semicolon insertion is controlled by indentation,
       not error correction, it does not limit grammar design
       (no new ambiguities due to interaction with HSI)
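
Rules 1 to 3 above are small enough to sketch in a few lines.
The following is an illustration only, not Haskell's full layout
algorithm: the function name `layout`, the keyword list, and
the assumptions that a block keyword is the last token on its
line and that a nested block is indented more than its
enclosing block are all simplifications of mine.

```javascript
// Simplified HSI sketch: make implicit braces and semicolons explicit.
function layout(source, keywords = ["do", "of", "where"]) {
  const baselines = []; // start columns of the open implicit blocks
  let out = "";
  let openBlock = false; // previous line ended in a block keyword
  for (const line of source.split("\n")) {
    const text = line.trim();
    if (text === "") continue;
    const col = line.search(/\S/); // column of the first non-white token
    if (openBlock) {
      // Rule 2: the first token after the keyword sets the baseline.
      baselines.push(col);
      out += " { " + text;
    } else {
      // Rule 3, indented less: close every block this line has left.
      while (baselines.length > 0 && col < baselines[baselines.length - 1]) {
        baselines.pop();
        out += " }";
      }
      // Rule 3, indented equally: new statement; indented more: continue.
      const sameColumn =
        baselines.length > 0 && col === baselines[baselines.length - 1];
      out += (sameColumn ? "; " : " ") + text;
    }
    openBlock = keywords.includes(text.split(/\s+/).pop());
  }
  out += " }".repeat(baselines.length); // end of input closes open blocks
  return out.trim();
}

const flat = layout("main = do\n  print 1\n  print\n    2\n  print 3");
const nested = layout("f x = do\n  case x of\n    1 -> a\n    2 -> b\n  done");
console.log(flat);
console.log(nested);
```

The first call yields "main = do { print 1; print 2; print 3 }"
and the second "f x = do { case x of { 1 -> a; 2 -> b }; done }".
Nothing but the keyword and the relative indentation of each
line decides where braces and semicolons go.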

The differences between the two approaches to semicolon
insertion are substantial (programmer control/systematic
predictability vs grammar author control/memorization
of special cases and attempted error correction).

When I first encountered HSI, I wrote all my {;} explicitly,
because it was always presented as a "layout rule" and I
didn't feel comfortable with that.

Predictability (both in reading and in writing code) and the
reduction of syntax noise soon won me over. Still, it is useful
to have the option of no HSI interference, if one generates
code with a simple tool, if one wants to make all inserted {;}
explicit, or when whitespace is messed with (emails).
Also, explicit and implicit style can be combined.

The system has been working remarkably well in practice
(the main no-nos are mixing tabs and spaces, or tools that
meddle with whitespace). HSI reinforces the common
practice of indenting nested blocks, while ASI seems to
have no such intuitive guidelines.

Many, though not all, examples of programmers getting
into trouble with Javascript's ASI run against indentation
expectations (nesting return value on next line and still
getting ASI, having separate lines with same indent
merged because ASI does not kick in).
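
Both kinds of surprise fit in a few lines. The function names
below are made up for this illustration.

```javascript
// Nested return value on the next line: `return` is a restricted
// production, so a semicolon is inserted right after it.
function nestedReturn() {
  return
    42
}

// Separate lines with the same indent: the second line starts with `(`,
// there is no error to correct, so the lines are merged into
// `const f = twice(1 + 2).toString()`.
function mergedLines() {
  const twice = function (x) { return x * 2 }
  const f = twice
  (1 + 2).toString()
  return f
}

console.log(nestedReturn(), mergedLines());
```

nestedReturn() gives undefined rather than 42, and mergedLines()
gives the string "6" rather than the function `twice`. In both
cases the indentation suggests the opposite reading.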

I'm throwing this alternative in the ring because I've seen
discussions on seemingly unrelated spec issues where
suddenly, people would say they'd have to check whether
some idea works out with restricted productions (often,
suggested new syntax turns out to be ambiguous when
combined with ASI).

Also, some of the pessimism surrounding ASI reform [3]
stems from the limitations of current spec tools, such as
restricted productions, so looking at other ways to insert
semicolons might help.

It is interesting that even the ES5 spec has no convincing
ASI examples, only clarifying examples (7.9.2). And blog
posts seem to be more about trouble with ASI than about
usefulness of ASI [2,4,5]. So ASI as it stands in Javascript
now makes life harder not only for programmers but for spec
writers (and readers), too.

It would be useful to know of examples of ASI working well
for someone. Then one could check whether the benefits
could be achieved by alternate rules, while reducing the
danger that programmer and compiler have different
interpretations of the same code.

HSI would need some tweaking to be suitable for Javascript
coding styles (though some tweaks could be copied from
Haskell, I just omitted them to bring out the core ideas).

Still, such a variant might work better and be easier to
understand than the current ASI. Equally important, a
transition might be doable as incremental improvements
rather than a radically different system. For instance, one
could weaken the no-line-break-here token to consider
line-break plus indentation. One might drop the
error-correction bits if indentation provides alternative
control.
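
To illustrate the first of these steps: a weakened
no-line-break-here check could look at the indentation of the
next line instead of at the linebreak alone. This is a sketch
of a hypothetical rule, not of current Javascript behaviour;
the names `indentOf` and `breakTerminates` are mine, and both
lines are assumed to be non-blank.

```javascript
// Column of the first non-white character of a (non-blank) line.
function indentOf(line) {
  return line.search(/\S/);
}

// Hypothetical rule: a line break ends the statement only if the next
// line is indented equally or less; a more deeply indented line
// continues the preceding statement.
function breakTerminates(currentLine, nextLine) {
  return indentOf(nextLine) <= indentOf(currentLine);
}

const continued = breakTerminates("  return", "    42");
const separate = breakTerminates("  const f = g", "  (1 + 2).toString()");
console.log(continued, separate);
```

Under this rule, a value nested below `return` would continue
the statement (continued is false), while two lines at the same
indent would stay separate statements (separate is true).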

In the spirit of refactoring languages in small steps,
improving ASI might be more manageable than removing
it (and if ASI is worth doing, it is worth doing it well).

Claus

[1] http://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-210002.7
[2] http://lucumr.pocoo.org/2011/2/6/automatic-semicolon-insertion/
[3] http://old.nabble.com/Rationalizing-ASI-(was%3A-simple-shorter-function-syntax)-td29256435.html
[4] http://asi.qfox.nl/ (ASI certification)
[5] http://inimino.org/~inimino/blog/javascript_semicolons


