On Mar 29, 2011, at 3:30 PM, Bob Nystrom wrote:
> C#, CoffeeScript, and other languages use -> to link a formal parameter list
> to a function body, which requires bottom-up parsing in general (with comma
> as operator, as JS, C++, and C have; plus Harmony's destructuring and default
> parameter value proposals).
>
> I'm not a parsing expert, but isn't destructuring just as hard to parse
> top-down as => for functions would be? Given:
>
> { a: b, c: d } =
>
> A top-down parser will go up to "=" thinking it's parsing an object literal.
> Then it hits the "=" and has to either backtrack, or just transform the
> object literal AST into a destructuring pattern.
That's exactly what we do in SpiderMonkey, and IIRC Rhino does the same.
My point was about parsing, not parsing + some retrospective procedure on the
AST that rewrites it. The latter is not just "parsing", it's a separate pass
and not formalized in the ECMA-262 specs currently. More below.
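For illustration, the "retrospective procedure on the AST" could look roughly like the sketch below. The node shapes (ObjectLiteral, ObjectPattern, Identifier) and the function name toPattern are hypothetical, made up for this example; they are not SpiderMonkey's or Rhino's actual data structures.

```javascript
// Hypothetical AST node shapes, invented for this sketch.
// Reinterpret an already-parsed object literal as a destructuring pattern.
function toPattern(node) {
  switch (node.type) {
    case "Identifier":
      return node; // a plain name is already a valid binding target
    case "ObjectLiteral":
      return {
        type: "ObjectPattern",
        properties: node.properties.map((p) => ({
          key: p.key,
          target: toPattern(p.value), // nested literals become nested patterns
        })),
      };
    default:
      // e.g. { a: 1 } is a fine literal but not a valid pattern
      throw new SyntaxError("invalid destructuring target: " + node.type);
  }
}

// The AST a top-down parser would have built for `{ a: b, c: d }`
// before reaching the `=` token.
const literal = {
  type: "ObjectLiteral",
  properties: [
    { key: "a", value: { type: "Identifier", name: "b" } },
    { key: "c", value: { type: "Identifier", name: "d" } },
  ],
};

// Called once the parser sees `=` after the closing brace.
const pattern = toPattern(literal);
```

The point of the sketch is that the validity check (the default branch) runs after parsing, as a second walk over the tree, which is exactly the part the grammar itself does not describe.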
> Wouldn't => work the same way?
>
> (a, b) =>
>
> It parses "(a, b)" thinking it's a grouped comma operator (not exactly a
> common expression FWIW), then it hits "=>", realizes it's a function
> parameter decl, and then either backtracks or just transforms the left-hand
> AST into a param decl.
It ups the ante beyond "pure parsing", but yes, in the same way as
destructuring.
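A toy sketch of that parse-then-reinterpret step follows. It handles only identifiers, commas, parentheses, and the arrow token; the names tokenize and parseArrowHead and the node types Comma and ArrowParams are assumptions made for this example, not any engine's API.

```javascript
// Toy tokenizer: identifiers, parentheses, commas, and the arrow.
// Anything else in the input is silently skipped (this is only a sketch).
function tokenize(src) {
  return src.match(/=>|[A-Za-z_$][\w$]*|[(),]/g) || [];
}

function parseArrowHead(src) {
  const toks = tokenize(src);
  let pos = 0;
  const expect = (t) => {
    if (toks[pos] !== t) throw new SyntaxError("expected " + t);
    pos++;
  };

  // Step 1: parse "(a, b)" as a parenthesized comma expression.
  expect("(");
  const operands = [];
  do {
    const t = toks[pos++];
    if (!/^[A-Za-z_$]/.test(t || "")) {
      throw new SyntaxError("expected identifier");
    }
    operands.push({ type: "Identifier", name: t });
  } while (toks[pos] === "," && ++pos);
  expect(")");
  const expr = { type: "Comma", operands };

  // Step 2: only the token after ")" says what was actually parsed.
  if (toks[pos] !== "=>") return expr;

  // Step 3: reinterpret the expression AST as a parameter list.
  return { type: "ArrowParams", params: expr.operands.map((o) => o.name) };
}
```

With this sketch, parseArrowHead("(a, b) =>") yields an ArrowParams node, while parseArrowHead("(a, b)") yields the Comma expression unchanged. A real parser would also have to reject operands that are valid expressions but not valid parameters.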
One example of the cost of this ante: Harmony wants an early error for an
assignment that would create a global variable, or for use of an identifier
that is not declared via a lexical binding form. These checks would have to
come in a later pass, or be deferred manually until a closing ) not immediately
followed by an arrow has been parsed.
There is no absolute top-down-parsing-must-be-"easy" requirement, and indeed
the formal grammar is LR(1), so we need to validate each edition that way --
via a bottom-up grammar and even automated checker (modulo ASI, which is
separable and treated separately).
Waldemar has shown how using only top-down parsing, however formalized, without
bottom-up grammatical validation can lead one astray:
https://mail.mozilla.org/pipermail/es-discuss/2008-October/007883.html
Something like the reverse, bottom-up validation without top-down being "easy
enough", could also be a problem, since none of the major engines (AFAIK) uses
bottom-up parsing.
> I understand this list isn't "teach me the details of the JS grammar", but it
> isn't obvious to me why an infix function syntax is any harder than
> destructuring as far as parsing performance is concerned.
I agree, and I said so at https://gist.github.com/888867#comments and here on
the list:
https://mail.mozilla.org/pipermail/es-discuss/2011-March/013462.html
which seems to be right before your post in the archive at
https://mail.mozilla.org/pipermail/es-discuss/2011-March/thread.html.
So yes, we can certainly consider infix-arrow, but it's more work for top-down
parser implementors than leading octothorp or equivalent prefix. Destructuring
requires similar but less work. That may not be enough to justify infix-arrow.
As usual, the big question for the future of the standard language is: do TC39
members -- in particular the parser implementors at Apple, Google, Microsoft,
Mozilla, and Opera -- all agree?
We have already approved destructuring for ES.next -- it's in the
harmony:proposals part of the wiki. But the two are not the same in degree of
work, even if they are roughly the same in kind.
> Empirically, I'd expect it to be less of an issue because the comma operator
> is so rare and parameter declarations tend to be short. Is it because there
> are things that would be valid in a parameter declaration that are *not*
> valid expressions?
Possibly, although if we use | to separate the optional formal receiver
parameter declaration from the positional parameters, then we're still ok: (t =
u | a, b, c) is a wacky expression: ((t = (u | a)), b, c) -- comma with
assignment of bitwise-or as first comma-linked operand.
Again the detailed cost analysis shows pain due to precedence shifting.
Rewriting this AST in a top-down parser to have a shape more like ((t = u), a,
b, c) or, to label nodes functionally and with quoting elided,
formals(opt_this_formal(t, u), a, b, c), might engender some strong complaints
from implementors!
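To make the precedence shift concrete: as an expression, (t = u | a, b, c) groups as ((t = (u | a)), b, c), because | binds tighter than = and = binds tighter than comma. The sketch below builds that tree by hand and rewrites it into the formals shape. All node types and the function name toFormals are hypothetical, and the | receiver separator is only the syntax under discussion in this thread, not an adopted feature.

```javascript
const id = (name) => ({ type: "Identifier", name });

// Expression-shaped AST for (t = u | a, b, c): ((t = (u | a)), b, c).
const asExpression = {
  type: "Comma",
  operands: [
    {
      type: "Assign",
      target: id("t"),
      value: { type: "BitOr", left: id("u"), right: id("a") },
    },
    id("b"),
    id("c"),
  ],
};

// Rewrite into formals(opt_this_formal(t, u), a, b, c).
// The BitOr node has to be split apart: its left side stays with the
// receiver default, its right side becomes the first positional parameter.
function toFormals(comma) {
  const [first, ...rest] = comma.operands;
  if (first.type !== "Assign" || first.value.type !== "BitOr") {
    throw new SyntaxError("no receiver declaration before |");
  }
  return {
    type: "Formals",
    receiver: {
      type: "OptThisFormal",
      name: first.target,
      defaultValue: first.value.left,
    },
    positional: [first.value.right, ...rest],
  };
}

const formals = toFormals(asExpression);
```

Unlike the destructuring case, this rewrite cannot just relabel nodes in place: it has to tear a subtree out from under the assignment and move it up two levels, which is where the extra implementation cost comes from.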
/be
>
> - bob
>
>
> Requiring bottom-up parsing has bounced off of implementors in the past, and
> with JavaScriptCore switching from a Bison grammar to a top-down hand-coded
> parser, I expect it will again.
>
>
>> I don't find syntax like this clear from a coder's POV, and there is the
>> re-tooling issue with highlighting editors and the ability to trivially
>> transform between the styles for faster adoption and old code minification
>> -- while these issues certainly shouldn't be deciding factors for TC39 it is
>> nice that leading-char lparen...rparen makes most of them go away.
>
> That's the idea. We need to keep this simple or it will probably fall apart,
> either due to ambiguities, or implementors balking at too much complexity in
> parsing with more power than top-down parsers have.
>
> /be
>
> _______________________________________________
> es-discuss mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/es-discuss