Re: That hash symbol

Brendan Eich Tue, 29 Mar 2011 16:22:45 -0700

On Mar 29, 2011, at 3:30 PM, Bob Nystrom wrote:

> C# (=>), CoffeeScript (->), and other languages use an infix arrow to link a 
> formal parameter list to a function body, which requires bottom-up parsing in 
> general (with comma as operator, as JS, C++, and C have; plus Harmony's 
> destructuring and default parameter value proposals).
> 
> I'm not a parsing expert, but isn't destructuring just as hard to parse 
> top-down as => for functions would be? Given:
> 
>     { a: b, c: d } =
> 
> A top-down parser will go up to "=" thinking it's parsing an object literal. 
> Then it hits the "=" and has to either backtrack, or just transform the 
> object literal AST into a destructuring pattern.


That's exactly what we do in SpiderMonkey, and IIRC Rhino does the same.

My point was about parsing, not parsing + some retrospective procedure on the 
AST that rewrites it. The latter is not just "parsing", it's a separate pass 
and not formalized in the ECMA-262 specs currently. More below.
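
To make "retrospective procedure" concrete, here is a minimal sketch of such a 
rewrite. The node shapes and the toPattern name are hypothetical, invented for 
illustration; this is not SpiderMonkey's or Rhino's actual AST or code.

```javascript
// Hypothetical AST node shapes, for illustration only.
// Reinterpret an already-parsed literal as a destructuring pattern.
function toPattern(node) {
  switch (node.type) {
    case "Identifier":
      return node; // a plain name is a valid assignment target
    case "ObjectLiteral":
      return {
        type: "ObjectPattern",
        properties: node.properties.map(function (p) {
          return { key: p.key, target: toPattern(p.value) };
        }),
      };
    default:
      // e.g. { a: 1 } = ... is only rejected here, after parsing
      throw new SyntaxError("invalid destructuring target: " + node.type);
  }
}

// { a: b, c: d } as first parsed, before the "=" is seen.
var literal = {
  type: "ObjectLiteral",
  properties: [
    { key: "a", value: { type: "Identifier", name: "b" } },
    { key: "c", value: { type: "Identifier", name: "d" } },
  ],
};

// Once "=" shows up, the parser rewrites instead of backtracking.
var pattern = toPattern(literal);
```

Note that the validity check lives in the rewrite, not in the grammar, which is 
the part that has no formal counterpart in the spec.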


> Wouldn't => work the same way?
> 
>     (a, b) =>
> 
> It parses "(a, b)" thinking it's a grouped comma operator (not exactly a 
> common expression FWIW), then it hits "=>" realizes it's a function parameter 
> decl, and then either backtracks or just transforms the left-hand AST into a 
> param decl.

It ups the ante beyond "pure parsing", but yes, in the same way as 
destructuring.
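
A rough sketch of what that looks like in a hand-written top-down parser, 
assuming a toy grammar with only identifiers, commas, parentheses, and "=>". 
All names here (tokenize, parse, toParams) are made up for the example, and 
empty or unparenthesized parameter lists are deliberately not handled.

```javascript
// Toy tokenizer: identifiers, "(", ")", ",", and "=>"; whitespace is skipped.
function tokenize(src) {
  return src.match(/=>|[A-Za-z_$][\w$]*|[(),]/g) || [];
}

function parse(src) {
  var toks = tokenize(src), pos = 0;
  function peek() { return toks[pos]; }
  function next() { return toks[pos++]; }
  function expect(t) {
    if (next() !== t) throw new SyntaxError("expected " + t);
  }

  // primary := identifier | "(" comma ")"
  function primary() {
    var t = next();
    if (t === "(") {
      var inner = comma();
      expect(")");
      return { type: "Paren", expr: inner };
    }
    if (/^[A-Za-z_$]/.test(t || "")) return { type: "Identifier", name: t };
    throw new SyntaxError("unexpected " + t);
  }

  // comma := primary ("," primary)*
  function comma() {
    var items = [primary()];
    while (peek() === ",") { next(); items.push(primary()); }
    return items.length === 1 ? items[0] : { type: "Comma", items: items };
  }

  // Retrospective step: reinterpret a parsed "(...)" expression as formals.
  function toParams(paren) {
    var list = paren.expr.type === "Comma" ? paren.expr.items : [paren.expr];
    return list.map(function (n) {
      if (n.type !== "Identifier") throw new SyntaxError("bad parameter");
      return n.name;
    });
  }

  // arrow := comma ("=>" arrow)?
  function arrow() {
    var left = comma(); // parsed as an ordinary expression first
    if (peek() !== "=>") return left;
    next();
    if (left.type !== "Paren") throw new SyntaxError("bad parameter list");
    return { type: "Arrow", params: toParams(left), body: arrow() };
  }

  var result = arrow();
  if (pos !== toks.length) throw new SyntaxError("trailing tokens");
  return result;
}
```

Here parse("(a, b) => a") builds the comma expression first and only converts 
it to a parameter list once "=>" is seen, while parse("(a, b)") keeps the 
expression as is.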

One example of the cost of this ante: Harmony wants an early error for an 
assignment that would create a global variable, or for use of an identifier that 
is not declared via a lexical binding form. These checks would have to come in a 
later pass, or be deferred manually until a closing ) with no immediately 
following arrow has been parsed.
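
The manual deferral could look roughly like the sketch below. makeChecker and 
its methods are hypothetical names, the "declared" list stands in for whatever 
scope tracking a real engine does, and nested parentheses are ignored to keep 
it short.

```javascript
// Hypothetical helper: checks identifier uses against declared names, but
// holds the checks back while a "(" ... ")" group might still be formals.
function makeChecker(declared) {
  var errors = [];
  var pending = null; // non-null while inside an ambiguous "(" ... ")"

  function check(name) {
    if (declared.indexOf(name) < 0) errors.push("undeclared: " + name);
  }

  return {
    errors: errors,
    openParen: function () { pending = []; },
    use: function (name) {
      if (pending) pending.push(name); // too early to judge
      else check(name);
    },
    // Called at ")" once the parser knows whether "=>" follows.
    closeParen: function (followedByArrow) {
      var held = pending;
      pending = null;
      if (followedByArrow) return held; // the names were parameter declarations
      held.forEach(check);              // ordinary uses: report early errors now
      return null;
    },
  };
}

var checker = makeChecker(["x"]);

// (a, b) =>  ... a and b turn out to be formals, so no errors are reported.
checker.openParen();
checker.use("a");
checker.use("b");
var formals = checker.closeParen(true);

// (x, y)  ... no arrow follows, so the held-back checks run and y is flagged.
checker.openParen();
checker.use("x");
checker.use("y");
checker.closeParen(false);
```

The bookkeeping is small, but it is extra state the parser has to thread 
through every parenthesized expression.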

There is no absolute top-down-parsing-must-be-"easy" requirement, and indeed 
the formal grammar is LR(1), so we need to validate each edition that way -- 
via a bottom-up grammar and even an automated checker (modulo ASI, which is 
separable and treated separately).

Waldemar has shown how using only top-down parsing, however formalized, without 
bottom-up grammatical validation can lead one astray:

https://mail.mozilla.org/pipermail/es-discuss/2008-October/007883.html

Something like the reverse, bottom-up validation without top-down being "easy 
enough", could also be a problem, since none of the major engines (AFAIK) uses 
bottom-up parsing.


> I understand this list isn't "teach me the details of the JS grammar", but it 
> isn't obvious to me why an infix function syntax is any harder than 
> destructuring as far as parsing performance is concerned.

I agree, and I said so at https://gist.github.com/888867#comments and here on 
the list:

https://mail.mozilla.org/pipermail/es-discuss/2011-March/013462.html

which seems to be right before your post in the archive at 
https://mail.mozilla.org/pipermail/es-discuss/2011-March/thread.html.

So yes, we can certainly consider infix-arrow, but it's more work for top-down 
parser implementors than leading octothorp or equivalent prefix. Destructuring 
requires similar but less work. That may not be enough to justify infix-arrow.

As usual, the big question for the future of the standard language is: do TC39 
members -- in particular the parser implementors at Apple, Google, Microsoft, 
Mozilla, and Opera -- all agree?

We have already approved destructuring for ES.next -- it's in the 
harmony:proposals part of the wiki. But it's not the same as infix-arrow in 
degree of work, even if it is roughly the same in kind.


> Empirically, I'd expect it to be less of an issue because the comma operator 
> is so rare and parameter declarations tend to be short. Is it because there 
> are things that would be valid in a parameter declaration that are *not* 
> valid expressions?

Possibly, although if we use | to separate the optional formal receiver 
parameter declaration from the positional parameters, then we're still ok: (t = 
u | a, b, c) is a wacky expression: ((t = (u | a)), b, c) -- comma with 
assignment of bitwise-or as first comma-linked operand.

Again the detailed cost analysis shows pain due to precedence shifting. 
Rewriting this AST in a top-down parser to have a shape more like ((t = u), a, 
b, c) or, to label nodes functionally and with quoting elided, 
formals(opt_this_formal(t, u), a, b, c), might engender some strong complaints 
from implementors!
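
For illustration only, a sketch of that rewrite over hypothetical node shapes 
(id, toFormals, and the node type names are invented here). It assumes the 
receiver's default value contains no | of its own, which a real implementation 
could not assume.

```javascript
function id(name) { return { type: "Identifier", name: name }; }

// (t = u | a, b, c) as an ordinary expression parser builds it:
// | binds tighter than =, which binds tighter than the comma operator.
var expr = {
  type: "Comma",
  items: [
    { type: "Assign", target: id("t"),
      value: { type: "BitOr", left: id("u"), right: id("a") } },
    id("b"),
    id("c"),
  ],
};

function toFormals(comma) {
  var first = comma.items[0];
  var rest = comma.items.slice(1);
  if (first.type === "Assign" && first.value.type === "BitOr") {
    // Undo the precedence: the right operand of | is really the first
    // positional parameter, not part of the receiver's default value.
    return {
      type: "Formals",
      receiver: {
        type: "OptThisFormal",
        name: first.target,
        init: first.value.left,
      },
      params: [first.value.right].concat(rest),
    };
  }
  return { type: "Formals", receiver: null, params: comma.items };
}

var formalsNode = toFormals(expr);
```

The rewrite has to reach two levels down into the first operand and pull a 
subtree back out into the parameter list, which is the precedence shifting in 
question.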

/be


> 
> - bob
>  
> 
> Requiring bottom-up parsing has bounced off of implementors in the past, and 
> with JavaScriptCore switching from a Bison grammar to a top-down hand-coded 
> parser, I expect it will again.
> 
> 
>> I don't find syntax like this clear from a coder's POV, and there is the 
>> re-tooling issue with highlighting editors and the ability to trivially 
>> transform between the styles for faster adoption and old code minification 
>> -- while these issues certainly shouldn't be deciding factors for TC39 it is 
>> nice that leading-char lparen...rparen makes most of them go away.
> 
> That's the idea. We need to keep this simple or it will probably fall apart, 
> either due to ambiguities, or implementors balking at too much complexity in 
> parsing with more power than top-down parsers have.
> 
> /be
> 
> _______________________________________________
> es-discuss mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/es-discuss