On 05/31/11 14:55, Brendan Eich wrote:
On May 31, 2011, at 2:30 PM, Waldemar Horwat wrote:

I would not want to use anything like a PEG to standardize a grammar.  Here's 
why:

PEG being unambiguous by construction simply means that it resolves all 
ambiguities by picking the earliest rule.  This turns all rules following the 
first one into negative rules:  X matches Z only if it DOESN'T match a Y or a Q 
or a B or ....  You could pick the same strategy to disambiguate an LR(1) 
grammar, and it would be equally bad.
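
As an illustration of the ordered-choice behaviour described above, here is a minimal Python sketch. It is not from the thread: `literal` and `ordered_choice` are hypothetical helper names, and the rules Y and Z are made-up literals. It only shows that a later alternative matches exactly when every earlier one fails.

```python
def literal(text):
    """Return a parser matching `text` at `pos`; it yields the end position or None."""
    def parse(source, pos):
        return pos + len(text) if source.startswith(text, pos) else None
    return parse


def ordered_choice(*alternatives):
    """PEG-style choice: commit to the first (name, parser) pair that succeeds."""
    def parse(source, pos):
        for name, alternative in alternatives:
            end = alternative(source, pos)
            if end is not None:
                return name, end
        return None
    return parse


# Hypothetical rules: Y matches "a", Z matches "ab".
Y = ("Y", literal("a"))
Z = ("Z", literal("ab"))

y_first = ordered_choice(Y, Z)
z_first = ordered_choice(Z, Y)

print(y_first("ab", 0))  # ('Y', 1): Z is never tried because Y already matched
print(z_first("ab", 0))  # ('Z', 2): same rules, different order, different parse
```

With Y listed first, Z effectively means "Z, but only where Y does not match", which is the implicit negative rule.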

Negative rules are the bane of grammars and behind the majority of the problems 
with the C++ grammar, including the examples I listed earlier.  They make a 
grammar non-understandable because the order of the rules is subtly significant 
and makes it hard to reason about when an X matches a Z; a language extension 
might expand the definition of Y to make an X no longer match a Q, and you 
wouldn't know it just by looking at a grammar with negative rules.  In a 
positive-rule-only grammar you'd discover the problem right away because the 
grammar wouldn't compile.
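
The extension hazard can be sketched the same way. This is a hypothetical example, not taken from the thread: the rule patterns and the names `first_match` and `all_matches` are assumptions, and `all_matches` is only a crude per-input stand-in for the static conflict check that a parser generator for a positive-rule-only grammar would perform.

```python
import re


def first_match(rules, text):
    """PEG-style ordered choice: name of the first rule matching all of `text`."""
    for name, pattern in rules:
        if re.fullmatch(pattern, text):
            return name
    return None


def all_matches(rules, text):
    """Names of every rule matching `text`; more than one means the rules overlap."""
    return [name for name, pattern in rules if re.fullmatch(pattern, text)]


# Hypothetical version 1: Y is a lowercase word, Z is a word followed by digits.
v1 = [("Y", r"[a-z]+"), ("Z", r"[a-z]+[0-9]+")]
# Hypothetical version 2: the language extension lets Y contain digits too.
v2 = [("Y", r"[a-z][a-z0-9]*"), ("Z", r"[a-z]+[0-9]+")]

print(first_match(v1, "abc1"))  # Z
print(first_match(v2, "abc1"))  # Y: the meaning changed and nothing was reported
print(all_matches(v1, "abc1"))  # ['Z']
print(all_matches(v2, "abc1"))  # ['Y', 'Z']: the overlap a positive-rule check would flag
```

Under ordered choice the input quietly moves from Z to Y; a check that treats both rules as positive sees the new overlap.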

Thanks -- you've made this point before and I've agreed with it. It helps to restate and 
amplify it, I think, because my impression is that not many people "get it".

PEG users may be happy with their JS parsers at any given point in the 
language's standard version-space, of course.

It still could be that we use LL(1) or another positive-rule-only grammar, of 
course, but we can hash that out separately.


Negative rules also interact badly with both semicolon insertion and 
division-vs-regexp lexer disambiguation.  One might naively think that 
semicolon insertion would be an ideal match for negative rules:  You first try 
to parse

  tokens-on-line1
  tokens-on-line2

as a single statement and, only if that fails, you move on to parsing it as two 
statements with a virtual semicolon between them.  That, however, doesn't work.
Here's a simple counterexample:

  a + b
  (c) = d

Negative rules would insert a virtual semicolon here because

  a + b(c) = d

is not a valid parse.  However, the correct ECMAScript behavior is not to 
insert a semicolon.

Heh; this doesn't pass the first rule of ASI fight-club: there's no insertion 
if there is no error.

I don't understand the premise of your comment on ASI.  Here there *is* an 
error in parsing without a virtual semicolon and no error in parsing with a 
virtual semicolon, so a PEG-like ASI would erroneously insert one.
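
The difference between the two strategies on the counterexample above can be sketched in Python. This is a toy model under stated assumptions, not an ECMAScript parser: the grammar covers only identifiers, `+`, calls, parentheses and `=`; `spec_like` models only the rule that a semicolon is inserted before an offending token that is preceded by a line break; all function names are hypothetical.

```python
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|[()+=]")


def tokenize(source):
    """Return (text, starts_line) pairs; starts_line marks a token right after a line break."""
    tokens = []
    for line_no, line in enumerate(source.splitlines()):
        for i, text in enumerate(TOKEN.findall(line)):
            tokens.append((text, line_no > 0 and i == 0))
    return tokens


def peek(tokens, pos):
    return tokens[pos][0] if pos < len(tokens) else None


def show(tokens):
    return " ".join(text for text, _ in tokens)


def parse_parens(tokens, pos):
    """Parse '(' Additive ')' with `pos` at the '('; return the position after ')'."""
    pos, _ = parse_additive(tokens, pos + 1)
    if peek(tokens, pos) != ")":
        raise SyntaxError("expected ')'")
    return pos + 1


def parse_call(tokens, pos):
    """Primary followed by any number of call suffixes, even across a line break."""
    text = peek(tokens, pos)
    if text == "(":
        pos = parse_parens(tokens, pos)
    elif text is not None and (text[0].isalpha() or text[0] == "_"):
        pos += 1
    else:
        raise SyntaxError(f"unexpected token {text!r}")
    while peek(tokens, pos) == "(":
        pos = parse_parens(tokens, pos)
    return pos


def parse_additive(tokens, pos):
    """Return (end, is_lhs); is_lhs is False once a '+' makes it a non-assignable sum."""
    pos, is_lhs = parse_call(tokens, pos), True
    while peek(tokens, pos) == "+":
        pos, is_lhs = parse_call(tokens, pos + 1), False
    return pos, is_lhs


def parse_statement(tokens, pos):
    """Consume the longest statement starting at `pos` and return its end position."""
    pos, is_lhs = parse_additive(tokens, pos)
    if is_lhs and peek(tokens, pos) == "=":
        pos, _ = parse_additive(tokens, pos + 1)
    return pos


def peg_like(source):
    """Negative-rule strategy: try one statement, else fall back to one per line."""
    whole = tokenize(source)
    try:
        if parse_statement(whole, 0) == len(whole):
            return [show(whole)]
    except SyntaxError:
        pass
    statements = []
    for line in (l for l in source.splitlines() if l.strip()):
        tokens = tokenize(line)
        if parse_statement(tokens, 0) != len(tokens):
            raise SyntaxError(f"cannot parse line {line!r}")
        statements.append(show(tokens))
    return statements


def spec_like(source):
    """Insert a virtual semicolon only before an offending token that starts a line."""
    tokens, pos, statements = tokenize(source), 0, []
    while pos < len(tokens):
        end = parse_statement(tokens, pos)
        if end < len(tokens) and not tokens[end][1]:
            raise SyntaxError(f"offending token {tokens[end][0]!r} is not at a line start")
        statements.append(show(tokens[pos:end]))
        pos = end
    return statements


print(peg_like("a + b\n(c) = d"))   # ['a + b', '( c ) = d']: semicolon inserted
print(spec_like("a + b\nc = d"))    # ['a + b', 'c = d']: offending 'c' starts a line
# spec_like("a + b\n(c) = d") raises SyntaxError: the offending token is '=',
# and it is not preceded by a line break, so no semicolon is inserted.
```

In this model the two strategies disagree on exactly the input from the counterexample: the fallback strategy accepts it as two statements, while the offending-token rule rejects it.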

    Waldemar
_______________________________________________
es-discuss mailing list
https://mail.mozilla.org/listinfo/es-discuss