On 05/31/11 14:55, Brendan Eich wrote:
On May 31, 2011, at 2:30 PM, Waldemar Horwat wrote:
I would not want to use anything like a PEG to standardize a grammar. Here's
why:
PEG being unambiguous by construction simply means that it resolves all
ambiguities by picking the earliest rule. This turns all rules following the
first one into negative rules: X matches Z only if it DOESN'T match a Y or a Q
or a B or .... You could pick the same strategy to disambiguate an LR(1)
grammar, and it would be equally bad.
Negative rules are the bane of grammars and behind the majority of the problems
with the C++ grammar, including the examples I listed earlier. They make a
grammar non-understandable because the order of the rules is subtly significant
and makes it hard to reason about when an X matches a Z; a language extension
might expand the definition of Y to make an X no longer match a Q, and you
wouldn't know it just by looking at a grammar with negative rules. In a
positive-rule-only grammar you'd discover the problem right away because the
grammar wouldn't compile.
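To make the ordering hazard concrete, here is a minimal sketch of ordered choice. It is not taken from any real PEG library: the rule names Y and Z, the regular expressions standing in for rule bodies, and the helper `first_match` are all hypothetical. It only illustrates how widening an earlier rule can silently change what a later rule matches.

```python
import re

def first_match(rules, text):
    """Ordered choice: return the name of the first rule that matches all of text."""
    for name, pattern in rules:
        if re.fullmatch(pattern, text):
            return name
    return None

# Version 1 of a hypothetical grammar: Y is tried first, Z second.
rules_v1 = [
    ("Y", r"[0-9]+"),       # decimal digits only
    ("Z", r"[0-9a-f]+"),    # implicitly: hex digits that are NOT a Y
]

# Version 2: a language extension widens Y with an exponent form.
# Z's text is unchanged, yet the set of inputs that reach Z has shrunk.
rules_v2 = [
    ("Y", r"[0-9]+(e[0-9]+)?"),
    ("Z", r"[0-9a-f]+"),
]

before = first_match(rules_v1, "1e5")   # falls through to Z
after = first_match(rules_v2, "1e5")    # now captured by Y
print(before, after)
```

Nothing in the definition of Z changed between the two versions, and nothing reports the overlap. A positive-rule-only grammar tool would presumably reject the second version as a conflict rather than resolve it by rule order.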
Thanks -- you've made this point before and I've agreed with it. It helps to restate and
amplify it, I think, because my impression is that not many people "get it".
PEG users may be happy with their JS parsers at any given point in the
language's standard version-space, of course.
It still could be that we use LL(1) or another positive-rule-only grammar, of
course, but we can hash that out separately.
Negative rules also interact badly with both semicolon insertion and
division-vs-regexp lexer disambiguation. One might naively think that
semicolon insertion would be an ideal match for negative rules: You first try
to parse
    tokens-on-line1
    tokens-on-line2
as a single statement and, only if that fails, you move on to parsing it as two
statements with a virtual semicolon between them. That, however, doesn't work.
Here's a simple counterexample:
    a + b
    (c) = d
Negative rules would insert a virtual semicolon here because
    a + b(c) = d
is not a valid parse. However, the correct ECMAScript behavior is not to
insert a semicolon.
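The difference between the two strategies can be sketched on a toy language. Everything below is an assumption made for illustration: the tiny grammar (identifiers, parentheses, calls, `+`, `=`), the function names, and the simplified "offending token" check are not the ECMAScript grammar or any real parser. The sketch only contrasts "try one statement, else fall back to two" with "insert only when the offending token starts a new line".

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[()+=])")
PUNCT = {"(", ")", "+", "="}

class Offending(Exception):
    """Raised with the index of the first token that cannot continue the parse."""
    def __init__(self, index):
        self.index = index

def tokenize(src):
    """Return (token, preceded_by_newline) pairs."""
    src, pairs, pos = src.rstrip(), [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise ValueError(f"bad character at {pos}")
        pairs.append((m.group(1), "\n" in src[pos:m.start(1)]))
        pos = m.end()
    return pairs

def expect(toks, i, want):
    if i < len(toks) and toks[i] == want:
        return i + 1
    raise Offending(i)

def parse_call(toks, i):
    # Call ::= (ident | "(" Expr ")") ("(" Expr ")")*
    if i < len(toks) and toks[i] == "(":
        i, _ = parse_expr(toks, i + 1)
        i = expect(toks, i, ")")
    elif i < len(toks) and toks[i] not in PUNCT:
        i += 1
    else:
        raise Offending(i)
    while i < len(toks) and toks[i] == "(":
        i, _ = parse_expr(toks, i + 1)
        i = expect(toks, i, ")")
    return i

def parse_expr(toks, i):
    # Expr ::= Call ("+" Call)*; a sum is not a valid assignment target.
    i, is_target = parse_call(toks, i), True
    while i < len(toks) and toks[i] == "+":
        i, is_target = parse_call(toks, i + 1), False
    return i, is_target

def parse_statement(toks):
    # Statement ::= Call "=" Expr | Expr
    i, is_target = parse_expr(toks, 0)
    if i < len(toks) and toks[i] == "=":
        if not is_target:
            raise Offending(i)
        i, _ = parse_expr(toks, i + 1)
    if i < len(toks):
        raise Offending(i)

def parses(src):
    try:
        parse_statement([t for t, _ in tokenize(src)])
        return True
    except Offending:
        return False

def peg_like_asi(line1, line2):
    """Negative-rule strategy: one statement if possible, otherwise two."""
    if parses(line1 + "\n" + line2):
        return "one statement"
    if parses(line1) and parses(line2):
        return "virtual semicolon inserted"
    return "syntax error"

def spec_like_asi(line1, line2):
    """Insert only if the offending token is the first token of line2."""
    pairs = tokenize(line1 + "\n" + line2)
    try:
        parse_statement([t for t, _ in pairs])
        return "one statement"
    except Offending as e:
        at_line_start = e.index < len(pairs) and pairs[e.index][1]
        if at_line_start and parses(line1) and parses(line2):
            return "virtual semicolon inserted"
        return "syntax error"

print(peg_like_asi("a + b", "(c) = d"))
print(spec_like_asi("a + b", "(c) = d"))
```

In this sketch the offending token for the counterexample is the `=`, which sits in the middle of the second line, so the second strategy reports an error while the fallback strategy quietly splits the input into two statements. For an input such as `a = b` followed by `c = d`, where the offending token is the first token on the new line, both strategies agree.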
Heh; this doesn't pass the first rule of ASI fight-club: there's no insertion
if there is no error.
I don't understand the premise of your comment on ASI. Here there *is* an
error in parsing without a virtual semicolon and no error in parsing with a
virtual semicolon, so a PEG-like ASI would erroneously insert one.
Waldemar