An answer to Peter C. about the longest choice problem.


2011/1/9 Peter Cashin <cashin.pe...@gmail.com>

>
> I took a look at Bran's PEG thesis, and I think that it will all stay good
> if you simply replace the first choice with a longest choice.  Maybe one of
> you, or some other theory expert could comment on that, or maybe its already
> been proved?
>

 During a recent internship I carried through (I'm only a tiny master
student, please be careful with your big feet), I implemented PEGs with
first match - which is the original definition, and with longest match

My guess is that a PEG system based on longest match can generate all the
languages a classical PEG system can, and even more. Some languages were
definitely not recognized by my PEGs, but recognized if parsed with longest
match. Also, some test suite showed no noticeable slowdown with the longest
match rule, even on potentially ambiguous grammars.
However, you can't expect the same parse trees from both systems even if the
languages are recognized by both of them. So I don't think that "it will all
stay good" for everybody if you replace first match by longest match. Yet,
your decomposition

x / y   =>   x | !x y

seems really great to convert a PEG to that "longest match PEG" system.


> It looks as if it will be less efficient, but most of the time the choices
> are mutually exclusive and it fails fast. So a bad grammar runs slower, that
> seems ok to me. Of course there is no reason we can't have both operators.
>
>

> I propose a committed choice grammar, CCG, that includes PEG as a subset.
> The limitation of a CCG is that it can not express ambiguity, no good for
> human languages maybe, but great for computer languages. All the grammar
> expression operators in CCG must be committed choice, by definition. But
> that leaves it wide open: it is orthogonal to the traditional RE/CFG/CSG
> nested grammars. In addition to the PEG grammar expressions CCG can include
> x|y and other operators: a context sensitive operator that can look back in
> the syntax tree seems most useful.
>
> Ah yes, the syntax tree: it seems like using semantic methods is a great
> way to get lots of people using grammar rules for lots of things. I'm all
> for that. But I want the grammar to be a formal specification, like the IETF
> ABNF RFC specs. so I do not want the grammar corrupted with semantic methods
> or any other notations if I can avoid them. Roman follows this route so I
> think thats good. Others like to stick the semantics in the grammar rules,
> and it all becomes programming language specific: so be it, it's still way
> way better that not using a grammar at all.
>
> So this is all good and no explicit syntax tree: but I agree with Peter, I
> need to know the derivation. I can't define a context sensitive grammar
> operator without that, amongst other things.
>
> But the clean semantic methods model gets us a long way, I like that as a
> way to get people using this stuff.
>
> I can do without all but one extra notation on the CCG version of BNF
> rules, I've called that PBNF, Parser-BNF. I need a notation to designate
> rules that represent leaf nodes. Without that the compiler just can't do
> enough... the terminal rules need to be crunched hard. Of course there is
> pressure for few more notations, but one will do.
>
> So now the darn LR: personally I'm with Roman, I don't need it. But of
> course there is all that history and text book examples...  It's just too
> hard to have to explain that PEG/CCG is great, but it can't actually do all
> that text book stuff...  as an aside I just saw the Python grammar: I like
> it, no LR that I saw, a great a breath of fresh air. Of course it also has
> indentation which is context sensitive so it has some special hacks at the
> bottom level to cope with that.
>
> So I think I the gate has to be open. It is too hard to ignore LR even if
> we may never need it. But after thinking about Peters design I am starting
> to think that he may have a recipe that I can understand. That would be a
> great step. I'd be able to write down the derivation for the tongue twisters
> that I never want to see in practice. But implementing it is an open
> question. Its not simple. Its quite difficult to avoid it slowing down
> everything else to a degree (some sort of call stack is probably needed at
> the least). So I'll stick in the home paddock, but I'll tell everyone the
> gate is open, PEG/CCG can do everything you want.... and good luck to Peter,
> keep us up to date.
>
> Cheers,
> Peter.
>
>
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
>
>
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to