Hi Roman & Peter: I'd like to be on both sides of this fence!
Luckily I think you've already opened the gate for me... Let me start in Roman's paddock (we have big paddocks in NZ). First, push Left Recursion (LR) through the gate, the home paddock is neat and tidy. But the prefix capture issue, "what this damned thing is doing", is still here. This is one reason that I'm using a longest match choice operator: x|y, rather than the PEG first choice x/y operator. But its a committed choice operator, so it's very PEG like. As an aside: I'd used this for quite a while before Bryan did his PEG thesis, but I like PEG a lot, and it has a theory foundation, so I started using his x/y operator, but I've reverted back. The order of choices does not matter with x|y, and it seems that much easier to use. PEG is a subset since you can write a first choice as: x / y => x | !x y The good news: its a super-set of PEG, the bad news: we still have the "what this damned thing is doing" problem, albeit hidden in the grass... Its the negation that is the powerful devil! That's what gives PEG the power to express languages that are not context-free, not the choice operator. I took a look at Bran's PEG thesis, and I think that it will all stay good if you simply replace the first choice with a longest choice. Maybe one of you, or some other theory expert could comment on that, or maybe its already been proved? It looks as if it will be less efficient, but most of the time the choices are mutually exclusive and it fails fast. So a bad grammar runs slower, that seems ok to me. Of course there is no reason we can't have both operators. I propose a committed choice grammar, CCG, that includes PEG as a subset. The limitation of a CCG is that it can not express ambiguity, no good for human languages maybe, but great for computer languages. All the grammar expression operators in CCG must be committed choice, by definition. But that leaves it wide open: it is orthogonal to the traditional RE/CFG/CSG nested grammars. In addition to the PEG grammar expressions CCG can include x|y and other operators: a context sensitive operator that can look back in the syntax tree seems most useful. Ah yes, the syntax tree: it seems like using semantic methods is a great way to get lots of people using grammar rules for lots of things. I'm all for that. But I want the grammar to be a formal specification, like the IETF ABNF RFC specs. so I do not want the grammar corrupted with semantic methods or any other notations if I can avoid them. Roman follows this route so I think thats good. Others like to stick the semantics in the grammar rules, and it all becomes programming language specific: so be it, it's still way way better that not using a grammar at all. So this is all good and no explicit syntax tree: but I agree with Peter, I need to know the derivation. I can't define a context sensitive grammar operator without that, amongst other things. But the clean semantic methods model gets us a long way, I like that as a way to get people using this stuff. I can do without all but one extra notation on the CCG version of BNF rules, I've called that PBNF, Parser-BNF. I need a notation to designate rules that represent leaf nodes. Without that the compiler just can't do enough... the terminal rules need to be crunched hard. Of course there is pressure for few more notations, but one will do. So now the darn LR: personally I'm with Roman, I don't need it. But of course there is all that history and text book examples... It's just too hard to have to explain that PEG/CCG is great, but it can't actually do all that text book stuff... as an aside I just saw the Python grammar: I like it, no LR that I saw, a great a breath of fresh air. Of course it also has indentation which is context sensitive so it has some special hacks at the bottom level to cope with that. So I think I the gate has to be open. It is too hard to ignore LR even if we may never need it. But after thinking about Peters design I am starting to think that he may have a recipe that I can understand. That would be a great step. I'd be able to write down the derivation for the tongue twisters that I never want to see in practice. But implementing it is an open question. Its not simple. Its quite difficult to avoid it slowing down everything else to a degree (some sort of call stack is probably needed at the least). So I'll stick in the home paddock, but I'll tell everyone the gate is open, PEG/CCG can do everything you want.... and good luck to Peter, keep us up to date. Cheers, Peter.
_______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg