Peter, Thanks for the great thoughts. I've had a lot of difficulty implementing my solution, but I think I have it working. Suffice to say, it's a massive memory hog, but interesting. You can find the current code in the following Subversion repository: http://projects.ioreader.com/cp. If anyone is inclined to compile it and it compiles, it should be noted that that way I buffer characters is probably not all that portable. I can't imagine that looking through the code will be particularly enlightening, but if anyone can get it to run and feels like fooling around with it, then I would appreciate feedback.
I can imagine that it might be possible to parse according to longest choice with the ! operator by using a modified LR parser. The states of an LR parser represent LR items (positions into a given production). I could imagine then that any use of ! would be ignored by the parser as adding a state and instead it would be added as a sort of condition action to the LR item representing whatever symbols follow the operand of !. Running such a parser would require either recursively invoking the parser, or having some sort of break/fail points built into the PDA-like structure that manages the state, but I think it could be done rather nicely. Best Regards, Peter Goodman, http://ioreader.com 70 Winston Circle, Montreal, Quebec H9S 4X6 On Sat, Jan 8, 2011 at 11:07 PM, Peter Cashin <cashin.pe...@gmail.com> wrote: > Hi Roman & Peter: > I'd like to be on both sides of this fence! > Luckily I think you've already opened the gate for me... > Let me start in Roman's paddock (we have big paddocks in NZ). > First, push Left Recursion (LR) through the gate, the home paddock is neat > and tidy. > But the prefix capture issue, "what this damned thing is doing", is still > here. > This is one reason that I'm using a longest match choice operator: x|y, > rather than the PEG first choice x/y operator. > But its a committed choice operator, so it's very PEG like. As an aside: I'd > used this for quite a while before Bryan did his PEG thesis, but I like PEG > a lot, and it has a theory foundation, so I started using his x/y operator, > but I've reverted back. > The order of choices does not matter with x|y, and it seems that much easier > to use. PEG is a subset since you can write a first choice as: > x / y => x | !x y > The good news: its a super-set of PEG, the bad news: we still have the "what > this damned thing is doing" problem, albeit hidden in the grass... Its the > negation that is the powerful devil! That's what gives PEG the power to > express languages that are not context-free, not the choice operator. > I took a look at Bran's PEG thesis, and I think that it will all stay good > if you simply replace the first choice with a longest choice. Maybe one of > you, or some other theory expert could comment on that, or maybe its already > been proved? > It looks as if it will be less efficient, but most of the time the choices > are mutually exclusive and it fails fast. So a bad grammar runs slower, that > seems ok to me. Of course there is no reason we can't have both operators. > I propose a committed choice grammar, CCG, that includes PEG as a subset. > The limitation of a CCG is that it can not express ambiguity, no good for > human languages maybe, but great for computer languages. All the grammar > expression operators in CCG must be committed choice, by definition. But > that leaves it wide open: it is orthogonal to the traditional RE/CFG/CSG > nested grammars. In addition to the PEG grammar expressions CCG can include > x|y and other operators: a context sensitive operator that can look back in > the syntax tree seems most useful. > Ah yes, the syntax tree: it seems like using semantic methods is a great way > to get lots of people using grammar rules for lots of things. I'm all for > that. But I want the grammar to be a formal specification, like the IETF > ABNF RFC specs. so I do not want the grammar corrupted with semantic methods > or any other notations if I can avoid them. Roman follows this route so I > think thats good. Others like to stick the semantics in the grammar rules, > and it all becomes programming language specific: so be it, it's still way > way better that not using a grammar at all. > So this is all good and no explicit syntax tree: but I agree with Peter, I > need to know the derivation. I can't define a context sensitive grammar > operator without that, amongst other things. > But the clean semantic methods model gets us a long way, I like that as a > way to get people using this stuff. > I can do without all but one extra notation on the CCG version of BNF rules, > I've called that PBNF, Parser-BNF. I need a notation to designate rules that > represent leaf nodes. Without that the compiler just can't do enough... the > terminal rules need to be crunched hard. Of course there is pressure for few > more notations, but one will do. > So now the darn LR: personally I'm with Roman, I don't need it. But of > course there is all that history and text book examples... It's just too > hard to have to explain that PEG/CCG is great, but it can't actually do all > that text book stuff... as an aside I just saw the Python grammar: I like > it, no LR that I saw, a great a breath of fresh air. Of course it also has > indentation which is context sensitive so it has some special hacks at the > bottom level to cope with that. > So I think I the gate has to be open. It is too hard to ignore LR even if we > may never need it. But after thinking about Peters design I am starting to > think that he may have a recipe that I can understand. That would be a great > step. I'd be able to write down the derivation for the tongue twisters that > I never want to see in practice. But implementing it is an open question. > Its not simple. Its quite difficult to avoid it slowing down everything else > to a degree (some sort of call stack is probably needed at the least). So > I'll stick in the home paddock, but I'll tell everyone the gate is open, > PEG/CCG can do everything you want.... and good luck to Peter, keep us up to > date. > Cheers, > Peter. > > _______________________________________________ > PEG mailing list > PEG@lists.csail.mit.edu > https://lists.csail.mit.edu/mailman/listinfo/peg > > _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg