Peter,

Thanks for the great thoughts. I've had a lot of difficulty
implementing my solution, but I think I have it working. Suffice to
say, it's a massive memory hog, but interesting. You can find the
current code in the following Subversion repository:
http://projects.ioreader.com/cp. If anyone is inclined to compile it
and it compiles, it should be noted that that way I buffer characters
is probably not all that portable. I can't imagine that looking
through the code will be particularly enlightening, but if anyone can
get it to run and feels like fooling around with it, then I would
appreciate feedback.

I can imagine that it might be possible to parse according to longest
choice with the ! operator by using a modified LR parser. The states
of an LR parser represent LR items (positions into a given
production). I could imagine then that any use of ! would be ignored
by the parser as adding a state and instead it would be added as a
sort of condition action to the LR item representing whatever symbols
follow the operand of !. Running such a parser would require either
recursively invoking the parser, or having some sort of break/fail
points built into the PDA-like structure that manages the state, but I
think it could be done rather nicely.

Best Regards,

Peter Goodman,
http://ioreader.com
70 Winston Circle,
Montreal, Quebec
H9S 4X6



On Sat, Jan 8, 2011 at 11:07 PM, Peter Cashin <cashin.pe...@gmail.com> wrote:
> Hi Roman & Peter:
> I'd like to be on both sides of this fence!
> Luckily I think you've already opened the gate for me...
> Let me start in Roman's paddock (we have big paddocks in NZ).
> First, push Left Recursion (LR) through the gate, the home paddock is neat
> and tidy.
> But the prefix capture issue, "what this damned thing is doing", is still
> here.
> This is one reason that I'm using a longest match choice operator: x|y,
> rather than the PEG first choice x/y operator.
> But its a committed choice operator, so it's very PEG like. As an aside: I'd
> used this for quite a while before Bryan did his PEG thesis, but I like PEG
> a lot, and it has a theory foundation, so I started using his x/y operator,
> but I've reverted back.
> The order of choices does not matter with x|y, and it seems that much easier
> to use. PEG is a subset since you can write a first choice as:
> x / y   =>   x | !x y
> The good news: its a super-set of PEG, the bad news: we still have the "what
> this damned thing is doing" problem, albeit hidden in the grass...  Its the
> negation that is the powerful devil!  That's what gives PEG the power to
> express languages that are not context-free, not the choice operator.
> I took a look at Bran's PEG thesis, and I think that it will all stay good
> if you simply replace the first choice with a longest choice.  Maybe one of
> you, or some other theory expert could comment on that, or maybe its already
> been proved?
> It looks as if it will be less efficient, but most of the time the choices
> are mutually exclusive and it fails fast. So a bad grammar runs slower, that
> seems ok to me. Of course there is no reason we can't have both operators.
> I propose a committed choice grammar, CCG, that includes PEG as a subset.
> The limitation of a CCG is that it can not express ambiguity, no good for
> human languages maybe, but great for computer languages. All the grammar
> expression operators in CCG must be committed choice, by definition. But
> that leaves it wide open: it is orthogonal to the traditional RE/CFG/CSG
> nested grammars. In addition to the PEG grammar expressions CCG can include
> x|y and other operators: a context sensitive operator that can look back in
> the syntax tree seems most useful.
> Ah yes, the syntax tree: it seems like using semantic methods is a great way
> to get lots of people using grammar rules for lots of things. I'm all for
> that. But I want the grammar to be a formal specification, like the IETF
> ABNF RFC specs. so I do not want the grammar corrupted with semantic methods
> or any other notations if I can avoid them. Roman follows this route so I
> think thats good. Others like to stick the semantics in the grammar rules,
> and it all becomes programming language specific: so be it, it's still way
> way better that not using a grammar at all.
> So this is all good and no explicit syntax tree: but I agree with Peter, I
> need to know the derivation. I can't define a context sensitive grammar
> operator without that, amongst other things.
> But the clean semantic methods model gets us a long way, I like that as a
> way to get people using this stuff.
> I can do without all but one extra notation on the CCG version of BNF rules,
> I've called that PBNF, Parser-BNF. I need a notation to designate rules that
> represent leaf nodes. Without that the compiler just can't do enough... the
> terminal rules need to be crunched hard. Of course there is pressure for few
> more notations, but one will do.
> So now the darn LR: personally I'm with Roman, I don't need it. But of
> course there is all that history and text book examples...  It's just too
> hard to have to explain that PEG/CCG is great, but it can't actually do all
> that text book stuff...  as an aside I just saw the Python grammar: I like
> it, no LR that I saw, a great a breath of fresh air. Of course it also has
> indentation which is context sensitive so it has some special hacks at the
> bottom level to cope with that.
> So I think I the gate has to be open. It is too hard to ignore LR even if we
> may never need it. But after thinking about Peters design I am starting to
> think that he may have a recipe that I can understand. That would be a great
> step. I'd be able to write down the derivation for the tongue twisters that
> I never want to see in practice. But implementing it is an open question.
> Its not simple. Its quite difficult to avoid it slowing down everything else
> to a degree (some sort of call stack is probably needed at the least). So
> I'll stick in the home paddock, but I'll tell everyone the gate is open,
> PEG/CCG can do everything you want.... and good luck to Peter, keep us up to
> date.
> Cheers,
> Peter.
>
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
>
>

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to