Hi Roman & Peter:

I'd like to be on both sides of this fence!

Luckily I think you've already opened the gate for me...

Let me start in Roman's paddock (we have big paddocks in NZ).

First, push Left Recursion (LR) through the gate, the home paddock is neat
and tidy.

But the prefix capture issue, "what this damned thing is doing", is still
here.

This is one reason that I'm using a longest match choice operator: x|y,
rather than the PEG first choice x/y operator.

But its a committed choice operator, so it's very PEG like. As an aside: I'd
used this for quite a while before Bryan did his PEG thesis, but I like PEG
a lot, and it has a theory foundation, so I started using his x/y operator,
but I've reverted back.

The order of choices does not matter with x|y, and it seems that much easier
to use. PEG is a subset since you can write a first choice as:

x / y   =>   x | !x y

The good news: its a super-set of PEG, the bad news: we still have the "what
this damned thing is doing" problem, albeit hidden in the grass...  Its the
negation that is the powerful devil!  That's what gives PEG the power to
express languages that are not context-free, not the choice operator.

I took a look at Bran's PEG thesis, and I think that it will all stay good
if you simply replace the first choice with a longest choice.  Maybe one of
you, or some other theory expert could comment on that, or maybe its already
been proved?

It looks as if it will be less efficient, but most of the time the choices
are mutually exclusive and it fails fast. So a bad grammar runs slower, that
seems ok to me. Of course there is no reason we can't have both operators.

I propose a committed choice grammar, CCG, that includes PEG as a subset.
The limitation of a CCG is that it can not express ambiguity, no good for
human languages maybe, but great for computer languages. All the grammar
expression operators in CCG must be committed choice, by definition. But
that leaves it wide open: it is orthogonal to the traditional RE/CFG/CSG
nested grammars. In addition to the PEG grammar expressions CCG can include
x|y and other operators: a context sensitive operator that can look back in
the syntax tree seems most useful.

Ah yes, the syntax tree: it seems like using semantic methods is a great way
to get lots of people using grammar rules for lots of things. I'm all for
that. But I want the grammar to be a formal specification, like the IETF
ABNF RFC specs. so I do not want the grammar corrupted with semantic methods
or any other notations if I can avoid them. Roman follows this route so I
think thats good. Others like to stick the semantics in the grammar rules,
and it all becomes programming language specific: so be it, it's still way
way better that not using a grammar at all.

So this is all good and no explicit syntax tree: but I agree with Peter, I
need to know the derivation. I can't define a context sensitive grammar
operator without that, amongst other things.

But the clean semantic methods model gets us a long way, I like that as a
way to get people using this stuff.

I can do without all but one extra notation on the CCG version of BNF rules,
I've called that PBNF, Parser-BNF. I need a notation to designate rules that
represent leaf nodes. Without that the compiler just can't do enough... the
terminal rules need to be crunched hard. Of course there is pressure for few
more notations, but one will do.

So now the darn LR: personally I'm with Roman, I don't need it. But of
course there is all that history and text book examples...  It's just too
hard to have to explain that PEG/CCG is great, but it can't actually do all
that text book stuff...  as an aside I just saw the Python grammar: I like
it, no LR that I saw, a great a breath of fresh air. Of course it also has
indentation which is context sensitive so it has some special hacks at the
bottom level to cope with that.

So I think I the gate has to be open. It is too hard to ignore LR even if we
may never need it. But after thinking about Peters design I am starting to
think that he may have a recipe that I can understand. That would be a great
step. I'd be able to write down the derivation for the tongue twisters that
I never want to see in practice. But implementing it is an open question.
Its not simple. Its quite difficult to avoid it slowing down everything else
to a degree (some sort of call stack is probably needed at the least). So
I'll stick in the home paddock, but I'll tell everyone the gate is open,
PEG/CCG can do everything you want.... and good luck to Peter, keep us up to
date.

Cheers,
Peter.
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to