On Sat, May 11, 2019 at 6:09 PM Nicolas Laurent
<nicolas.laur...@uclouvain.be> wrote:

> Without memoization, an LL(1) language trivially admits an O(n) PEG
> grammar, which should generally look very much like the LL(1) grammar
> (I think they might even be provably identical, but I haven't thought
> this through, so don't quote me on this).
We have discussed (and proved) this correspondence here:
https://www.sciencedirect.com/science/article/pii/S0167642314000276
https://arxiv.org/abs/1304.3177

> - There exists a set of problematic grammar patterns that have
> non-exponential complexity but are very slow nonetheless. These all
> have to do with parsing things that look like binary operators. Things
> are even worse when some kind of left-recursion support is present.
> Without entering the details (which I can supply if requested), you get
> into a bad backtracking pattern and incur something like O(2^(num of
> operators)) work to parse a "primitive expression". To ground this: in
> Java, that is ~2^10 parser invocations to parse a mere integer. It's
> easy to fall into these bad patterns, because some of them are the
> straight transposition of how operators are handled in CFGs. My own
> solution is to supply explicit parser combinators for parsing left- and
> right-associative expressions.
>
> What I'm getting at here is that if you map out the potential pitfalls
> that give bad performance and supply tools to replace them, the risk of
> accidentally incurring catastrophic performance drops a lot. Static
> (and potentially heuristic) analysis of grammars could also help a lot
> here.

I think it would be valuable if you could present these problematic
patterns in more detail; that would help users of PEG-based tools to
avoid them.
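If I understand the pattern correctly, it can be reproduced with a
minimal sketch like the one below (Python; the names parse_num and
make_level are mine, just for illustration). Each precedence level is
written in the form one would transcribe from a CFG, Level <- Sub op
Level / Sub, so whenever the first alternative fails past Sub, the
ordered choice re-parses Sub from scratch and the work doubles at every
level:

    # Bad backtracking pattern: Level <- Sub op Level / Sub,
    # run by a recursive-descent PEG parser without memoization.
    calls = 0

    def parse_num(s, pos):
        # Prim <- [0-9]; counts how often the leading primary is parsed.
        global calls
        calls += 1
        return pos + 1 if pos < len(s) and s[pos].isdigit() else None

    def make_level(sub, op):
        # Builds Level <- sub op Level / sub for one operator level.
        def level(s, pos):
            p = sub(s, pos)                  # first alternative: sub op Level
            if p is not None and p < len(s) and s[p] == op:
                q = level(s, p + 1)
                if q is not None:
                    return q
            return sub(s, pos)               # backtrack: re-parse sub from scratch
        return level

    # Stack a few precedence levels, as in the ~10 levels you mention for Java.
    expr = parse_num
    for op in "*+-<&^|":                     # 7 levels here
        expr = make_level(expr, op)

    expr("1", 0)
    print(calls)                             # 128, i.e. 2^7 calls for the token "1"

Memoizing the sub-parsers makes this linear again; the usual
grammar-level fix is iteration, Level <- Sub (op Sub)*, which parses the
leading Sub exactly once per level. I assume that is roughly what your
explicit associativity combinators provide.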
--
Sérgio