On 07-Jul-12 22:23, Andrei Alexandrescu wrote:
On 7/7/12 6:24 AM, Roman D. Boiko wrote:
On Saturday, 7 July 2012 at 09:06:57 UTC, Roman D. Boiko wrote:
http://stackoverflow.com/questions/11373644/performance-of-parsers-peg-vs-lalr1-or-llk
So far it looks like LALR parsers may have lower constant factors than
Packrat.
The difference could be minimized by paying attention to the parsing of terminal symbols, which was already in my plans. It is not necessary to strictly follow the Packrat parsing algorithm.
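To illustrate what I mean by paying attention to terminals: a terminal such as an identifier can be matched with a tight hand-written scan instead of being composed out of character-level PEG rules. A rough sketch (the names here are made up for illustration, this is not DCT or Pegged code):

import std.ascii : isAlpha, isAlphaNum;

struct Match
{
    bool hit;      // did the terminal match?
    size_t length; // number of characters consumed
}

// Scan an identifier ([a-zA-Z_][a-zA-Z0-9_]*) starting at `pos`
// in one pass, with no rule dispatch or memoization per character.
Match matchIdentifier(const(char)[] input, size_t pos)
{
    size_t i = pos;
    if (i < input.length && (input[i] == '_' || isAlpha(input[i])))
    {
        ++i;
        while (i < input.length && (input[i] == '_' || isAlphaNum(input[i])))
            ++i;
        return Match(true, i - pos);
    }
    return Match(false, 0);
}

A generated parser could emit something like this for each terminal and keep the PEG machinery for the nonterminals only.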
The benefits of Pegged, in my view, are its support for Parsing Expression Grammars (PEGs) and compile-time evaluation. It is easily extensible and modifiable.
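For instance, a Pegged grammar is mixed in as a string and the parser is generated at compile time, roughly along these lines (written from memory, so the details of the grammar syntax may be slightly off):

import pegged.grammar;

// The mixin runs the generator at compile time and produces
// a parser named after the grammar (here, Arithmetic).
mixin(grammar(`
Arithmetic:
    Expr   < Term (('+' / '-') Term)*
    Term   < Factor (('*' / '/') Factor)*
    Factor < Number / '(' Expr ')'
    Number <~ [0-9]+
`));

void main()
{
    auto tree = Arithmetic("1 + 2 * 3"); // returns a ParseTree
    assert(tree.successful);
}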
Isn't there also the fact that lexing and parsing are integrated? With a traditional LALR parser you need a separate tokenizer.
I'll have to point out that the whole point about integrated lexing is moot. It's more of a liability than a benefit. At the very least it's just an implementation curiosity, not an advantage.
When I implemented a recursive-descent parser by hand in one of the early drafts of DCT, I strongly felt the need to generalize the code in a way which, in retrospect, I would call PEG-like. The structure of my hand-written recursive-descent parser was a one-to-one mapping to the implemented subset of the D specification, and I considered that problematic, because the same structure had to be duplicated in the resulting AST.
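Roughly, what the duplication looked like (a hypothetical sketch, not actual DCT code): a rule such as AddExpr <- Number ('+' Number)* shows up once as the shape of the parsing function and once again as the shape of the AST node.

import std.ascii : isDigit;
import std.conv : to;

struct AddExpr { int[] operands; } // the rule's structure, mirrored in the AST

int parseNumber(const(char)[] s, ref size_t pos)
{
    size_t start = pos;
    while (pos < s.length && isDigit(s[pos]))
        ++pos;
    return to!int(s[start .. pos]);
}

AddExpr parseAddExpr(const(char)[] s, ref size_t pos) // the rule's structure, in code
{
    AddExpr node;
    node.operands ~= parseNumber(s, pos);
    while (pos < s.length && s[pos] == '+') // ('+' Number)*
    {
        ++pos;
        node.operands ~= parseNumber(s, pos);
    }
    return node;
}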
PEG is basically a language that describes both the implementation of the parser and the syntax of the language being parsed. It greatly reduces that implicit code duplication.
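With a PEG the same shape is written down once, e.g. a hypothetical rule like

AddExpr <- Number ('+' Number)*

and both the parsing code and the corresponding tree node can be derived from that single line.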
I think the generated code can be made almost as fast as a hand-written parser for a particular language (probably a few percent slower), especially if that language is similar to D (context-free, with a fine-grained hierarchical grammar). Optimizations might require giving up on strictly following any theoretical algorithm, but that should be OK.
All that sounds really encouraging. I'm really looking forward to more work in that area. If you stumble upon bugs that block you, let us know, and Walter has agreed to boost their priority.
Andrei
--
Dmitry Olshansky