On 08-Jul-12 00:09, Andrei Alexandrescu wrote:
On 7/7/12 3:17 PM, Dmitry Olshansky wrote:
I'll have to point out that the whole point about integrated lexing is
moot. It's more of a liability than a benefit. At the very least it's just
an implementation curiosity, not an advantage.

Interesting. I'll have to ask for more details on why.


I think I said this before, but anyway the main point is:
the reason a lexer exists is performance.

It's obvious that one can write a grammar without ever using a lexer. If the notation is good, say regex or the one used in PEG (which turns out to be not quite the same thing), then tokens can easily be described in the grammar itself. Once we do have one, a lexer gives us only 2 things:
- manageability: it's just splitting the grammar in 2 parts that can be maintained separately (in fact some "lexers" are neither DFAs nor regular); sometimes it goes in the opposite direction - it can be better to have one integrated grammar
- speed: the reason it can be faster is that the lexer gets to use a highly deterministic grammar.
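
To make the speed point concrete, here is a rough sketch (just an illustration, the token kinds and names are made up) of what a deterministic lexer does: every token is decided by a single forward scan, no backtracking.

    import std.ascii : isAlpha, isAlphaNum, isDigit, isWhite;

    enum TokKind { identifier, number, symbol, eof }

    struct Token { TokKind kind; string text; }

    // Single-pass lexing: look at the current character once and commit.
    Token nextToken(ref string input)
    {
        while (input.length && isWhite(input[0]))
            input = input[1 .. $];
        if (!input.length)
            return Token(TokKind.eof, "");

        size_t i = 1;
        if (isAlpha(input[0]) || input[0] == '_')
        {
            while (i < input.length && (isAlphaNum(input[i]) || input[i] == '_'))
                ++i;
            auto tok = Token(TokKind.identifier, input[0 .. i]);
            input = input[i .. $];
            return tok;
        }
        if (isDigit(input[0]))
        {
            while (i < input.length && isDigit(input[i]))
                ++i;
            auto tok = Token(TokKind.number, input[0 .. i]);
            input = input[i .. $];
            return tok;
        }
        auto tok = Token(TokKind.symbol, input[0 .. 1]);
        input = input[1 .. $];
        return tok;
    }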

Now back to PEGs and packrats. What I see everywhere I look is essentially the integration of a backtracking-NFA lexer, which is neither fast nor particularly convenient. The notation is a whole other matter.
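
For instance, this is roughly what a generic PEG engine ends up doing for a token-level rule like Keyword <- "while" / "for" / "if" (a sketch for illustration, assuming the usual ordered-choice semantics): each alternative restarts from the same position, i.e. a backtracking lexer hidden inside the parser.

    bool matchLiteral(string input, ref size_t pos, string lit)
    {
        if (pos + lit.length <= input.length && input[pos .. pos + lit.length] == lit)
        {
            pos += lit.length;
            return true;
        }
        return false;
    }

    // Keyword <- "while" / "for" / "if"
    bool matchKeyword(string input, ref size_t pos)
    {
        immutable saved = pos;
        foreach (lit; ["while", "for", "if"])
        {
            pos = saved;    // backtrack, then try the next alternative
            if (matchLiteral(input, pos, lit))
                return true;
        }
        pos = saved;
        return false;
    }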

So my view on the matter: whether or not the lexer is integrated is a two-fold question: notation vs. implementation.

E.g. it may make regular, lexer-like things a part of the notation, thus having rules like:
        Identifier  -> Alpha (Alpha|Number|_)*

Yet it may or may not use the same engine for _all_ _rules_; in fact, if the implementation optimizes things by ripping out the regular parts of the grammar into small DFA engines, it would end up with some kind of lexer behind the scenes!
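
E.g. the Identifier rule above is regular, so it could be compiled down to a 2-state DFA and run separately from the general engine (a rough sketch of the idea, the state encoding is made up):

    import std.ascii : isAlpha, isDigit;

    // Identifier -> Alpha (Alpha|Number|_)*  as a tiny DFA.
    // States: 0 = start, 1 = inside an identifier, -1 = reject.
    int step(int state, char c)
    {
        switch (state)
        {
            case 0:  return isAlpha(c) ? 1 : -1;
            case 1:  return (isAlpha(c) || isDigit(c) || c == '_') ? 1 : -1;
            default: return -1;
        }
    }

    // Longest match: run the DFA until it rejects, accept if we ended in state 1.
    size_t matchIdentifier(string input)
    {
        int state = 0;
        size_t len = 0;
        foreach (i, c; input)
        {
            immutable next = step(state, c);
            if (next < 0)
                break;
            state = next;
            len = i + 1;
        }
        return state == 1 ? len : 0;
    }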

Anyway, highly hybrid parsers are all the rage now, and the reasons are obvious - they run fast and can be bent by some magic to fit the quite convoluted grammars of modern languages ;)

--
Dmitry Olshansky

