On Saturday, 7 July 2012 at 21:08:43 UTC, Dmitry Olshansky wrote:
You may have misunderstood me as well; the point is still the
same: there are 2 things, notation and implementation. The fact
that the lexer is integrated into the notation, as in PEGs, is
not related to the fact that PEGs in their classic definition
never use the term Token and essentially do backtracking parsing
at the character level.
But PEG itself doesn't require backtracking parsing, does it? So
that's an implementation detail, or a specific algorithm. Lexer
separation seems to be independent of this.
As for lexing multiple times, simply use a free list of
terminals (aka tokens). I still assume that the grammar is
properly defined, so that there is only one way to split
the source into tokens.
Tokens... there is no such term in use if we talk about 'pure'
PEG.
Terminal symbols.
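
To make the "lexing multiple times" point concrete, here is a rough sketch in Python. It is not literally a free list: it caches terminal matches by (name, position), which is the memoization idea behind packrat-style parsers, so a backtracking parser that asks for the same terminal at the same offset again gets the stored result instead of rescanning. The terminal names, the patterns and the tiny `sum` rule are all made up for illustration, and whitespace handling is left out.

```python
import re

# Hypothetical terminal definitions; names and patterns are illustrative only.
TERMINALS = {
    "NUMBER": re.compile(r"\d+"),
    "IDENT": re.compile(r"[A-Za-z_]\w*"),
    "PLUS": re.compile(r"\+"),
}


class TerminalCache:
    """Matches terminals on demand and remembers the result per (name, position)."""

    def __init__(self, source):
        self.source = source
        self.cache = {}   # (name, pos) -> end position, or None if no match
        self.scans = 0    # number of real scans performed, for demonstration

    def match(self, name, pos):
        key = (name, pos)
        if key not in self.cache:
            self.scans += 1
            m = TERMINALS[name].match(self.source, pos)
            self.cache[key] = m.end() if m else None
        return self.cache[key]


def parse_sum(cache, pos):
    """sum <- NUMBER PLUS NUMBER / NUMBER  (ordered choice, with backtracking)."""
    # First alternative: NUMBER PLUS NUMBER
    p = cache.match("NUMBER", pos)
    if p is not None:
        q = cache.match("PLUS", p)
        if q is not None:
            r = cache.match("NUMBER", q)
            if r is not None:
                return r
    # Second alternative: NUMBER. This asks for NUMBER at `pos` again,
    # but the answer comes from the cache rather than a new scan.
    return cache.match("NUMBER", pos)
```

With the input "12+x" the first alternative fails at the last NUMBER, the parser backtracks to the start, and the second alternative reuses the cached NUMBER match, so the scan counter does not grow. Whether the cached entries are called tokens or terminal symbols makes no difference to the mechanism.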