On Wed, Oct 24, 2001 at 09:06:14AM -0400, Aaron Sherman wrote:
On Tue, Oct 23, 2001 at 02:53:19PM +0200, Nadim Khemir wrote:
Don't we already have that in Perl 5?
if    ( /\G\s+/gc )     { # whitespace }
elsif ( /\G[*\/+-]/gc ) { # operator (the / must be escaped inside //-delimited regexes) }
elsif ( /\G\d+/gc )     { # term }
elsif ( /\G.+/gc )      { # unrecognized token }
Tad McClellan
The answer is NO, regexes and a lexer are totally different. I would
recommend that Tad study a bit more what parsing is before assuming it's just
about writing regexes. Having a lexer would let perl do some kinds of text
processing (raw lexing and parsing) at a much faster rate. If it is of some
interest I could benchmark a simple example.
So, aren't you saying, yes, but it would be slow? I can't think of
anything a lexer is capable of that I can't (and probably haven't) done
in Perl with relative ease.
Now, if you want a PARSER, that's a different matter, but a simple
lexical scanner is trivial to write in Perl with logic and regular
expressions.
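A minimal sketch of such a scanner, using the same \G-and-/gc technique as
the snippet quoted above (the token names and the tiny token set are my own
invention, just for illustration):

```perl
use strict;
use warnings;

# Walk the string with \G and /gc, collecting [type, text] pairs.
# The /c modifier keeps pos() from resetting on a failed match,
# so each elsif resumes at the same spot.
sub lex {
    my ($text) = @_;
    my @tokens;
    for ($text) {
        while ( ( pos() || 0 ) < length ) {
            if    (/\G\s+/gc)     { }                                   # skip whitespace
            elsif (/\G\d+/gc)     { push @tokens, [ NUM => $& ] }
            elsif (/\G[*\/+-]/gc) { push @tokens, [ OP  => $& ] }
            else  { /\G./gcs;       push @tokens, [ UNKNOWN => $& ] }
        }
    }
    return @tokens;
}

my @t = lex("3 + 4*12");
print join( " ", map { "$_->[0]($_->[1])" } @t ), "\n";
# NUM(3) OP(+) NUM(4) OP(*) NUM(12)
```

Each token type is a one-line test, and the whole thing is plain Perl that
can be profiled and tuned like any other code.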
In terms of speed, this approach is ideal because you can identify
which parts of your Perl code slow the lexer down, and re-code those
in C/XS. The best of all 2,384 worlds... that's Perl!
I have always found that the perl output from byacc (with a few tweaks)
generates a sufficient parser. The addition of a switch statement
will hopefully make it more efficient.
For a lexer I try to use a single regex with /g, but that does require the
text being parsed to be all in a single scalar, although that could be
worked around if needed.
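A sketch of that single-regex style, where each alternative captures a
different token class (the token classes here are mine for illustration,
not taken from Convert::ASN1):

```perl
use strict;
use warnings;

# One regex, repeated with /g in scalar context: \G anchors each match
# where the previous one left off, and which capture group is defined
# tells us the token class.
my $text = "len 4 * 12";
my @tokens;
while (
    $text =~ /\G \s* (?:
          (\d+)            # $1: number
        | ([A-Za-z_]\w*)   # $2: identifier
        | ([*\/+-])        # $3: operator
    )/gcx
) {
    if    (defined $1) { push @tokens, "NUM $1" }
    elsif (defined $2) { push @tokens, "IDENT $2" }
    else               { push @tokens, "OP $3" }
}
print "$_\n" for @tokens;
```

Because the whole scan is one regex match per token, the loop body only
runs once per token rather than once per failed alternative.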
For an example, take a look at Convert::ASN1
Graham.