Re: Composition of regexps (Was re: [9fans] regular expressions in plan9 different from the ones in unix?)

Russ Cox Fri, 23 Feb 2007 09:41:24 -0800

Lex has three benefits:

1) You don't have to write the lexer directly.
2) What you do have to write is fairly concise.
3) The resulting lexer is fairly efficient.


It has two main drawbacks:

4) The input model does not always match your
own program's input model, creating a messy interface.
5) Once you need more than regular expressions,
lexers written with state variables and such can get
very opaque very fast.

Many on this list would argue that (1) and (2) do not
outweigh (4) and (5), instead suggesting that writing a
lexer by hand is not too difficult and ends up being
more maintainable than a lex spec in the long run.
And of course, for a well-written by-hand lexer,
you get to keep (3).
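
To make the by-hand option concrete, here is a minimal sketch of such a
lexer in C. It is only an illustration, not code from any Plan 9 compiler:
the token kinds (TEOF, TIDENT, TNUMBER, TPUNCT), the Token and Lexer
structs, and the functions lexinit and lexnext are all hypothetical names
chosen for this example, and the input is assumed to be a NUL-terminated
string held in memory.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy language; not from the original post. */
enum { TEOF, TIDENT, TNUMBER, TPUNCT };

typedef struct Token Token;
struct Token {
	int kind;
	const char *start;	/* first byte of the token in the input */
	size_t len;		/* length of the token in bytes */
};

typedef struct Lexer Lexer;
struct Lexer {
	const char *p;	/* next unread byte; input is assumed NUL-terminated */
};

void
lexinit(Lexer *lx, const char *input)
{
	assert(lx != NULL && input != NULL);
	lx->p = input;
}

/* Return the next token; at end of input, keeps returning TEOF. */
Token
lexnext(Lexer *lx)
{
	Token t;

	/* Skip white space between tokens. */
	while(isspace((unsigned char)*lx->p))
		lx->p++;

	t.start = lx->p;
	if(*lx->p == '\0'){
		t.kind = TEOF;
	}else if(isalpha((unsigned char)*lx->p) || *lx->p == '_'){
		t.kind = TIDENT;
		while(isalnum((unsigned char)*lx->p) || *lx->p == '_')
			lx->p++;
	}else if(isdigit((unsigned char)*lx->p)){
		t.kind = TNUMBER;
		while(isdigit((unsigned char)*lx->p))
			lx->p++;
	}else{
		/* Anything else is treated as one-byte punctuation. */
		t.kind = TPUNCT;
		lx->p++;
	}
	t.len = (size_t)(lx->p - t.start);
	return t;
}
```

The caller decides where the bytes come from and simply calls lexnext
in a loop until it sees TEOF, which is how a sketch like this sidesteps (4).
Anything beyond regular expressions, such as nested comments, becomes
ordinary C control flow rather than lex start states, which is the
argument about (5).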

Creating new entry hooks in the regexp library doesn't
preserve (1), (2), or (3).  And if much of your time is
spent in lexical analysis (as Ken claimed was true for
the Plan 9 compilers), losing (3) is a big deal.
So that does not seem like a very good replacement for lex.

All that said, lex has been used to write a lot of C
compilers, and can be used in that context without
running into much of (4) or (5).  Why not just use lex here?

Russ
