On 2/7/2018 1:07 PM, Nathan S. wrote:
On Tuesday, 6 February 2018 at 22:29:07 UTC, Walter Bright wrote:
nobody uses regex for lexer in a compiler.

Some years ago I was surprised when I saw this in Clojure's source code. It appears to still be there today:


static Pattern symbolPat = Pattern.compile("[:]?([\\D&&[^/]].*/)?(/|[\\D&&[^/]][^/]*)");
//static Pattern varPat = Pattern.compile("([\\D&&[^:\\.]][^:\\.]*):([\\D&&[^:\\.]][^:\\.]*)");
//static Pattern intPat = Pattern.compile("[-+]?[0-9]+\\.?");
static Pattern intPat =
static Pattern ratioPat = Pattern.compile("([-+]?[0-9]+)/([0-9]+)");
static Pattern floatPat = Pattern.compile("([-+]?[0-9]+(\\.[0-9]*)?([eE][-+]?[0-9]+)?)(M)?");

Yes, I'm sure somebody does it. And now that regex has produced a match, you have to scan it again to turn it into a number, making for slow lexing. And if regex doesn't produce a match, you get a generic error message rather than something specific like "character 'A' is not allowed in a numeric literal".
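
To make the contrast concrete, here is a minimal sketch of the hand-written alternative, in Java to match the snippet quoted above. It is not taken from Clojure or from any real compiler; the names IntLiteralScanner, LexError and scanInt are made up for illustration, and it handles only unsigned decimal integers. The value is accumulated during the one and only scan, and a bad character is reported by name:

```java
// Illustrative sketch only: all names here are hypothetical.
final class IntLiteralScanner {
    /** Hypothetical error type carrying a specific diagnostic. */
    static class LexError extends RuntimeException {
        LexError(String msg) { super(msg); }
    }

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    /**
     * Scans an unsigned decimal integer literal starting at src[pos]
     * and returns its value. The digits are converted as they are read,
     * so the text is never scanned a second time.
     */
    static long scanInt(String src, int pos) {
        int i = pos;
        if (i >= src.length() || !isDigit(src.charAt(i)))
            throw new LexError("expected a digit at offset " + i);

        long value = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (isDigit(c)) {
                try {
                    // Accumulate the value while scanning; the *Exact
                    // methods throw ArithmeticException on overflow.
                    value = Math.addExact(Math.multiplyExact(value, 10L), c - '0');
                } catch (ArithmeticException e) {
                    throw new LexError("integer literal is too large");
                }
            } else if (Character.isLetter(c) || c == '_') {
                // The scanner knows exactly which character is wrong.
                throw new LexError("character '" + c
                        + "' is not allowed in a numeric literal");
            } else {
                break; // any other character ends the literal
            }
            i++;
        }
        return value;
    }
}
```

With this sketch, scanInt("123;", 0) yields 123 without a second pass over the text, and scanInt("12A4", 0) fails with the specific message rather than a generic "no match".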

(Generic error messages are one of the downsides of using tools like lex and 

