The lexing step is usually sufficiently described by a list of (pattern,
action) pairs (maybe with prioritization):
    
    
    re"if\b": keywordIf
    re"[A-Za-z_]+[A-Za-z_0-9]*": identifier
    re"\d+": number
    
    
    

So it's not the regexes per se that are the problem. The problem is that the
typical regex library has no support for this setup at all and instead makes
you write `if|[A-Za-z_]+[A-Za-z_0-9]*|\d+`, with no way to tell which
alternative actually matched.
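
To make the (pattern, action) idea concrete, here is a minimal sketch in Python of a lexer driven by such a table. The names `RULES` and `tokenize` are made up for illustration, the whitespace rule is an added assumption, and "priority" is taken to mean "first rule in the list wins" (not longest match):

```python
import re

# Hypothetical rule table: (pattern, action) pairs; earlier entries win.
# Here an "action" is just a token kind; None means "skip this match".
RULES = [
    (re.compile(r"if\b"), "keywordIf"),
    (re.compile(r"[A-Za-z_]+[A-Za-z_0-9]*"), "identifier"),
    (re.compile(r"\d+"), "number"),
    (re.compile(r"\s+"), None),  # assumption: whitespace is skipped
]


def tokenize(text):
    """Return a list of (kind, lexeme) tuples; raise ValueError on unknown input."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Try each rule in priority order at the current position.
        for pattern, kind in RULES:
            m = pattern.match(text, pos)
            if m:
                if kind is not None:
                    tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
    return tokens
```

Because each rule is tried separately, the caller always knows which case matched, which is exactly the information the single combined alternation throws away. The cost of this naive loop is that every rule is retried at every position; a real lexer generator would typically compile the rules into one automaton.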

A different but related problem is that regexes are used not only for the
lexing step but also to extract data directly, skipping the parsing step,
which is terrible for correctness.
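
As a small illustration of the correctness issue (the CSV line is an invented example, not taken from any real data), compare pulling fields out with a regex against actually parsing the format, here in Python:

```python
import csv
import re

# Invented sample record: the second field contains a quoted comma.
line = 'widget,"Smith, John",42'

# Regex shortcut: split on commas, ignoring the quoting rules of the format.
naive = re.split(r",", line)

# Parsing: the csv module applies the quoting grammar while reading fields.
parsed = next(csv.reader([line]))

print(naive)   # the quoted field is cut in two
print(parsed)  # three fields, as intended
```

The regex version is not wrong as a regex; it simply answers a different question than "what are the fields of this record", and the difference only shows up on inputs nobody tested.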
