The lexing step is usually sufficiently described by a list of (pattern, action) pairs, possibly with prioritization:

    re"if\b": keywordIf
    re"[A-Za-z_]+[A-Za-z_0-9]*": identifier
    re"\d+": number
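To make the (pattern, action) idea concrete, here is a minimal sketch in Python rather than the pseudo-syntax above. The names `RULES`, `WHITESPACE` and `tokenize` are hypothetical, and skipping whitespace between tokens is my own assumption, not part of the rule list. Priority is simply list order: the first rule that matches at the current position wins.

```python
import re

# Hypothetical rule table mirroring the (pattern, action) list.
# Earlier entries take priority over later ones.
RULES = [
    (re.compile(r"if\b"), "keywordIf"),
    (re.compile(r"[A-Za-z_]+[A-Za-z_0-9]*"), "identifier"),
    (re.compile(r"\d+"), "number"),
]

# Assumption: whitespace separates tokens and is discarded.
WHITESPACE = re.compile(r"\s+")


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on unknown input."""
    tokens = []
    pos = 0
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            pos = ws.end()
            continue
        # Try each rule in priority order, anchored at the current position.
        for pattern, kind in RULES:
            m = pattern.match(text, pos)
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
    return tokens
```

With this, `tokenize("if x1 42")` yields a keyword, an identifier and a number, while `tokenize("iffy")` yields a single identifier because `\b` keeps the keyword rule from matching a prefix.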
So it's not the regexes per se that are the problem. The problem is that the typical regex library has no support for this setup at all and instead makes you write `if|[A-Za-z_]+[A-Za-z_0-9]*|\d+`, with no way to tell which alternative matched. A different but related problem is that regexes are used not only for the lexing step but also to extract data directly, skipping the parsing step entirely, which is terrible for correctness.
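A small Python sketch of the combined-alternation problem, with illustrative names (`COMBINED`, `TAGGED`) that are not from any library. The first pattern hands back only the matched text, so the caller has to re-inspect each lexeme to learn what kind it is. Python's `re` does expose `Match.lastgroup`, which can recover the case when every alternative is wrapped in a named group; I am not assuming other regex libraries offer an equivalent.

```python
import re

# The single alternation from the text: matches come back as bare strings.
COMBINED = re.compile(r"if|[A-Za-z_]+[A-Za-z_0-9]*|\d+")
lexemes = [m.group() for m in COMBINED.finditer("if x1 42")]

# Python-specific workaround: name each alternative and read Match.lastgroup.
TAGGED = re.compile(
    r"(?P<keywordIf>if\b)"
    r"|(?P<identifier>[A-Za-z_]+[A-Za-z_0-9]*)"
    r"|(?P<number>\d+)"
)
tagged = [(m.lastgroup, m.group()) for m in TAGGED.finditer("if x1 42")]
```

Here `lexemes` is just a list of strings, whereas `tagged` pairs each string with the name of the alternative that produced it. The pattern has become noticeably harder to read, though, and the actions still live somewhere else.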