On Sat, 04 Jul 2009 16:35:08 -0700, John Nagle wrote:

> The temptation is to write tokenizers in C, but that's an admission
> of language design failure.

The only part that really needs to be written in C is the DFA loop. The code to construct the state table from regexps could be written entirely in Python, but I don't see any advantage to doing so.
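
For anyone unsure what "the DFA loop" refers to, here is a rough pure-Python sketch of a table-driven tokenizer. Everything in it is made up for illustration: the names char_class, TRANSITIONS, ACCEPTING and tokenize are hypothetical, and the state table is written by hand for two token kinds rather than constructed from regexps. The inner while loop is the piece that would be the candidate for C.

```python
# Hypothetical hand-built DFA recognising two token kinds, NAME and NUMBER.
# State 0 is the start state; a missing table entry means "no transition".

def char_class(ch):
    """Map a character to a coarse input class (an assumption of this sketch)."""
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isdigit():
        return "digit"
    return "other"

TRANSITIONS = {
    (0, "alpha"): 1,   # start of a NAME
    (0, "digit"): 2,   # start of a NUMBER
    (1, "alpha"): 1,   # NAME continues with letters or underscores
    (1, "digit"): 1,   # NAME may also contain digits after the first char
    (2, "digit"): 2,   # NUMBER continues with digits
}
ACCEPTING = {1: "NAME", 2: "NUMBER"}

def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    n = len(text)
    while pos < n:
        if text[pos].isspace():
            pos += 1
            continue
        state = 0
        last_accept = None
        i = pos
        # The DFA loop: one table lookup per input character.
        while i < n:
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPTING:
                # Remember the longest match seen so far.
                last_accept = (ACCEPTING[state], i)
        if last_accept is None:
            raise ValueError("unexpected character %r at position %d"
                             % (text[pos], pos))
        kind, end = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

Calling tokenize("foo 42 bar9") walks the table once per character and yields a NAME, a NUMBER and another NAME. The per-character dictionary lookup and function call in the inner loop are where an interpreted implementation spends its time, which is presumably why that loop is the part singled out for C.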