Matthew Barnett <pyt...@mrabarnett.plus.com> added the comment:

issue2636-20090726.zip is a new implementation of the re engine. It replaces re.py, sre.py, sre_constants.py, sre_parse.py and sre_compile.py with a new re.py, and replaces sre_constants.h, sre.h and _sre.c with _re.h and _re.c.

The internal engine no longer interprets a form of bytecode but instead follows a linked set of nodes. It can work breadth-wise as well as depth-first, which makes it perform much better when faced with one of those 'pathological' regexes.

It supports scoped flags, variable-length lookbehind, Unicode properties, named characters, atomic groups and possessive quantifiers, and it handles zero-width splits correctly when the ZEROWIDTH flag is set.

There are a few more things to add, such as allowing indexing for capture groups, and further speed improvements might be possible (at worst it's roughly the same speed as the existing re module).

I'll be adding some documentation about how it works and the slight differences in behaviour later.

----------
Added file: http://bugs.python.org/file14570/issue2636-20090726.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________
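
To illustrate what "following a linked set of nodes breadth-wise" can mean, here is a minimal sketch in Python. It is not the code from the attached zip: the Node class, the function names and the hand-built graph for (a*)*b are all hypothetical and only show the general technique. All live nodes are advanced together, one input character at a time, so a nested-repeat pattern is matched in time proportional to the input length instead of by exhaustive backtracking.

```python
class Node:
    """One node in the linked graph (hypothetical layout, for illustration).

    kind is 'char' (consume one character), 'split' (two branches, consumes
    nothing) or 'match' (accepting node).
    """
    def __init__(self, kind, char=None):
        self.kind = kind
        self.char = char
        self.next = None  # successor of 'char'; first branch of 'split'
        self.alt = None   # second branch of 'split'


def follow(node, states):
    """Add node to states, expanding 'split' nodes without consuming input.

    The membership check also stops empty loops from recursing forever.
    """
    stack = [node]
    while stack:
        n = stack.pop()
        if n is None or n in states:
            continue
        states.add(n)
        if n.kind == 'split':
            stack.append(n.next)
            stack.append(n.alt)


def fullmatch_breadthwise(start, text):
    """Return True if the whole of text matches the graph rooted at start."""
    current = set()
    follow(start, current)
    for ch in text:
        following = set()
        for n in current:
            if n.kind == 'char' and n.char == ch:
                follow(n.next, following)
        current = following
        if not current:
            return False  # no live nodes left, so no match is possible
    return any(n.kind == 'match' for n in current)


def build_nested_star_then_b():
    """Hand-build the graph for (a*)*b, a classic 'pathological' pattern."""
    outer = Node('split')   # outer repeat: enter the group or go on to 'b'
    inner = Node('split')   # inner repeat: take another 'a' or leave the group
    a = Node('char', 'a')
    b = Node('char', 'b')
    match = Node('match')
    outer.next, outer.alt = inner, b
    inner.next, inner.alt = a, outer
    a.next = inner
    b.next = match
    return outer


start = build_nested_star_then_b()
print(fullmatch_breadthwise(start, 'a' * 30 + 'b'))  # True
print(fullmatch_breadthwise(start, 'a' * 30))        # False, found quickly
```

The second call is the interesting one: a purely depth-first matcher may retry every way of dividing the run of 'a's between the two repeats before giving up, while the set of live nodes here never holds more than the five nodes of the graph.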