Hi,

I've posted my final patch to adapt the re module to the py3k standards of
bytes/unicode separation.

Here is a short summary of the changes:
- mixing bytes and str patterns, search and replacement strings raises a
TypeError
- re.UNICODE and (?u) become almost no-ops: they are the default for unicode
strings, and forbidden for bytes strings
- re.ASCII and (?a) are introduced: for unicode strings, they request
old-style ASCII matching (for example, \d will match only [0-9] rather than
every unicode decimal digit); for bytes strings, ASCII matching is the only
available behaviour
- mixing re.UNICODE and re.ASCII is forbidden
- the stdlib is adapted so that (hopefully) all places which rely on ASCII
matching of unicode patterns don't break
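
To make the summary above concrete, here is a minimal sketch. It assumes the
behaviour of the re module as it ships in current Python 3 releases; the exact
exception types for the "forbidden" cases are not stated in this message, so
the ValueError cases below are an assumption taken from current CPython, and
the helper name `raised` is purely illustrative.

```python
import re

# ARABIC-INDIC DIGIT THREE: a unicode decimal digit outside [0-9].
ARABIC_THREE = "\u0663"

# str patterns are unicode-aware by default, so \d matches non-ASCII digits.
unicode_default = re.fullmatch(r"\d", ARABIC_THREE) is not None

# re.ASCII, or the inline (?a), restricts \d to [0-9] for str patterns.
ascii_flag = re.fullmatch(r"\d", ARABIC_THREE, re.ASCII) is not None
ascii_inline = re.fullmatch(r"(?a)\d", ARABIC_THREE) is not None

# bytes patterns only ever do ASCII matching.
bytes_digit = re.fullmatch(rb"\d", b"7") is not None


def raised(func, *args):
    """Return the name of the exception raised by func(*args), or None."""
    try:
        func(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# Mixing a bytes pattern with a str subject is rejected.
mixing = raised(re.search, b"a", "abc")

# re.UNICODE on a bytes pattern is rejected (ValueError in current CPython).
bytes_unicode = raised(re.compile, rb"\d", re.UNICODE)

# Combining re.ASCII and re.UNICODE is rejected (ValueError in current CPython).
both_flags = raised(re.compile, r"\d", re.ASCII | re.UNICODE)

print(unicode_default, ascii_flag, ascii_inline, bytes_digit)
print(mixing, bytes_unicode, both_flags)
```

Code that relies on \d, \w or \s meaning their ASCII ranges when applied to str
input would need re.ASCII or (?a) to keep its old behaviour.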

From the above description you might infer that we should deprecate re.UNICODE
and (?u). It's a possible decision but I think we should leave that to a later
patch. The status of re.LOCALE is another issue again.

The issue is at http://bugs.python.org/issue2834
and the patch can be reviewed at http://codereview.appspot.com/2439

Thanks

Antoine.


_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000