On 11/12/2011 20:27, Guido van Rossum wrote:
On Sun, Dec 11, 2011 at 12:12 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
I've just come across an omission in re.sub which I hadn't noticed
before.

In re.sub the replacement string can contain escape sequences, for
example:

>>> repr(re.sub(r"x", r"\n", "axb"))
"'a\\nb'"

However:

>>> repr(re.sub(r"x", r"\x0A", "axb"))
"'a\\\\x0Ab'"

Yes, it doesn't recognise "\xNN".
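
For illustration, the difference can be reproduced with a short script. This is only a sketch and not part of the original report. The function-based workaround is an assumption about what a caller might do in the meantime, and the try/except is there because some later Python versions reject an unknown escape such as \x in the replacement template instead of passing it through unchanged.

```python
import re

# "\n" in the replacement template is interpreted by re.sub itself.
newline_result = re.sub(r"x", r"\n", "axb")

# "\x0A" is not interpreted by the template parser. Depending on the
# Python version it is either left in the output verbatim (the behaviour
# reported above) or rejected with re.error.
try:
    hex_result = re.sub(r"x", r"\x0A", "axb")
except re.error:
    hex_result = None

# Workaround (an assumption, not from the thread): pass a function as the
# replacement. Its return value is inserted literally, so the escape is
# resolved by the ordinary (non-raw) Python string literal instead.
workaround_result = re.sub(r"x", lambda m: "\x0A", "axb")

print(repr(newline_result))
print(repr(hex_result))
print(repr(workaround_result))
```

With the function form, the template parser is bypassed entirely, which also means group references such as \1 are not expanded in the returned string.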

Is there a reason for this?

The regex module does the same, but is there any objection to me
fixing it in the regex module? (I'm thinking about compatibility
with re here.)

As long as there's a way to place a single backslash in the output
this seems fine to me, though I'm not sure it's important. Of course
it will likely break some test... the test will then have to be
fixed.
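
As a minimal sketch of the "single backslash in the output" requirement: a doubled backslash in the replacement template produces one backslash. The helper name literal_repl is hypothetical and is only used here to show how a caller can protect an arbitrary string from template processing.

```python
import re

# Two backslashes in the template yield one backslash in the output.
single = re.sub(r"x", r"\\", "axb")


def literal_repl(text):
    """Hypothetical helper: escape backslashes so that re.sub inserts
    `text` exactly as given, with no escape or group processing."""
    return text.replace("\\", "\\\\")


# The path contains "\n", which would otherwise be turned into a newline.
path_result = re.sub(r"x", literal_repl(r"C:\new"), "axb")

# A function replacement is another way to insert text literally.
func_result = re.sub(r"x", lambda m: "\\", "axb")
```

Any change to the set of recognised escapes keeps this property as long as the doubled backslash continues to be handled before the other escapes.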

I can't remember why we did this -- is there a full list of all the
escapes that re.sub() interprets somewhere? I thought it was pretty
limited. Maybe it's the related list of escapes that are supported
in regular expressions?

The documentation says: """That is, \n is converted to a single newline character, \r is converted to a linefeed, and so forth."""

All of the other escape sequences work as expected, except for \uNNNN
and \UNNNNNNNN which aren't supported at all in re.

I should probably also add \N{...} to the list for completeness.
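
To make the proposal concrete, here is a sketch of how \xNN, \uNNNN, \UNNNNNNNN and \N{...} could be expanded in a replacement template before it is handed to re.sub. It assumes Python 3, the name expand_escapes is hypothetical, and it is not how re or the regex module implement template parsing. All other escapes and group references are left for re.sub to process.

```python
import re
import unicodedata

_TEMPLATE_ESCAPE = re.compile(
    r"\\\\"                      # an escaped backslash: left untouched
    r"|\\x([0-9A-Fa-f]{2})"      # \xNN
    r"|\\u([0-9A-Fa-f]{4})"      # \uNNNN
    r"|\\U([0-9A-Fa-f]{8})"      # \UNNNNNNNN
    r"|\\N\{([^}]+)\}"           # \N{NAME}
)


def expand_escapes(template):
    """Hypothetical pre-processor for re.sub replacement templates."""

    def _expand(match):
        hex2, hex4, hex8, name = match.groups()
        digits = hex2 or hex4 or hex8
        if digits:
            char = chr(int(digits, 16))
        elif name:
            char = unicodedata.lookup(name)
        else:
            # Matched "\\\\": keep it so re.sub still sees an escaped backslash.
            return match.group(0)
        # If the escape denotes a backslash, re-escape it for the template.
        return char.replace("\\", "\\\\")

    return _TEMPLATE_ESCAPE.sub(_expand, template)


result = re.sub(r"x", expand_escapes(r"\x0A"), "axb")
```

Matching the doubled backslash first is what keeps a template such as r"\\x0A" meaning "a backslash followed by x0A" rather than a newline.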
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev