New submission from Serhiy Storchaka:

Regular expressions use the backslash character for two functions:
1) to indicate special forms;
2) to allow special characters to be used without invoking their special 
meaning.

If backslash + character is not recognized as a special form (1), it is 
interpreted according to meaning (2).
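
A minimal sketch of the two roles, plus the fallback just described. The 
variable names are illustrative only, and re.fullmatch assumes Python 3.4 or 
later:

```python
import re

# (1) Special form: \d stands for "any decimal digit", not the letter d.
digit_is_special = (
    re.fullmatch(r"\d", "7") is not None
    and re.fullmatch(r"\d", "d") is None
)

# (2) Escaped special character: \. matches only a literal dot,
# whereas an unescaped . matches any character except a newline.
dot_is_literal = (
    re.fullmatch(r"\.", ".") is not None
    and re.fullmatch(r"\.", "x") is None
    and re.fullmatch(r".", "x") is not None
)

# Fallback: \@ is not a special form, so it is read as a literal @.
at_is_literal = re.fullmatch(r"\@", "@") is not None

print(digit_is_special, dot_is_literal, at_is_literal)
```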

Usually new special forms have the form backslash + ASCII letter, because 
unlike other characters, single ASCII letters do not have special meaning in 
any regular expression engine or programming language. This makes using the 
backslash with an unrecognized ASCII letter dangerous. Currently it means just 
this letter literally, but in the future it can become a special form. For 
example, the \u and \U forms were added in 3.3, and this could break regular 
expression patterns that used \u and \U before.
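
The \u case can be checked directly. This is a sketch, assuming Python 3.4 or 
later (for re.fullmatch), and the variable names are illustrative only:

```python
import re

pattern = r"\u0041"  # raw string: the regex engine sees backslash, u, 0041

# Since Python 3.3 the re module reads \uXXXX as one code point (U+0041 is "A").
matches_code_point = re.fullmatch(pattern, "A") is not None

# Under the older reading, \u meant a literal "u", so the same pattern
# would have matched the five-character text "u0041" instead.
matches_old_reading = re.fullmatch(pattern, "u0041") is not None

print(matches_code_point, matches_old_reading)
```

On 3.3 and later the first check succeeds and the second fails, so the same 
pattern text silently changed meaning.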

To avoid possible breakage, it makes sense to reject unrecognized backslash + 
ASCII letter sequences. The proposed patch adds deprecation warnings when an 
unknown escape of an ASCII letter is used. The idea was proposed by Matthew 
Barnett [1].

[1] http://permalink.gmane.org/gmane.comp.python.devel/151657
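
One way to probe how a given interpreter treats such escapes is sketched 
below. classify_escape is a hypothetical helper, not part of the patch or of 
the re module. It assumes that a deprecation shows up as a DeprecationWarning 
during re.compile and that an outright rejection shows up as re.error:

```python
import re
import warnings


def classify_escape(pattern):
    """Return 'accepted', 'deprecated', or 'rejected' (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        re.purge()  # drop cached patterns so compilation really runs again
        try:
            re.compile(pattern)
        except re.error:
            return "rejected"
    if any(issubclass(w.category, DeprecationWarning) for w in caught):
        return "deprecated"
    return "accepted"


for candidate in (r"\d", r"\@", r"\q"):
    print(candidate, classify_escape(candidate))
```

With the patch applied, r"\q" would be expected to report "deprecated". An 
interpreter that already treats unknown letter escapes as errors reports 
"rejected", and one without the patch reports "accepted". Known forms such as 
\d and escaped punctuation such as \@ should stay "accepted" in every case.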

----------
assignee: serhiy.storchaka
components: Library (Lib), Regular Expressions
files: re_deprecate_escaped_letters.patch
keywords: patch
messages: 237645
nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Deprecate unrecognized backslash+letter escapes
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file38406/re_deprecate_escaped_letters.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23622>
_______________________________________