[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset 95fc8e687c487ecf97f4b1b98dfc0c05e3c9cbff by Serhiy Storchaka in 
branch '3.7':
[3.7] bpo-28450: Fix and improve the documentation for unknown escapes in RE. 
(GH-11920). (GH-12029)
https://github.com/python/cpython/commit/95fc8e687c487ecf97f4b1b98dfc0c05e3c9cbff


--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
pull_requests: +12060




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-25 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset a180b007d96fe68b32f11dec720fbd0cd5b6758a by Serhiy Storchaka in 
branch 'master':
bpo-28450: Fix and improve the documentation for unknown escapes in RE. 
(GH-11920)
https://github.com/python/cpython/commit/a180b007d96fe68b32f11dec720fbd0cd5b6758a


--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2019-02-18 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
keywords: +patch
pull_requests: +11945
stage: needs patch -> patch review




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2017-11-16 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Barry, could you please improve the documentation about unknown escape
sequences in regular expressions? My skills are not enough for this.

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-12-06 Thread Ned Deily

Ned Deily added the comment:

Note that 1b162d6e3d01 in Issue27030 (for 3.6.0rc1) has changed the behavior
for re.sub replacement templates: unknown escapes now produce a deprecation
warning in 3.6, while still being treated as an error in 3.7.

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think we should discuss this on Python-Dev.

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-28 Thread Ned Deily

Ned Deily added the comment:

Where do we stand on this issue?  At the moment, 3.6.0 is on track to be 
released as is.

--
nosy: +ned.deily




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Emanuel Barry

Changes by Emanuel Barry:


--
nosy: +ebarry




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Barry A. Warsaw

Barry A. Warsaw added the comment:

On Nov 22, 2016, at 07:28 PM, Serhiy Storchaka wrote:

>The reason for disallowing some undefined escapes is the same as in pattern
>strings: this would allow us to introduce new special escape sequences.

I'll note that technically speaking, you can still introduce new escapes for
repl without breaking the documented contract.  All the docs say is that
"unknown escapes such as \& are left alone", but that doesn't list what are
unknown escapes.  So if new escapes are added in Python 3.7, and they are
transformed in repl, that would be allowed.

I'll also note that not *all* unknown sequences are rejected now, only
backslashes followed by an ASCII letter.  So \& is still probably left alone,
while \s is now rejected.  That does add to the confusion, although the
deprecation note in the re.sub() documentation does document the new behavior
correctly.
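
The distinction drawn above can be checked with a short sketch. It assumes
Python 3.7 or later, where (per the later comments in this thread) a backslash
followed by an unassigned ASCII letter in repl is an error rather than a
warning. The variable names are illustrative only.

```python
import re

# Backslash + punctuation is not a defined escape, so it is left alone.
kept = re.sub("a", r"\&", "a")

# A defined escape: the two-character template r"\n" becomes a newline.
newline = re.sub("a", r"\n", "a")

# Backslash + an ASCII letter with no assigned meaning in repl is rejected.
try:
    re.sub("a", r"\s", "a")
    rejected = False
except re.error:
    rejected = True

print(repr(kept), repr(newline), rejected)
```

On 3.5 and 3.6 the last call would be expected to warn instead of raising, so
`rejected` would come out differently there.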

On Nov 22, 2016, at 07:55 PM, R. David Murray wrote:

>There is still the argument that we shouldn't break 2.7 compatibility
>unnecessarily until 2.7 is out of maintenance.  That is: warnings are good,
>removals are bad.  (I haven't read through this issue, so I may be off base.)

This is also a reasonable argument, but not one I've thought about since I'm
using Python 2 only rarely these days.

On Nov 22, 2016, at 07:34 PM, Serhiy Storchaka wrote:

>If you insist I could revert converting warnings to errors (only in
>replacement string or all?) in 3.6.

pattern is a regular expression string so it already follows the syntax as
described in §6.2.1 Regular Expression Syntax.  But I think a reading of that
section (and the "special sequences" bit that follows) could also argue that
unknown escapes shouldn't throw an error.

>But I think they should be left as errors in 3.7. The earlier we make
>undefined escapes errors, the earlier we can define new special escape
>sequences without confusing users. It is bad if the escape sequence is valid
>in two Python versions but has different meaning.

Perhaps so, but I do think this is a tricky question from a compatibility
point of view.  One possible option, although it's late in the cycle, would
be to introduce a new flag so the user could tell re exactly what behavior
they want.  The default would have to be backward compatible (i.e. leave
unknown sequences alone), but there could be say an re.STRICTESCAPES flag that
would cause the error to be thrown.
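
re.STRICTESCAPES is only a proposal in this message; no such flag exists in
the re module. As a rough illustration of the "leave unknown sequences alone"
behavior on versions that raise an error, the sketch below pre-processes repl
with a hypothetical helper, keep_unknown_escapes(). The helper name and the
set of letters treated as known are assumptions made for this example.

```python
import re

# Letters that have a meaning after a backslash in a replacement template:
# the standard character escapes plus \g<name>. This set is an assumption
# based on the documented template syntax.
_KNOWN_LETTERS = frozenset("abfnrtvg")


def keep_unknown_escapes(repl):
    """Hypothetical helper: double the backslash in front of ASCII letters
    that the template syntax does not define, so they come out as literal
    text instead of raising re.error."""
    def fix(match):
        char = match.group(1)
        if char.isascii() and char.isalpha() and char not in _KNOWN_LETTERS:
            return "\\\\" + char  # two backslashes, then the letter
        return match.group(0)     # known escape, digit, or punctuation

    # A callable replacement is used as-is, so no template parsing (and no
    # escape checking) happens on the value returned by fix().
    return re.sub(r"\\(.)", fix, repl, flags=re.DOTALL)


lenient = re.sub("a", keep_unknown_escapes(r"\s"), "a")
print(repr(lenient))
```

Scanning the template pairwise means an already escaped backslash (`\\`) is
consumed as one unit and never mistaken for the start of an unknown escape.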

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Matthew Barnett

Matthew Barnett added the comment:

@Barry: repl already supports some escapes, e.g. \g for named groups, 
although not \xXX et al, so deprecating unknown escapes like in the pattern 
makes sense to me.

BTW, the regex module already supports \xXX, \N{XXX}, etc.
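
A small sketch of the asymmetry described here, assuming Python 3.7 or later
and the standard re module (not the third-party regex module). The sample
date string and group names are made up for the example.

```python
import re

text = "2016-11-22"

# \g<name> is a defined escape in repl and refers to a named group.
swapped = re.sub(
    r"(?P<y>\d+)-(?P<m>\d+)-(?P<d>\d+)",
    r"\g<d>/\g<m>/\g<y>",
    text,
)

# \xXX is understood inside a pattern...
pattern_hex_ok = re.fullmatch(r"\x41", "A") is not None

# ...but not inside repl, where \x counts as an unknown letter escape.
try:
    re.sub("a", r"\x41", "a")
    repl_hex_ok = True
except re.error:
    repl_hex_ok = False

print(swapped, pattern_hex_ok, repl_hex_ok)
```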

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The reason for disallowing some undefined escapes is the same as in pattern 
strings: this would allow us to introduce new special escape sequences. For
example:

* \N{...} for named character escape.
* Perl and extended PCRE use \L and \U for lowercasing and uppercasing the
replacement. \U is already used for another purpose, but you get the idea.

Of course the need for new special escape sequences in template strings is
much less than in pattern strings.
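
For comparison, the pattern-side rule mentioned at the start of this comment
can be sketched as follows. It assumes Python 3.6 or later, and it assumes
that \q has no assigned meaning in patterns, which a future version could
change. The compiles() helper is written only for this example.

```python
import re


def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


digit_ok = compiles(r"\d")        # defined special sequence
punct_ok = compiles(r"\&")        # backslash + punctuation: a literal &
unknown_ok = compiles(r"\q")      # backslash + unassigned ASCII letter

print(digit_ok, punct_ok, unknown_ok)
```

Reserving every backslash-plus-letter combination is what leaves room to give
one of them a meaning later without silently changing existing patterns.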

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The deprecation was documented in 3.5.

https://docs.python.org/3.5/library/re.html#re.sub

Deprecated since version 3.5, will be removed in version 3.6: Unknown escapes 
consist of '\' and ASCII letter now raise a deprecation warning and will be 
forbidden in Python 3.6.

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Barry A. Warsaw

Barry A. Warsaw added the comment:

I disagree that the documentation is at fault.  This is known to break existing 
code, e.g. http://bugs.python.org/msg281496

I think it's not correct to change the documentation but leave the 
error-raising behavior for 3.6 because the deprecation was never documented in 
3.5 so this will look like a gratuitous regression.  issue27030 for reference.

I also question whether it makes sense for such escapes to be illegal in the
repl argument of re.sub().  I could understand this limitation for the
pattern argument, but that's not what's causing the error.

--
nosy: +barry




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-11-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Maybe just remove the phrase "Unknown escapes such as \& are left alone"?

--




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-10-16 Thread Serhiy Storchaka

Changes by Serhiy Storchaka:


--
components: +Regular Expressions
nosy: +ezio.melotti




[issue28450] Misleading/inaccurate documentation about unknown escape sequences in regular expressions

2016-10-16 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thank you for your report, Lele. Agreed, the documentation looks misleading.

Would you like to propose clearer wording?

--
nosy: +Rosuav, mrabarnett, nedbat, serhiy.storchaka
stage:  -> needs patch
title: Misleading/inaccurate documentation about unknown escape sequences -> 
Misleading/inaccurate documentation about unknown escape sequences in regular 
expressions
type:  -> enhancement
versions: +Python 3.5, Python 3.7
