[issue27364] Deprecate invalid unicode escape sequences

2016-09-03 Thread Martin Panter

Martin Panter added the comment:

Left some comments for invalid_stdlib_escapes_2.patch

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Emanuel Barry

Emanuel Barry added the comment:

Thanks Serhiy; it does look better to me too!

--
Added file: 
http://bugs.python.org/file44322/deprecate_invalid_escapes_both_3.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think "invalid escape sequence '\?'" would look cleaner than "invalid escape 
sequence '?'".

--




[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Emanuel Barry

Emanuel Barry added the comment:

Ping. I'd like to get this merged in time for 3.6. Is there anything I can do 
to speed up the review?

Since the change itself is very straightforward, I think it would make sense 
to merge it now and then fix the invalid escapes that are found during the beta 
phase.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-08-23 Thread John Mark Vandenberg

Changes by John Mark Vandenberg :


--
nosy: +jayvdb




[issue27364] Deprecate invalid unicode escape sequences

2016-08-14 Thread Emanuel Barry

Changes by Emanuel Barry :


Added file: 
http://bugs.python.org/file44108/deprecate_invalid_escapes_both_2.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-08-14 Thread Emanuel Barry

Emanuel Barry added the comment:

Here's a new pair of patches for this. There are some small tweaks to the 
tests, and I properly fixed all instances of invalid escapes (I also made some 
strings into raw-strings at some places where it's not needed, solely for 
consistency with surrounding lines or functions). The patch that fixes the 
invalid escapes is four times larger than the previous one.

I would also advise adding a note to PEP 8 recommending that strings used in 
regular expressions always be raw strings, even where there's no need, as a lot 
(at least 70%) of the invalid escapes fixed were used in regexes.

--
Added file: http://bugs.python.org/file44107/invalid_stdlib_escapes_2.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-08-11 Thread Emanuel Barry

Emanuel Barry added the comment:

Hmm, that's odd, I recall some of the failures from testing, and thought I 
fixed them. Some of these are brand new, though, so thanks! I'll run and fix 
the tests (and modules as well); should likely have a patch by the weekend :)

--




[issue27364] Deprecate invalid unicode escape sequences

2016-08-11 Thread Martin Panter

Martin Panter added the comment:

I am trying out your patch at the moment. There are plenty of test suite 
failures; I ran the test suite with approximately the following:

./python -bWerror -m test -Wr -j0 -u network -x test_{mailbox,shelve,faulthandler,multiprocessing_main_handling,venv,warnings}

Importing modules sometimes fails or generates the warning, but this goes away 
if the file is not out of date. E.g. run “touch Lib/test/test_codecs.py”, and 
then make sure you next import that module with -Wall or -Werror enabled.
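
The point about out-of-date files follows from the warning being raised while source is compiled, not while the resulting code runs. The following is a minimal sketch of that distinction, assuming an interpreter that includes this change; it uses compile() and exec() directly rather than the import system, and the variable names are illustrative only.

```python
import warnings

source = 'pattern = "\\w+"\n'  # source text with a backslash followed by "w"

# Compiling the source is the step that reports the invalid escape.
with warnings.catch_warnings(record=True) as at_compile:
    warnings.simplefilter("always")
    code = compile(source, "<example>", "exec")

# Running the already-compiled code object reports nothing further.
with warnings.catch_warnings(record=True) as at_run:
    warnings.simplefilter("always")
    namespace = {}
    exec(code, namespace)
    exec(code, namespace)

print(len(at_compile) > 0, len(at_run))
```

A module loaded from cached bytecode skips the compile step in the same way, which is consistent with the warning only appearing after the source file has been touched.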

374 tests OK.
10 tests failed:
test___all__ test_ast test_codecs test_doctest test_fstring
test_idle test_strlit test_trace test_unicode
test_zipimport_support

I started pasting some of the failures here, but gave up as more and more 
failed. Let me know if you want the full details.

==
ERROR: test_coverage (test.test_trace.TestCoverage)
--
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/test/test_trace.py", line 312, in test_coverage
    self._coverage(tracer)
  File "/media/disk/home/proj/python/cpython/Lib/test/test_trace.py", line 307, in _coverage
    r.write_results(show_missing=True, summary=True, coverdir=TESTFN)
  File "/media/disk/home/proj/python/cpython/Lib/trace.py", line 284, in write_results
    lnotab = _find_executable_linenos(filename)
  File "/media/disk/home/proj/python/cpython/Lib/trace.py", line 403, in _find_executable_linenos
    code = compile(prog, filename, "exec")
DeprecationWarning: invalid escape sequence 'w'
**
File "/media/disk/home/proj/python/cpython/Lib/test/test_doctest.py", line 288, in test.test_doctest.test_DocTest
Failed example:
docstring = '''
>>> print(12)
12

Non-example text.

>>> print('another\example')
another
example
'''
Exception raised:
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/doctest.py", line 1330, in __run
    compileflags, 1), test.globs)
DeprecationWarning: invalid escape sequence 'e'
**
[Many subsequent NameError exceptions from test_doctest]
**
File "/tmp/tmphzbypj98/test_zip.zip/test_zipped_doctest.py", line 288, in test_zipped_doctest.test_DocTest
Failed example:
docstring = '''
>>> print(12)
12

Non-example text.

>>> print('another\example')
another
example
'''
Exception raised:
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/doctest.py", line 1330, in __run
    compileflags, 1), test.globs)
DeprecationWarning: invalid escape sequence 'e'
**
[More failures]

==
FAIL: test_all (test.test___all__.AllTest)
--
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/test/test___all__.py", line 105, in test_all
    self.check_all(modname)
  File "/media/disk/home/proj/python/cpython/Lib/test/test___all__.py", line 28, in check_all
    raise FailedImport(modname)
  File "/media/disk/home/proj/python/cpython/Lib/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/media/disk/home/proj/python/cpython/Lib/test/support/__init__.py", line 1130, in _filterwarnings
    raise AssertionError("unhandled warning %s" % reraise[0])
AssertionError: unhandled warning {message : DeprecationWarning("invalid escape sequence '('",), category : 'DeprecationWarning', filename : '/media/disk/home/proj/python/cpython/Lib/importlib/_bootstrap.py', lineno : 222, line : None}

==
ERROR: test_escape_order (test.test_fstring.TestCase) (str='f\'{"a"\\!r}\'')
--
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/test/test_fstring.py", line 20, in assertAllRaise
    eval(str)
DeprecationWarning: invalid escape sequence '!'

==
ERROR: test_escape (test.test_codecs.EscapeDecodeTest)
--
Traceback (most recent call last):
  File "/media/disk/home/proj/python/cpython/Lib/test/test_codecs.py", line 1218, in test_escape
    decode(b"\\" + b)
OverflowError: character argument not in range(0x11)

==
ERROR: test_escape_decode 

[issue27364] Deprecate invalid unicode escape sequences

2016-07-18 Thread Emanuel Barry

Emanuel Barry added the comment:

Here's a new patch which also deprecates invalid escape sequences in bytes. 
Tests included with test_codecs.

Patch includes and supersedes deprecate_invalid_escapes_only_3.patch, and I 
have not found a single instance of an invalid escape sequence other than in 
test_codecs, so this should be fine now.

--
Added file: 
http://bugs.python.org/file43777/deprecate_invalid_escapes_both_1.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Martin Panter

Martin Panter added the comment:

Forgot to say I reviewed invalid_stdlib_escapes_1.patch the other day and can’t 
see any problems.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry

Emanuel Barry added the comment:

Just brought this to the attention of the code-quality mailing list, so linter 
maintainers should (hopefully!) catch up soon.

Also new patch, I forgot to add '\c' in the tests.

--
Added file: 
http://bugs.python.org/file43569/deprecate_invalid_escapes_only_3.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry

Emanuel Barry added the comment:

Easing transition is always a good idea. I'll contact the PyCQA people later 
today when I'm back home.

On afterthought, it makes sense to wait more than two release cycles before 
making this an error. I don't really have a strong opinion when exactly that 
should happen.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Guido van Rossum

Guido van Rossum added the comment:

I think ultimately it has to become an error (otherwise I wouldn't
have agreed to the warning, silent or not). But because there's so
much 3rd party code that depends on it we indeed need to take
"several" releases before we go there.

Contacting the PyCQA folks would also be a great idea -- can anyone
volunteer to do so?

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread R. David Murray

R. David Murray added the comment:

Yes, this change is likely to break a lot of code, so an extended deprecation 
period (certainly longer than 3.7, which Guido has already mandated) is the 
minimum.  Guido hasn't agreed to making it an error yet, as far as I can see ;)

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread STINNER Victor

STINNER Victor added the comment:

@ebarry: To move faster, you should also work with linters (pylint, 
pychecker, pyflakes, pycodestyle, flake8, ...) to log a warning to help 
projects prepare for this change. Linters are used on Python 2-only 
projects, so it will help them prepare for the final Python 3. which 
will raise an exception.
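
As a rough illustration of the kind of check a linter could run, here is a deliberately simplified sketch. It is not taken from any of the tools named above: the function `invalid_escapes` and the constant `VALID_AFTER_BACKSLASH` are hypothetical, the input is assumed to be the source text between the quotes of a non-raw str literal, and it only looks at the character after each backslash (it does not verify that `\x`, `\u`, `\U` or `\N` are followed by well-formed data).

```python
# Characters that may legally follow a backslash in a non-raw str literal.
VALID_AFTER_BACKSLASH = set("\n\\'\"abfnrtv01234567xNuU")


def invalid_escapes(literal_body):
    """Return the unrecognised escapes found in the text of a str literal.

    `literal_body` is the raw source text between the quotes.
    """
    found = []
    i = 0
    while i < len(literal_body) - 1:
        if literal_body[i] == "\\":
            nxt = literal_body[i + 1]
            if nxt not in VALID_AFTER_BACKSLASH:
                found.append("\\" + nxt)
            i += 2  # skip the escaped character as well
        else:
            i += 1
    return found


print(invalid_escapes(r"hello \world"))
print(invalid_escapes(r"a\nb\\c\d"))
```

A real checker would also need to find string literals in a file, skip raw strings, and treat bytes literals separately, since `\N`, `\u` and `\U` are not escapes there.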

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry

Emanuel Barry added the comment:

I think ultimately a SyntaxError should be fine. I don't know *when* it becomes 
appropriate to change a warning into an error; I was thinking 3.7 but, as 
Serhiy said, there's no rush. I think waiting five release cycles is overkill 
though, that means the error won't be until 8 years from now (assuming release 
cycle periods don't change)! I think at most 3.8 should be fine for making this 
a full-on syntax error.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

DeprecationWarning is used when we want to remove a feature. It becomes an 
error in the future. FutureWarning is used when we want to change the meaning of a 
feature instead of removing it. For example re.split(':*', 'a:bc') emits a 
FutureWarning and returns ['a', 'bc'] because there is a plan to make it 
return ['', 'a', 'b', 'c', ''].

I think "a silent warning" means that it should emit a DeprecationWarning or a 
PendingDeprecationWarning. Since there is no haste, we should use a two-release 
deprecation period. After this the deprecation can be changed to a SyntaxWarning 
in 3.8 and to a UnicodeDecodeError (for strings) and a ValueError (for bytes) 
in 4.0. The latter are converted to SyntaxError by the parser. At the end we should 
get the same behavior as for truncated \x and \u escapes.

>>> '\u'
  File "", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in 
position 0-1: truncated \u escape
>>> b'\x'
  File "", line 1
SyntaxError: (value error) invalid \x escape at position 0

Maybe change the parser to convert warnings to a SyntaxWarning?
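
The existing behavior for truncated escapes quoted above can be checked programmatically. This is a small sketch, with a hypothetical helper name `compile_error`, that compiles source text containing such literals and reports whether a SyntaxError is raised; the exact message wording may differ between versions, so it is not asserted here.

```python
def compile_error(source):
    """Return the SyntaxError raised when compiling `source`, or None."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc
    return None


# The first two source texts hold literals with a truncated escape;
# the third holds a complete \u escape for comparison.
print(compile_error("'\\u'"))
print(compile_error("b'\\x'"))
print(compile_error("'\\u0041'"))
```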

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread STINNER Victor

STINNER Victor added the comment:

Guido: "I am okay with making it a silent warning."

The current patch raises a DeprecationWarning, which is silent by default but 
visible with python3 -Wd. What is the "long term" plan: always raise an 
*exception* in Python 3.7? Which exception?

Another option is to always emit a SyntaxWarning, but not raise an exception 
in the long term. It is possible to get an exception using python3 -Werror.

There is also FutureWarning: "Base class for warnings about constructs that 
will change semantically in the future" or RuntimeWarning "Base class for 
warnings about dubious runtime behavior".
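
The effect of -Werror mentioned above can be approximated from within a program. The sketch below is an assumption-laden illustration, not part of the patch: `rejected_under_werror` is a hypothetical helper, `warnings.simplefilter("error")` stands in for the command-line switch, and both SyntaxError and Warning are caught because the exception type that surfaces may depend on the interpreter version.

```python
import warnings


def rejected_under_werror(source):
    """Return True if compiling `source` fails once warnings are errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # roughly what -Werror does
        try:
            compile(source, "<example>", "exec")
        except (SyntaxError, Warning):
            # Depending on the version, the promoted warning may surface
            # as the warning class itself or as a SyntaxError.
            return True
    return False


# First source text: backslash followed by "w". Second: an escaped backslash.
print(rejected_under_werror('s = "hello \\world"'))
print(rejected_under_werror('s = "hello \\\\world"'))
```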

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter

Martin Panter added the comment:

Code samples in the documentation should also be fixed, like at 
. I think you can run 
“make -C Doc doctest” or something similar, which may help find some of these.

Also, playing with your current patch, it seems to affect the “unicode-escape” 
codec. Not sure if that is a problem, but the change probably deserves 
documenting as well.
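
The codec side effect can be observed with a short sketch like the following, assuming an interpreter that includes this change. It decodes a backslash followed by "w" through the "unicode_escape" codec and records whatever warnings are raised; the warning category and message text are printed rather than assumed.

```python
import codecs
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The bytes are a backslash followed by "w": not a recognised escape.
    decoded = codecs.decode(b"\\w", "unicode_escape")

print(repr(decoded))
print([w.category.__name__ for w in caught])
```

The decoded text keeps the backslash, so the result is unchanged; only the warning is new.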

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry

Emanuel Barry added the comment:

Indeed, we did, thanks for letting me know my mistake :) I didn't get very far 
into making bytes literals disallow invalid sequences, as I ran into issues with 
_codecs.escape_decode throwing the warning even when the literal was fine, and 
I think I stopped there and figured I'd at least post that patch and see if 
people are interested in extending that modification to bytes (turns out so).

I forgot about docs, will do so soon, but I'll try to extend the patch for 
bytes first. I'll see if I can make literals warn but not e.g. 
_codecs.escape_decode (or anything else, really).

Thanks!

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter

Martin Panter added the comment:

Hah, we posted the same fix almost at the same time :)

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter

Martin Panter added the comment:

Hello Emanuel, I think I have fixed your problem with -Werror, by handling the 
exception returned by PyErr_WarnFormat() (see my patch). Thanks for separating 
the actual change from the escape violation fixes; it made it easier to spot 
the real problem :)

Also, I like the general idea of the change. It would be good to update the 
documentation as well (e.g. What’s New, and 
).

It would be good to do the same for byte string literals, at least to keep 
things consistent. What did you try so far? Do you have a partial patch for it?

--
nosy: +martin.panter
Added file: 
http://bugs.python.org/file43553/deprecate_invalid_escapes_only_2.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry

Emanuel Barry added the comment:

Aaand I feel pretty stupid; I didn't check the return value of 
PyErr_WarnFormat, so it was my mistake. Attached new patch, actually done right 
this time.

--
Added file: 
http://bugs.python.org/file43552/deprecate_invalid_escapes_only_2.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry

Emanuel Barry added the comment:

Ah right, assert() is only enabled in debug mode, I forgot that. My (very 
uneducated) guess is that compile() got the error (which was a warning) but 
then decided to return a value anyway, and the next thing that tries to call 
anything crashes Python. I opened #27394 to get some experts' advice.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Guido van Rossum

Guido van Rossum added the comment:

Hm, if you manage to trigger an assert() in the C code by writing some evil
Python code, the C code is considered broken (unless it was using ctypes or
one or two other explicit "void-the-warranty" exceptions).

Maybe someone who has worked more with the C code recently could help you
dig into this more; my memory is unreliable when it comes to these details.
Maybe assert() calls are disabled by default? In general the error "...
returned a result with an error set" means there's a problem at the C level
where a function should have either returned an object or returned NULL
with the per-thread exception state set, but it was found to return an
object *and* set the exception state. IIRC only debug mode checks for that,
so such a bug occasionally creeps into the code. But you shouldn't assume
everything is fine until you've tracked down the cause.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry

Emanuel Barry added the comment:

I originally considered making two different patches, so there you go. 
deprecate_invalid_escapes_only_1.patch has the deprecation plus a test, and 
invalid_stdlib_escapes_1.patch fixes all invalid escapes in the stdlib.

My code was the cause, although not directly; it was 'assert(!PyErr_Occurred())' 
at the beginning of PyObject_Call in Objects/abstract.c which failed.

This happened when I ran the whole test suite (although just running test_ast 
was fine to reproduce it) with the '-W error' command line switch. One stdlib 
module (I don't remember which one) had one single invalid escape sequence in 
it, and then test_ast.ASTValidatorTests.test_stdlib_validates triggered the 
failed assertion. Fixing the invalid escape removes the failure and all tests 
pass.

One can reliably reproduce the crash with the patch by adding a string with an 
invalid escape in any of the stdlib files (and running with '-W error'):

No invalid sequence:

>>> import unittest, test.test_ast
>>> unittest.main(test.test_ast)
..
--
Ran 78 tests in 5.538s

OK

With an invalid sequence in a file:

>>> import unittest, test.test_ast
>>> unittest.main(test.test_ast)
Fatal Python error: a function 
returned a result with an error set
DeprecationWarning: invalid escape sequence 'w'

During handling of the above exception, another exception occurred:

SystemError:  returned a result with an error set

Current thread 0x1ba0 (most recent call first):
  File "E:\GitHub\cpython\lib\ast.py", line 35 in parse
  File "E:\GitHub\cpython\lib\test\test_ast.py", line 944 in 
test_stdlib_validates
  File "E:\GitHub\cpython\lib\unittest\case.py", line 600 in run
  File "E:\GitHub\cpython\lib\unittest\case.py", line 648 in __call__
  File "E:\GitHub\cpython\lib\unittest\suite.py", line 122 in run
  File "E:\GitHub\cpython\lib\unittest\suite.py", line 84 in __call__
  File "E:\GitHub\cpython\lib\unittest\suite.py", line 122 in run
  File "E:\GitHub\cpython\lib\unittest\suite.py", line 84 in __call__
  File "E:\GitHub\cpython\lib\unittest\runner.py", line 176 in run
  File "E:\GitHub\cpython\lib\unittest\main.py", line 255 in runTests
  File "E:\GitHub\cpython\lib\unittest\main.py", line 94 in __init__
  File "", line 1 in 

Then I get the usual "Python has stopped working" Windows prompt (strangely 
enough, before I'd get a prompt saying "Assertion failed" with the line, but 
not this time).

I'm not sure where the error lies exactly. Should I open another issue for that?

--
Added file: 
http://bugs.python.org/file43549/deprecate_invalid_escapes_only_1.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry

Changes by Emanuel Barry :


Added file: http://bugs.python.org/file43550/invalid_stdlib_escapes_1.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Guido van Rossum

Guido van Rossum added the comment:

I am okay with making it a silent warning.

Can we do it in two stages though? It doesn't have to be two releases, I just 
mean two separate commits: (1) fix all places in the stdlib that violate this 
principle; (2) separately commit the code that causes the silent deprecation 
(and tests for it).

What exactly was the hard crash you got? Do you think it was a bug in your own 
C code or in existing C code?

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-24 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
nosy: +gvanrossum




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry

Emanuel Barry added the comment:

I found the cause of the failed assertion, an invalid escape sequence slipped 
through in a file. Patch attached (also with Serhiy's comments).

It worries me a little though that pure Python code can cause a hard crash. Ok, 
it worries me a lot. Please don't merge this until it's fixed. I'm guessing 
this is a combination of unittest catching warnings and compiling the faulty 
source file. Why a malformed node (i.e. one that raised a 
DeprecationWarning) managed to pass through unharmed is beyond me.

--
Added file: 
http://bugs.python.org/file43527/deprecate_invalid_unicode_escapes_2.patch




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry

Emanuel Barry added the comment:

Thanks, didn't find that one. Apparently Guido's stance is "Make this a silent 
warning, then we can discuss about preventing it later", which happens to be 
what I'm doing here.

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There was a long discussion on Python-Dev. [1]  Guido took part in it.

[1] http://comments.gmane.org/gmane.comp.python.devel/151612

--
nosy: +serhiy.storchaka




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry

Emanuel Barry added the comment:

Yes, it's in use in an awful lot of places (see my patch). The proper fix is to 
use raw strings, or, if you need actual escapes in the same string, manually 
escape them. However, as you'll see by looking at the patch, the vast majority 
of cases are fixed by prepending a single 'r' to the front of the string. In 
fact, only csv.py and html/parser.py needed finer-grained escaping.

I think that the argument "It works in non-raw strings" is weak. I've always 
used raw strings for regular expressions, and this patch would simply move this 
from being a style issue to being a syntax one (and I think it's fine :).
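
To make the two fixes concrete, here is a small sketch using made-up patterns (they are not taken from the patch). It shows that a raw string and a manually escaped string spell the same regular expression, and that manual escaping is the option when a real escape such as a newline is needed in the same literal.

```python
import re

# Both spellings produce the same pattern text: a backslash followed by "d".
raw_pattern = r"(\d+)\.(\d+)"
escaped_pattern = "(\\d+)\\.(\\d+)"
assert raw_pattern == escaped_pattern

# When a real escape (here a newline) is needed in the same string,
# the regex backslashes have to be doubled by hand.
line_pattern = "(\\w+)\n"

print(re.match(raw_pattern, "3.6").groups())
print(re.match(line_pattern, "spam\neggs").group(1))
```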

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Antti Haapala

Antti Haapala added the comment:

It is handy to be able to use `\w` and `\d` in non-raw-string *regular 
expressions*, without too much backslashitis. This seems to be in use in the 
Python standard library as well, for example in csv.py

--
nosy: +ztane




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry

Emanuel Barry added the comment:

Now I have! I found nothing on Python-Dev, but apparently it's been discussed 
on Python-ideas before: 
https://mail.python.org/pipermail/python-ideas/2015-August/035031.html Guido 
hasn't participated in that discussion, and most of it was "This will break 
people's code", with people both for and against the idea, without an apparent 
consensus.

Should I try a second round on Python-ideas, to try and get a consensus (or a 
BDFL ruling)?

--




[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread R. David Murray

R. David Murray added the comment:

Have you searched the python-dev and python-ideas archives for the previous 
discussions of this issue?  I don't remember for sure, but I think Guido might 
have made a ruling (not that the discussion couldn't be reopened if he has, 
but, well...)

--
nosy: +r.david.murray




[issue27364] Deprecate invalid unicode escape sequences

2016-06-21 Thread Emanuel Barry

New submission from Emanuel Barry:

Attached patch deprecates invalid escape sequences in unicode strings. The 
point of this is to prevent issues such as #27356 (and possibly other similar 
ones) in the future.

Without the patch:

>>> "hello \world"
'hello \\world'

With the patch:

>>> "hello \world"
DeprecationWarning: invalid escape sequence 'w'

I'll need some help (patch isn't mergeable yet):

test_doctest fails on my machine with the patch (and -W), and I don't know how 
to fix it. test_ast fails an assertion (!PyErr_Occurred() in PyObject_Call in 
abstract.c) when -W is on, and I also don't know how to fix it (I don't even 
know what causes it).

Of course, I went ahead and fixed all instances of invalid escape sequences in 
the stdlib (that I could find) so that no DeprecationWarning is encountered.

Lastly, I thought about also doing this to bytes, but I ran into some issues 
with some invalid escapes such as \u, and _codecs.escape_decode would trigger 
the warning when passed br"\8" (for example). Ultimately, I decided to leave 
bytes alone for now, since it's mostly on the lower-level side of things. If 
there's interest I can add it back.
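
For readers who want to see the warning without editing any file, the following is a minimal sketch and not part of the patch. The helper name `escape_warnings` is hypothetical, and it accepts either DeprecationWarning (the category this patch uses) or SyntaxWarning, on the assumption that other releases may report the same condition under a different category.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the warnings raised while compiling it."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # the warning is hidden by default
        compile(source, "<example>", "exec")
    return [
        w for w in caught
        if issubclass(w.category, (DeprecationWarning, SyntaxWarning))
    ]


# The first source text contains a single backslash followed by "w";
# the second is the same literal written as a raw string.
bad = escape_warnings('s = "hello \\world"')
good = escape_warnings('s = r"hello \\world"')
print(len(bad) > 0, len(good))
```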

--
components: Interpreter Core, Library (Lib), Unicode
files: deprecate_invalid_unicode_escapes.patch
keywords: patch
messages: 269022
nosy: ebarry, ezio.melotti, haypo
priority: normal
severity: normal
stage: patch review
status: open
title: Deprecate invalid unicode escape sequences
type: behavior
versions: Python 3.6
Added file: 
http://bugs.python.org/file43499/deprecate_invalid_unicode_escapes.patch
