[issue19543] Add -3 warnings for codec convenience method changes

2016-02-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Is there something left to do with this issue?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19543] Add -3 warnings for codec convenience method changes

2016-02-11 Thread Nick Coghlan

Nick Coghlan added the comment:

I think so - if anyone spots another place a Py3k warning could be usefully 
emitted, it can be handled as a new issue.

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed




[issue19543] Add -3 warnings for codec convenience method changes

2015-12-25 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
stage: patch review -> 




[issue19543] Add -3 warnings for codec convenience method changes

2015-12-03 Thread Roundup Robot

Roundup Robot added the comment:

New changeset c89a0f24d5f6 by Serhiy Storchaka in branch '2.7':
Issue #19543: Added Py3k warning for decoding unicode.
https://hg.python.org/cpython/rev/c89a0f24d5f6

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-31 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 5c8c123943cf by Tal Einat in branch 'default':
Issue #19543: Implementation of isclose as per PEP 485
https://hg.python.org/cpython/rev/5c8c123943cf

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19543
___



[issue19543] Add -3 warnings for codec convenience method changes

2015-05-31 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 0347f6e14ad6 by Tal Einat in branch '3.5':
Issue #19543: Implementation of isclose as per PEP 485
https://hg.python.org/cpython/rev/0347f6e14ad6

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-31 Thread Roundup Robot

Roundup Robot added the comment:

New changeset cf6e782a7f94 by Serhiy Storchaka in branch '2.7':
Issue #19543: Emit deprecation warning for known non-text encodings.
https://hg.python.org/cpython/rev/cf6e782a7f94

--
nosy: +python-dev




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-31 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The committed patch covers a large part of this issue, but not all of it.

The following patch emits a Py3k warning for unicode.decode(). For now 
unicode(u'a', 'ascii') is forbidden, but u'a'.decode('ascii') is allowed in 2.7.

The risk of false positives with this patch is lower than with a warning on 
str.encode(), but higher than with the patch just committed.

--
Added file: http://bugs.python.org/file39576/issue19543_unicode_decode.patch




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-31 Thread Nick Coghlan

Nick Coghlan added the comment:

The last two commit notifications were intended for issue #24270.

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-30 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Nick, Benjamin, could you please look at the patch?

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-30 Thread Nick Coghlan

Nick Coghlan added the comment:

Serhiy's patch looks to me like it would pragmatically cover all the cases most 
likely to affect porting efforts: using the standard library bytes-bytes 
codecs through the convenience methods.

For a Python 2 backport, I'm slightly more concerned with exposing the argument 
in the constructor signature, but see value in being consistent with Python 3 
if anyone decides to use this to check a custom codec.

The main advantage this approach has over the typecheck-based approach is that 
it can correctly warn about data.encode("hex"), while a typecheck-based 
approach can't distinguish that from text.encode("ascii").
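
A minimal sketch of the data-driven distinction described above, assuming Python 3.4 or later. The helper name is_text_encoding is hypothetical, and _is_text_encoding is a private CPython attribute on CodecInfo, so this illustrates the idea and is not a supported API:

```python
import codecs


def is_text_encoding(name):
    """Report whether CPython flags the named codec as a text encoding.

    Hypothetical helper: it relies on the private CodecInfo attribute
    ``_is_text_encoding`` and assumes True when the attribute is absent.
    """
    info = codecs.lookup(name)
    return getattr(info, "_is_text_encoding", True)


# The codec name, not the type of the input object, decides the outcome.
for codec_name in ("ascii", "utf-8", "hex", "base64", "rot13"):
    print(codec_name, is_text_encoding(codec_name))
```

Because the flag belongs to the codec, data.encode("hex") and text.encode("ascii") can be told apart even when both inputs have the same type.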

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-05-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

What would you say about this, Benjamin?

--
nosy: +benjamin.peterson




[issue19543] Add -3 warnings for codec convenience method changes

2015-04-28 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch that backports issue19619 and issue20404, changing the 
exception to a Py3k warning, and makes the necessary changes in other modules 
and tests.

$ ./python -3
Python 2.7.10rc0 (2.7:4234b0dd2a54+, Apr 28 2015, 16:51:51) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 'abcd'.decode('hex')
__main__:1: DeprecationWarning: 'hex' is not a text encoding; use 
codecs.decode() to handle arbitrary codecs
'\xab\xcd'
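
For comparison, a short sketch of the replacement the warning message points to, written for Python 3 (where the input to the hex codec is a bytes object). The variable names are illustrative only:

```python
import codecs

# codecs.decode() accepts arbitrary codecs, including bytes-to-bytes ones.
raw = codecs.decode(b"abcd", "hex")
print(raw)

# codecs.encode() reverses the transformation.
hex_digits = codecs.encode(raw, "hex")
print(hex_digits)
```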

--
Added file: 
http://bugs.python.org/file39226/issue19543_blacklist_transforms_py27.patch




[issue19543] Add -3 warnings for codec convenience method changes

2015-04-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think we should just backport issue19619 and issue20404.

--
nosy: +serhiy.storchaka




[issue19543] Add -3 warnings for codec convenience method changes

2015-04-25 Thread Nick Coghlan

Nick Coghlan added the comment:

For the warnings, it's actually bytes.decode() (et al) that are expected to 
return str, and str.encode() that's expected to return bytes. The codecs 
themselves remain free to do what they want (hence the recommendation to use 
codecs.encode() and codecs.decode() to invoke arbitrary codecs without the type 
constraints of the builtins). Aside from that, those 2 warnings and the 
unicode.decode() one look good.

However, the bytes.encode() warning is the one we determined couldn't be 
treated as a warning as text.encode("ascii") is valid in both Python 2 & 3, 
it just does a str -> str conversion in 2.x, and a str -> bytes conversion in 3.x.
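
A small sketch of how these type constraints surface in Python 3.4 or later, under the assumption that the standard library codecs are flagged as described in this thread. The helper try_convenience is hypothetical and exists only to capture the error text:

```python
def try_convenience(call):
    """Run a convenience-method call and describe a LookupError if raised."""
    try:
        return call()
    except LookupError as exc:
        return "LookupError: %s" % exc


# Non-text codecs are rejected by the convenience methods.
print(try_convenience(lambda: "text".encode("rot13")))
print(try_convenience(lambda: b"abcd".decode("hex")))

# A text encoding still works: str -> bytes in Python 3.
print(try_convenience(lambda: "text".encode("ascii")))
```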

--




[issue19543] Add -3 warnings for codec convenience method changes

2015-04-14 Thread Ned Deily

Changes by Ned Deily n...@acm.org:


--
stage: needs patch -> patch review




[issue19543] Add -3 warnings for codec convenience method changes

2015-04-13 Thread Dustin J. Mitchell

Dustin J. Mitchell added the comment:

This fixes the four cases Nick referred to, although not in the functions 
expected:

 - str.encode no longer exists
 - unicode.decode no longer exists
 - encoders used by str.encode must return bytes (does not apply to 
codecs.encode)
 - decoders used by unicode.decode must return unicode (does not apply to 
codecs.decode)

--
keywords: +patch
nosy: +djmitche
Added file: http://bugs.python.org/file38979/issue19543.patch




[issue19543] Add -3 warnings for codec convenience method changes

2013-11-11 Thread Brett Cannon

Changes by Brett Cannon br...@python.org:


--
nosy: +brett.cannon




[issue19543] Add -3 warnings for codec convenience method changes

2013-11-10 Thread Nick Coghlan

New submission from Nick Coghlan:

The long discussion in issue 7475 and some subsequent discussions I had with 
Armin Ronacher have made it clear to me that the key distinction between the 
codec systems in Python 2 and Python 3 is the following differences in type 
signatures of various operations:

Python 2 (8 bit str):

codecs module: object <-> object
convenience methods: basestring <-> basestring
available codecs: unicode <-> str, str <-> str, unicode <-> unicode

Python 3 (Unicode str):

codecs module: object <-> object
convenience methods: str <-> bytes
available codecs: str <-> bytes, bytes <-> bytes, str <-> str

The significant distinction is the fact that, in Python 2, the convenience 
methods covered all standard library codecs, but for Python 3, the codecs 
module needs to be used directly for the bytes <-> bytes codecs and the one str 
<-> str codec (since those codecs no longer satisfy the constraints of the text 
model related convenience methods).
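
To illustrate the Python 3 side of that distinction, here is a brief sketch that calls the codecs module directly for two bytes-to-bytes codecs and the str-to-str codec. The sample values are arbitrary:

```python
import codecs

payload = b"example payload"

# bytes <-> bytes codecs go through the codecs module functions.
compressed = codecs.encode(payload, "zlib")
restored = codecs.decode(compressed, "zlib")

b64 = codecs.encode(payload, "base64")
unb64 = codecs.decode(b64, "base64")

# The one str <-> str codec in the standard library.
scrambled = codecs.encode("Hello", "rot13")
unscrambled = codecs.decode(scrambled, "rot13")

print(restored, unb64, scrambled, unscrambled)
```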

After attempting to implement a 2to3 fixer for these non-Unicode codecs in 
issue 17823, I realised that wouldn't really work properly (since it's a data 
driven error based on the behaviour of the named codec), so I'm rejecting that 
proposal and replacing it with this one for additional Py3k warnings in Python 
2.7.7.

My proposal is to take the following cases and make them produce warnings under 
Python 2.7.7 when Py3k warnings are enabled (remember, these are the 2.7 types, 
not the 3.x ones):

- the str.encode method is called (redirect to codecs.encode to handle 
arbitrary input types in a forward compatible way)

- the unicode.decode method is called (redirect to codecs.decode to handle 
arbitrary input types)

- PyUnicode_AsEncodedString produces something other than an 8-bit string 
(redirect to codecs.encode for arbitrary output types)

- PyUnicode_Decode produces something other than a unicode string (redirect to 
codecs.decode for arbitrary output types)

For the latter two cases, issue 17828 includes updates to the Python 3 error 
messages to similarly redirect to the convenience functions in the codecs 
module. However, the removed convenience methods will continue to simply 
trigger AttributeError in Python 3 with no special casing.
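
The output-type cases above could be approximated in pure Python roughly as follows. This is only a sketch written for Python 3: checked_encode is a hypothetical name, the message wording is invented for illustration, and the real proposal concerns C-level checks in unicodeobject.c rather than a wrapper like this:

```python
import codecs
import warnings


def checked_encode(text, encoding, errors="strict"):
    """Encode via the codec machinery and warn if the result is not bytes.

    Hypothetical helper mirroring the proposed output-type warning.
    """
    result = codecs.encode(text, encoding, errors)
    if not isinstance(result, bytes):
        warnings.warn(
            "%r encoder returned %s instead of bytes; use codecs.encode() "
            "for arbitrary output types" % (encoding, type(result).__name__),
            DeprecationWarning,
            stacklevel=2,
        )
    return result


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ascii_result = checked_encode("abc", "ascii")   # bytes: no warning
    rot13_result = checked_encode("abc", "rot13")   # str: one warning

print(ascii_result, rot13_result, len(caught))
```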

--
components: Interpreter Core
messages: 202512
nosy: ncoghlan
priority: normal
severity: normal
stage: needs patch
status: open
title: Add -3 warnings for codec convenience method changes
type: enhancement
versions: Python 2.7




[issue19543] Add -3 warnings for codec convenience method changes

2013-11-10 Thread Martin Panter

Martin Panter added the comment:

Just thinking the first case might get quite a few false positives. Maybe that 
would still be acceptable, I dunno.

> - the str.encode method is called (redirect to codecs.encode to handle 
> arbitrary input types in a forward compatible way)

I guess you are trying to catch cases like this, which I have come across quite 
a few times:

data.encode("hex")  # data is a byte string

But I think you would also catch cases that depend on Python 2 “str” objects 
automatically converting to Unicode. Here are some examples taken from real 
code:

file_name.encode("utf-8")  # File name parameter may be str or unicode

# Code meant to be compatible with both Python 2 and 3:
'<?xml . . . encoding="iso-8859-1"?>'.encode("iso-8859-1")
("data %s\n" % len(...)).encode("ascii")
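
One way code like the file_name example might avoid depending on the implicit conversion is an explicit type check. This is a sketch only; to_bytes is a hypothetical helper, not something taken from the code quoted above:

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding it only if it is text.

    Hypothetical helper. On Python 2, bytes is an alias for str, so 8-bit
    strings pass through unchanged and unicode objects are encoded.
    """
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


print(to_bytes(u"caf\u00e9"))
print(to_bytes(b"already-bytes"))
```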

--
nosy: +vadmium




[issue19543] Add -3 warnings for codec convenience method changes

2013-11-10 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 10.11.2013 10:20, Nick Coghlan wrote:
> 
> The long discussion in issue 7475 and some subsequent discussions I had with 
> Armin Ronacher have made it clear to me that the key distinction between the 
> codec systems in Python 2 and Python 3 is the following differences in type 
> signatures of various operations:
> 
> Python 2 (8 bit str):
> 
> codecs module: object <-> object
> convenience methods: basestring <-> basestring
> available codecs: unicode <-> str, str <-> str, unicode <-> unicode
> 
> Python 3 (Unicode str):
> 
> codecs module: object <-> object
> convenience methods: str <-> bytes
> available codecs: str <-> bytes, bytes <-> bytes, str <-> str
> 
> The significant distinction is the fact that, in Python 2, the convenience 
> methods covered all standard library codecs, but for Python 3, the codecs 
> module needs to be used directly for the bytes <-> bytes codecs and the one 
> str <-> str codec (since those codecs no longer satisfy the constraints of 
> the text model related convenience methods).

Please remember that the codec sub-system is extensible. It's
easily possible to add more codecs via registered codec
search functions.

Whatever you add as warning has to be aware of the fact that
there may be codecs in the system that are not part of the
stdlib and which can potentially implement codecs that use
other type combinations than the ones you listed above.
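
As a sketch of that extensibility, the following registers a codec search function for a toy str-to-str codec, written for Python 3. The codec name reverse_demo and the helper functions are hypothetical and exist only for this illustration:

```python
import codecs


def _reverse(text, errors="strict"):
    """Toy transform: reverse the input and report how much was consumed."""
    return text[::-1], len(text)


def _search(name):
    """Codec search function: return CodecInfo for our name, else None."""
    if name == "reverse_demo":
        return codecs.CodecInfo(encode=_reverse, decode=_reverse,
                                name="reverse_demo")
    return None


codecs.register(_search)

# The codecs module functions place no constraints on input or output types.
encoded = codecs.encode("abc", "reverse_demo")
decoded = codecs.decode(encoded, "reverse_demo")
print(encoded, decoded)
```

Any warning has to cope with codecs like this one, whose type signature is not known in advance.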

--
nosy: +lemburg




[issue19543] Add -3 warnings for codec convenience method changes

2013-11-10 Thread Nick Coghlan

Nick Coghlan added the comment:

Martin: you're right, it wouldn't be feasible to check for the 8-bit str 
encoding case, since the types of string literals will implicitly change 
between the two versions. However, the latter three cases would be feasible to 
check (the unicode.decode one is particularly pernicious, since it's the 
culprit that can lead to UnicodeEncodeErrors on a decoding operation as Python 
implicitly tries to encode a Unicode string as ASCII).

MAL: The latter two Py3k warnings would be in the same place as the 
corresponding output type errors in Python 3 (i.e. all in unicodeobject.c), so 
they would never trigger for the general codecs machinery.

Python 2 actually already has output type checks in the same place as the 
proposed warnings, it just only checks for basestring rather than anything 
more specific. Those two warnings would just involve adding the more 
restrictive Py3k-style check when -3 was enabled.

A Py3k warning for unicode.decode is just a straight "this method won't be 
there any more in Python 3" warning, since there's no way for the conversion 
from Python 2 to Python 3 to implicitly replace a Unicode string with 8-bit 
data the way string literals switch from 8-bit data to Unicode text.
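
The Python 3 end state can be checked directly. A minimal sketch, assuming a Python 3 interpreter; the variable names are illustrative:

```python
text = "caf\u00e9"

# In Python 3 the text type has no decode method and bytes has no encode.
text_has_decode = hasattr(text, "decode")
bytes_has_encode = hasattr(b"", "encode")
print(text_has_decode, bytes_has_encode)

# The remaining convenience methods pair up as str.encode / bytes.decode.
data = text.encode("utf-8")
round_tripped = data.decode("utf-8")
print(data, round_tripped == text)
```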

--
