Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 Nov 2013 10:47, Victor Stinner victor.stin...@gmail.com wrote: 2013/11/16 Nick Coghlan ncogh...@gmail.com: To address Serhiy's security concerns with the compression codecs (which are technically independent of the question of restoring the aliases), I also plan to document how to

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Victor Stinner
Why not use the str type for str and str subtypes, and the bytes type for bytes and bytes-like objects (bytearray, memoryview)? I don't think that we need an ABC here. Victor On 16 Nov 2013 10:44, Nick Coghlan ncogh...@gmail.com wrote: On 16 Nov 2013 10:47, Victor Stinner victor.stin...@gmail.com

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 20:45, Victor Stinner victor.stin...@gmail.com wrote: Why not use the str type for str and str subtypes, and the bytes type for bytes and bytes-like objects (bytearray, memoryview)? I don't think that we need an ABC here. We'd only need an ABC if info was added for supported input

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread M.-A. Lemburg
On 16.11.2013 01:47, Victor Stinner wrote: Adding transform()/untransform() methods to bytes and str is a non-trivial change and not everybody likes them. Anyway, it's too late for Python 3.4. Just to clarify: I still like the idea of adding those methods. I just don't see what this addition

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Antoine Pitrou
On Sat, 16 Nov 2013 19:44:51 +1000 Nick Coghlan ncogh...@gmail.com wrote: Aye, that was my conclusion (hence my proposal on issue 7475 back in April). Can I take that observation as a +1 for restoring the aliases as well? I see no harm in restoring the aliases personally, so +1 from me.

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 21:38, Nick Coghlan ncogh...@gmail.com wrote: On 16 November 2013 20:45, Victor Stinner victor.stin...@gmail.com wrote: Why not use the str type for str and str subtypes, and the bytes type for bytes and bytes-like objects (bytearray, memoryview)? I don't think that we need an ABC

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 21:49, M.-A. Lemburg m...@egenix.com wrote: On 16.11.2013 01:47, Victor Stinner wrote: Adding transform()/untransform() methods to bytes and str is a non-trivial change and not everybody likes them. Anyway, it's too late for Python 3.4. Just to clarify: I still like the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread M.-A. Lemburg
On 15.11.2013 08:13, Nick Coghlan wrote: On 15 November 2013 11:10, Terry Reedy tjre...@udel.edu wrote: On 11/14/2013 5:32 PM, Victor Stinner wrote: I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan ncogh...@gmail.com wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a new attribute. This is completely the wrong approach. There's zero

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Steven D'Aprano
On Fri, Nov 15, 2013 at 05:13:34PM +1000, Nick Coghlan wrote: A few things I noticed while implementing the recent updates: - as you noted in your other email, while MAL is on record as saying the codecs module is intended for arbitrary codecs, not just Unicode encodings, readers of the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Serhiy Storchaka
15.11.13 12:02, Steven D'Aprano wrote: It would be really good to be able to query the available codecs. For example, many applications offer an Encoding menu, where you can specify the codec used for text. That's hard in Python, since you can't retrieve a list of known codecs. And you
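A partial workaround for the missing "list of known codecs" can be sketched with the standard library's alias table. This is only an approximation under stated assumptions: encodings.aliases.aliases covers stdlib codec modules that have aliases, not codecs registered by third-party search functions, and the helper name is_available() is hypothetical.

```python
import codecs
from encodings import aliases

# encodings.aliases.aliases maps alias names to stdlib codec module names.
# Collecting the values gives an approximate (not exhaustive) codec list.
known_modules = sorted(set(aliases.aliases.values()))


def is_available(name):
    """Return True if codecs.lookup() can resolve the given codec name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # Unknown name, or a platform-specific codec missing on this system.
        return False


# Keep only the codecs that can actually be looked up on this platform.
available = [name for name in known_modules if is_available(name)]
```

An application's Encoding menu would likely still want to filter this list further, since it mixes text encodings with binary transforms such as hex_codec.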

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Steven D'Aprano
On Fri, Nov 15, 2013 at 10:22:28AM +0100, Antoine Pitrou wrote: On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan ncogh...@gmail.com wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a new

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 21:28:35 +1100 Steven D'Aprano st...@pearwood.info wrote: One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data, codec) That's a good point. If encoding/decoding is intended to be completely generic (even if 99% of the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Serhiy Storchaka
15.11.13 12:28, Steven D'Aprano wrote: One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data, codec) And this is a security hole if you don't check the codec name before calling the codec. See the topic about using zip bombs via codecs
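Checking the codec name before calling the codec could look roughly like the sketch below. The allow-list contents and the helper name safe_encode() are hypothetical and not part of the thread; the only library calls used are codecs.lookup() and codecs.encode().

```python
import codecs

# Hypothetical allow-list: the only codecs this application agrees to run.
ALLOWED_CODECS = ("utf-8", "ascii", "latin-1")

# Normalise once so that aliases such as "UTF8" compare equal to "utf-8".
_ALLOWED_CANONICAL = {codecs.lookup(name).name for name in ALLOWED_CODECS}


def safe_encode(data, codec_name):
    """Encode data only if codec_name resolves to an allowed codec."""
    # codecs.lookup() raises LookupError for names it cannot resolve.
    canonical = codecs.lookup(codec_name).name
    if canonical not in _ALLOWED_CANONICAL:
        raise ValueError("codec not permitted: %r" % (codec_name,))
    return codecs.encode(data, codec_name)
```

With this in place, a user-supplied name such as "zlib_codec" is rejected before any compression work is done, while "UTF8" is accepted as an alias of an allowed codec.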

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 15 November 2013 20:33, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 15 Nov 2013 21:28:35 +1100 Steven D'Aprano st...@pearwood.info wrote: One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data, codec) That's a good point. If

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 21:45:31 +1000 Nick Coghlan ncogh...@gmail.com wrote: The reason I'm now putting some effort into better documenting the status quo for codec handling in Python 3 and filing off some of the rough edges (rather than proposing adding any new APIs to Python 3.x) is because

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Victor Stinner
2013/11/15 Nick Coghlan ncogh...@gmail.com: The reason I'm now putting some effort into better documenting the status quo for codec handling in Python 3 and filing off some of the rough edges (rather than proposing adding any new APIs to Python 3.x) is because the users I care about in this

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Paul Moore
On 15 November 2013 12:07, Victor Stinner victor.stin...@gmail.com wrote: A new API for binary transforms is potentially an academically interesting concept, but it solves zero current real world problems. I would like to reply the same for these codecs: they are not solving any real world

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread M.-A. Lemburg
On 15.11.2013 12:45, Nick Coghlan wrote: On 15 November 2013 20:33, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 15 Nov 2013 21:28:35 +1100 Steven D'Aprano st...@pearwood.info wrote: One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data,

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Facundo Batista
On Thu, Nov 14, 2013 at 7:32 PM, Victor Stinner victor.stin...@gmail.com wrote: I would prefer to split the registry of codecs to have 3 registries: - encoding (a better name can be found): encode str=>bytes, decode bytes=>str - bytes: encode bytes=>bytes, decode bytes=>bytes - str: encode

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 15 November 2013 22:24, Paul Moore p.f.mo...@gmail.com wrote: On 15 November 2013 12:07, Victor Stinner victor.stin...@gmail.com wrote: A new API for binary transforms is potentially an academically interesting concept, but it solves zero current real world problems. I would like to reply

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 23:50:23 +1000 Nick Coghlan ncogh...@gmail.com wrote: My perspective is that, in current Python, that *is* the right thing for people to do, and any hypothetical new API proposed for Python 3.5 would do nothing to change what's right for Python 3.4 code (or Python 2/3

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 16 November 2013 00:04, Antoine Pitrou solip...@pitrou.net wrote: Rather than the more useful: >>> b"abcdef".decode("hex") Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use codecs.decode() to decode to arbitrary
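For comparison, here is a small sketch of the two call paths on a current Python 3. It assumes Python 3.4 or later, where bytes.decode() rejects non-text codecs with a LookupError rather than the TypeError wording quoted above; the variable names are illustrative only.

```python
import codecs

payload = b"616263"  # hex digits that spell b"abc"

# codecs.decode() accepts bytes-to-bytes codecs such as "hex".
decoded = codecs.decode(payload, "hex")

# bytes.decode() is reserved for text encodings, so it refuses "hex".
try:
    payload.decode("hex")
except LookupError as exc:
    refusal = str(exc)  # message points the caller at codecs.decode()
```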

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Stephen J. Turnbull
Walter Dörwald writes: On 15.11.2013 at 00:42, Serhiy Storchaka storch...@gmail.com wrote: 15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Ethan Furman
On 11/14/2013 11:13 PM, Nick Coghlan wrote: The proposal I posted to issue 7475 back in April (and, in the absence of any objections to the proposal, finally implemented over the past few weeks) was to take advantage of the fact that the codecs.encode and codecs.decode convenience functions

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Sat, 16 Nov 2013 00:46:15 +1000 Nick Coghlan ncogh...@gmail.com wrote: On 16 November 2013 00:04, Antoine Pitrou solip...@pitrou.net wrote: Rather than the more useful: >>> b"abcdef".decode("hex") Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'hex' decoder

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Walter Dörwald
On 15.11.2013 at 16:57, Stephen J. Turnbull step...@xemacs.org wrote: Walter Dörwald writes: On 15.11.2013 at 00:42, Serhiy Storchaka storch...@gmail.com wrote: 15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice,

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 16 Nov 2013 02:36, Antoine Pitrou solip...@pitrou.net wrote: On Sat, 16 Nov 2013 00:46:15 +1000 Nick Coghlan ncogh...@gmail.com wrote: On 16 November 2013 00:04, Antoine Pitrou solip...@pitrou.net wrote: Rather than the more useful: >>> b"abcdef".decode("hex") Traceback (most recent

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Victor Stinner
2013/11/16 Nick Coghlan ncogh...@gmail.com: To address Serhiy's security concerns with the compression codecs (which are technically independent of the question of restoring the aliases), I also plan to document how to systematically blacklist particular codecs in an application by setting

[Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Hi, I saw that Nick Coghlan documented codecs.encode() and codecs.decode(), and changed the exception raised when codecs like rot_13 are used on bytes.decode() and str.encode(). I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because the codecs API raises surprising errors doesn't sound correct. I would rather fix how codecs are used than modify the PyFrameObject. For more

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:34, Victor Stinner victor.stin...@gmail.com wrote: Hi, I saw that Nick Coghlan documented codecs.encode() and codecs.decode(), and changed the exception raised when codecs like rot_13 are used on bytes.decode() and str.encode(). I don't like the functions codecs.encode()

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:42, Victor Stinner victor.stin...@gmail.com wrote: Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because the codecs API raises surprising errors doesn't sound correct. I

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 01:03, Nick Coghlan wrote: We already do this check in the existing convenience methods - it raises TypeError. The problem with this check is that it happens *after* encoding/decoding. This opens the door to DoS (see my last message).
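One way to avoid doing unbounded work before any check runs is to bypass the one-shot codec and cap the output size directly. The sketch below is not something proposed in the thread; it assumes the data is zlib-compressed, and the names MAX_OUTPUT and bounded_decompress() are hypothetical.

```python
import zlib

MAX_OUTPUT = 1024 * 1024  # hypothetical cap on decompressed size, in bytes


def bounded_decompress(data, limit=MAX_OUTPUT):
    """Decompress zlib data, refusing to produce more than limit bytes."""
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: receiving the extra byte proves the
    # full output would exceed the limit, without ever materialising it.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    return output
```

Unlike codecs.decode(data, "zlib"), which returns only after the whole payload has been expanded, this stops as soon as the cap is crossed.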

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 09:11, Nick Coghlan ncogh...@gmail.com wrote: On 15 Nov 2013 08:42, Victor Stinner victor.stin...@gmail.com wrote: Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a new attribute. If the transform() method is added, I prefer to have only one transformation method

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 5:32 PM, Victor Stinner wrote: I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We try to avoid this in Python. Such dependence is common with arithmetic. >>> 1 + 2 3 >>> 1 + 2.0 3.0 >>> 1 + 2+0j
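The dependence of the result type on the codec name can be seen directly. This is a minimal sketch assuming a Python 3 interpreter with the standard "utf-8", "hex" and "rot_13" codecs; the variable names are illustrative only.

```python
import codecs

# Text encoding: str in, bytes out.
as_utf8 = codecs.encode("abc", "utf-8")

# Binary transform: bytes in, bytes out.
as_hex = codecs.encode(b"abc", "hex")

# Text transform: str in, str out.
as_rot13 = codecs.encode("abc", "rot_13")

# The same function yields different result types depending on the codec.
result_types = [type(value).__name__ for value in (as_utf8, as_hex, as_rot13)]
```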

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 6:03 PM, Nick Coghlan wrote: You have to get it out of your head that codecs are just about text and binary data. 99+% of the current codec module doc leads one to that impression. The fact that codecs are expected to have a file reader and writer and that the default

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Walter Dörwald
On 15.11.2013 at 00:42, Serhiy Storchaka storch...@gmail.com wrote: 15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a new attribute. If the transform()

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 November 2013 11:10, Terry Reedy tjre...@udel.edu wrote: On 11/14/2013 5:32 PM, Victor Stinner wrote: I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We try to avoid this in Python. Such