[Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Antoine Pitrou
Hello, I've just noticed that in py3k, the decoding functions in the codecs module accept str objects as well as bytes:

    >>> import codecs
    >>> c = codecs.getdecoder('utf8')
    >>> c('aa')
    ('aa', 2)
    >>> c('éé')
    ('éé', 4)
    >>> c = codecs.getdecoder('latin1')
    >>> c('aa')
    ('aa', 2)
    >>> c('éé')
    ('éé', 4)
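For comparison, here is a minimal sketch of the same calls on a recent CPython 3. That choice of interpreter is an assumption; the report above concerns 3.0. The helper name try_decode is hypothetical and exists only for illustration. On such an interpreter the utf8 and latin1 decoders take bytes-like input and should reject str with a TypeError.

```python
import codecs

def try_decode(codec_name, data):
    """Hypothetical helper: return the decoder's (text, consumed) tuple,
    or the string 'TypeError' if the decoder rejects the input type."""
    decoder = codecs.getdecoder(codec_name)
    try:
        return decoder(data)
    except TypeError as exc:
        return type(exc).__name__

# bytes input: the supported direction for a bytes-to-text codec
utf8_from_bytes = try_decode('utf8', 'éé'.encode('utf8'))
latin1_from_bytes = try_decode('latin1', 'éé'.encode('latin1'))

# str input: the case reported in this thread
utf8_from_str = try_decode('utf8', 'éé')
latin1_from_str = try_decode('latin1', 'éé')

print(utf8_from_bytes, latin1_from_bytes, utf8_from_str, latin1_from_str)
```

The consumed count in each tuple is the number of input bytes, which is why the utf8 and latin1 results differ for the same two characters.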

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Guido van Rossum
Sounds like yet another remnant of the old philosophy, which indeed supported encode and decode operations on both string types. :-( On Wed, Jan 7, 2009 at 5:39 AM, Antoine Pitrou solip...@pitrou.net wrote: Hello, I've just noticed that in py3k, the decoding functions in the codecs module

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Antoine Pitrou
Guido van Rossum guido at python.org writes: Sounds like yet another remnant of the old philosophy, which indeed supported encode and decode operations on both string types. How do we go for fixing it? Is it ok to raise a TypeError in 3.0.1?

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Guido van Rossum
That depends a bit on how much code we find that breaks as a result. If you find you have to do a big cleanup in the stdlib after that change, it's likely that 3rd party code could have the same problem, and I'd be reluctant. I'd be okay with adding a warning in that case. OTOH if there's no

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Benjamin Peterson
On Wed, Jan 7, 2009 at 9:46 AM, Guido van Rossum gu...@python.org wrote: A -3 warning should be added to 2.6 about this too IMO. A Py3k warning when attempting to decode a unicode string? Wouldn't that open the door to adding warnings to everywhere a unicode string is used where a byte string

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread M.-A. Lemburg
On 2009-01-07 16:34, Guido van Rossum wrote: Sounds like yet another remnant of the old philosophy, which indeed supported encode and decode operations on both string types. :-( No, that's something I explicitly readded to Python 3k, since the codecs interface is independent of the input and

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Aahz
On Wed, Jan 07, 2009, Antoine Pitrou wrote: Guido van Rossum guido at python.org writes: Sounds like yet another remnant of the old philosophy, which indeed supported encode and decode operations on both string types. How do we go for fixing it? Is it ok to raise a TypeError in 3.0.1?

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Antoine Pitrou
M.-A. Lemburg mal at egenix.com writes: No, that's something I explicitly readded to Python 3k, since the codecs interface is independent of the input and output types (the codecs decide which combinations to support). But why would the utf8 decoder accept unicode as input?
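To illustrate the point that each codec decides which input and output types it supports, here is a small sketch using three codecs shipped with recent CPython 3 releases. Their availability is an assumption about the interpreter in use and says nothing about what 3.0 shipped. The rot_13 codec maps str to str, hex_codec maps bytes to bytes, and utf_8 maps bytes to str.

```python
import codecs

# str -> str: the rot_13 text transform
rot13_result = codecs.getdecoder('rot_13')('uryyb')

# bytes -> bytes: the hex binary transform
hex_result = codecs.getdecoder('hex_codec')(b'6161')

# bytes -> str: an ordinary text encoding
utf8_result = codecs.getdecoder('utf_8')(b'aa')

for name, result in [('rot_13', rot13_result),
                     ('hex_codec', hex_result),
                     ('utf_8', utf8_result)]:
    decoded, consumed = result
    print(name, type(decoded).__name__, decoded, consumed)
```

All three go through the same codecs.getdecoder() lookup, so the generic interface itself cannot enforce one type pairing; any restriction has to live in the individual codec.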

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread M.-A. Lemburg
On 2009-01-07 19:32, Antoine Pitrou wrote: M.-A. Lemburg mal at egenix.com writes: No, that's something I explicitly readded to Python 3k, since the codecs interface is independent of the input and output types (the codecs decide which combinations to support). But why would the utf8

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Guido van Rossum
OK, ignore my previous comment. Sounds like the individual codecs need to tighten their type checking though -- perhaps *that* can be fixed in 3.0.1? I really don't see why any codec used to convert between text and bytes should support its output type as input. --Guido On Wed, Jan 7, 2009 at
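The kind of per-codec type check suggested here can be sketched in pure Python as a wrapper around an existing decoder. This is only an illustration under stated assumptions, not the fix that was applied in CPython: strict_text_decoder is a hypothetical helper, and the real decoders are implemented in C.

```python
import codecs

def strict_text_decoder(codec_name):
    """Hypothetical helper: wrap a bytes-to-text decoder so that it
    refuses its own output type (str) as input."""
    decoder = codecs.getdecoder(codec_name)

    def decode(data, errors='strict'):
        # Reject str up front instead of silently accepting it.
        if isinstance(data, str):
            raise TypeError(
                f"{codec_name} decoder expects a bytes-like object, not str")
        return decoder(data, errors)

    return decode

decode_utf8 = strict_text_decoder('utf8')
ok_result = decode_utf8(b'aa')

try:
    decode_utf8('aa')
    rejected = False
except TypeError:
    rejected = True

print(ok_result, rejected)
```

Placing the check in a wrapper keeps the generic codec lookup untouched, so codecs that legitimately map str to str are unaffected.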

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Brett Cannon
On Wed, Jan 7, 2009 at 10:57, M.-A. Lemburg m...@egenix.com wrote: [SNIP] BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just like all Unicode support source files). Someone apparently has added tabs when adding support for Py_buffers. It looks like this formatting mix-up is

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Terry Reedy
Guido van Rossum wrote: OK, ignore my previous comment. Sounds like the individual codecs need to tighten their type checking though -- perhaps *that* can be fixed in 3.0.1? I really don't see why any codec used to convert between text and bytes should support its output type as input. --Guido

Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Collin Winter
On Wed, Jan 7, 2009 at 2:35 PM, Brett Cannon br...@python.org wrote: On Wed, Jan 7, 2009 at 10:57, M.-A. Lemburg m...@egenix.com wrote: [SNIP] BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just like all Unicode support source files). Someone apparently has added tabs when