Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread Stephen J. Turnbull
Antoine Pitrou writes:

 > In which cases is this true? Hex is rarely used for ASCII-encoding of
 > binary data, precisely because its efficiency is poor.

MIME quoted-printable, URL-quoting, and XBM come to mind.
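
For the first two formats, the hex digits show up as per-byte escapes. A small sketch, assuming today's Python 3 standard library (the quopri and urllib.parse modules are chosen here for illustration and are not named in the message):

```python
import quopri
from urllib.parse import quote

raw = "é".encode("utf-8")   # two bytes: 0xC3 0xA9

# Quoted-printable writes each non-ASCII byte as "=" plus two hex digits.
qp = quopri.encodestring(raw)

# URL quoting writes each unsafe byte as "%" plus two hex digits.
url = quote("é")

print(qp)    # b'=C3=A9'
print(url)   # %C3%A9
```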

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread Baptiste Carvello

Victor Stinner wrote:

> I suppose that each codec will have a different list of accepted input and
> output types. Example:
>
>    bz2: encode:bytes->bytes, decode:bytes->bytes
>    rot13: encode:str->str, decode:str->str
>    hex: encode:bytes->str, decode: str->bytes


A user point of view: please NO.

This might be more consistent with the semantics, but it forces users to scratch 
their head each time to find out which types are involved. I'd rather all 
methods take and return the same types, independent of codec, that is:


.encode : str->bytes
.decode : bytes->str
.(un)transform : same type, str->str or bytes->bytes

All other uses can be trivially done with .encode('ascii')/.decode('ascii'). 
Changing the type of *ascii* text is easy, understanding bytes vs str semantics 
is not!
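
A minimal sketch of that last point, assuming hex is treated as a bytes->bytes transform. binascii.hexlify/unhexlify stand in here for the proposed .transform('hex')/.untransform('hex') methods, which are only a proposal in this thread:

```python
import binascii

data = b"\x00\xffabc"

# Same-type transform: bytes in, bytes out.
hex_bytes = binascii.hexlify(data)       # b'00ff616263'

# Need text instead? The result is pure ASCII, so changing type is one call.
hex_text = hex_bytes.decode("ascii")     # '00ff616263'

# And back again: str -> bytes, then the reverse transform.
restored = binascii.unhexlify(hex_text.encode("ascii"))
assert restored == data
```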


Cheers,
B.



Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread Walter Dörwald
On 09.06.10 14:47, Nick Coghlan wrote:

> On 09/06/10 22:18, Victor Stinner wrote:
>> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>>> No, .transform() and .untransform() will be the interface to same-type
>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>> .encode()/.decode() these helper methods also implement type safety
>>> of the return type.
>>
>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>> Should we use codecs.encode() / codecs.decode() for these types?
> 
> There are probably enough subtleties that this is all worth specifying 
> in a PEP:
> 
> - which codecs from 2.x are to be restored
> - the domain each codec operates in (binary data or text)*
> - review behaviour of codecs.encode and codecs.decode
> - behaviour of the new str, bytes and bytearray (un)transform methods
> - whether to add helper methods for reverse codecs (like base64)
> 
> The PEP would also serve as a reference back to both this discussion and 
> the previous one (which was long enough ago that I've forgotten most of it).

I too think that a PEP is required here.

Codecs support several types of error handling that don't make sense for
transform()/untransform(). What should 'abc'.decode('hex', 'replace')
do? (In 2.6 it raises an assertion error, because errors *must* be strict).

I think we should take this opportunity to implement
transform/untransform without being burdened with features we inherited
from codecs which don't make sense for transform/untransform.
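
To illustrate why 'replace' has no obvious meaning here: 'abc' has an odd number of hex digits, so there is no single bad character to substitute. A sketch under the assumption that binascii.unhexlify is a fair stand-in for the hex codec's decoder (try_unhex is a hypothetical helper name):

```python
import binascii

def try_unhex(data):
    """Return the decoded bytes, or an error string if the input is not valid hex."""
    try:
        return binascii.unhexlify(data)
    except binascii.Error as exc:
        return "error: %s" % exc

print(try_unhex(b"616263"))   # valid input decodes to b'abc'
print(try_unhex(b"abc"))      # odd number of digits: rejected
print(try_unhex(b"zz"))       # not hex digits: rejected
```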

> [...]

Servus,
   Walter


Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread M.-A. Lemburg
Walter Dörwald wrote:
> On 09.06.10 14:47, Nick Coghlan wrote:
> 
>> On 09/06/10 22:18, Victor Stinner wrote:
>>> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>>>> No, .transform() and .untransform() will be the interface to same-type
>>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>>> .encode()/.decode() these helper methods also implement type safety
>>>> of the return type.
>>>
>>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>>> Should we use codecs.encode() / codecs.decode() for these types?
>>
>> There are probably enough subtleties that this is all worth specifying 
>> in a PEP:
>>
>> - which codecs from 2.x are to be restored
>> - the domain each codec operates in (binary data or text)*
>> - review behaviour of codecs.encode and codecs.decode
>> - behaviour of the new str, bytes and bytearray (un)transform methods
>> - whether to add helper methods for reverse codecs (like base64)
>>
>> The PEP would also serve as a reference back to both this discussion and 
>> the previous one (which was long enough ago that I've forgotten most of it).
> 
> I too think that a PEP is required here.

Fair enough. I'll write a PEP.

> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do? (In 2.6 it raises an assertion error, because errors *must* be strict).

That's not really an issue since codecs don't have to implement
all error handling schemes.

For starters, they will all only implement 'strict' mode.

> I think we should take this opportunity to implement
> transform/untransform without being burdened with features we inherited
> from codecs which don't make sense for transform/untransform.

Not sure what you mean here. Those methods are just helper methods
which interface to the codec system and provide return type safety.
Nothing more or less.
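
A rough sketch of what "interface to the codec system plus return type safety" could look like. The function name transform and its exact check are assumptions made for illustration, not the proposed implementation; the codec names rot_13 and hex_codec are the ones shipped with current Python 3:

```python
import codecs

def transform(obj, name):
    """Hypothetical helper: run a codec and insist the result has the input's type."""
    result = codecs.encode(obj, name)
    if type(result) is not type(obj):
        raise TypeError(
            "%r codec returned %s, expected %s"
            % (name, type(result).__name__, type(obj).__name__)
        )
    return result

print(transform("Hello", "rot_13"))      # Uryyb
print(transform(b"abc", "hex_codec"))    # b'616263'

try:
    transform("abc", "utf-8")            # str -> bytes belongs to .encode()
except TypeError as exc:
    print(exc)
```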

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread Victor Stinner
On Thursday 10 June 2010 12:30:01, Walter Dörwald wrote:
> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do?

You mean 'abc'.transform('hex', 'replace'), right?

An error handler is useful for encoding codecs (where the input type differs from 
the output type), but I don't see how it can be used with hex, rot13, bz2, ... 
(we decided that .transform() and .untransform() will use the same input and 
output types). Even if bz2+xmlcharref could be something funny :-)

.transform() and .untransform() should have only one argument.

(If you would really like to play with the error handler, you can still use 
codecs.encode(obj, name, errors) and codecs.decode(obj, name, errors).)

.transform() and .untransform() have to be simple. If you want to control the 
codec, why not use the real API directly? Examples:
 - base64.b64encode() has an optional altchars argument
 - bz2.compress() has an optional compresslevel argument
 - etc.

I don't see how altchars or compresslevel can be added to .transform() / 
.untransform(). (**kw would be something really ugly.)
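
For reference, a short sketch of the two module-level APIs with their optional arguments, as they exist in the Python 3 standard library (the sample data is made up for illustration):

```python
import base64
import bz2

data = b"\xfb\xff"

# altchars swaps the two non-alphanumeric characters of the base64 alphabet.
standard = base64.b64encode(data)                   # b'+/8='
url_safe = base64.b64encode(data, altchars=b"-_")   # b'-_8='

# compresslevel selects the compression level (1 = least, 9 = most).
payload = b"spam " * 1000
fast = bz2.compress(payload, compresslevel=1)
small = bz2.compress(payload, compresslevel=9)
assert bz2.decompress(fast) == bz2.decompress(small) == payload
```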

> (In 2.6 it raises an assertion error, because errors *must* be strict)

hex, bz2, rot13, ... codecs should also raise an error if errors is not 
"strict" (or None which means "strict") in Python 3.

-- 
Victor Stinner
http://www.haypocalc.com/


Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread R. David Murray
On Thu, 10 Jun 2010 12:27:33 +0200, Baptiste Carvello wrote:
> Victor Stinner wrote:
> 
> > I suppose that each codec will have a different list of accepted input and
> > output types. Example:
> 
> >    bz2: encode:bytes->bytes, decode:bytes->bytes
> >    rot13: encode:str->str, decode:str->str
> >    hex: encode:bytes->str, decode: str->bytes
> 
> A user point of view: please NO.
> 
> This might be more consistent with the semantics, but it forces users to
> scratch their head each time to find out which types are involved. I'd
> rather all methods take and return the same types, independent of codec,
> that is:
> 
> .encode : str->bytes
> .decode : bytes->str
> .(un)transform : same type, str->str or bytes->bytes
> 
> All other uses can be trivially done with .encode('ascii')/.decode('ascii').
> Changing the type of *ascii* text is easy, understanding bytes vs str
> semantics is not!

+1

Consistency in interface is more important in *this* context than the
sensibleness of any particular transform.

--
R. David Murray  www.bitdance.com


Re: [Python-Dev] Future of 2.x.

2010-06-10 Thread Barry Warsaw
On Jun 10, 2010, at 09:01 AM, Steve Holden wrote:

>The current stumbling block isn't the language itself, it's the lack of
>support from third-party libraries. GSoC is addressing some of these
>issues, but so far we (the PSF, the dev community, anybody else except
>R. David Murray) haven't really come to grips with intractable problems
>like the broken state of the email package, and we are not doing well at
>attracting funds to support it.
>
>So I think we need to address a larger issue than just the language. As
>a development community we decided to change the language. Now we have
>to do what we can to ensure that the changed language has appropriate
>support.

This is exactly my point - I totally agree.  Let's take all that pent-up
energy and apply it to porting important libraries to Python 3.

-Barry




Re: [Python-Dev] Future of 2.x.

2010-06-10 Thread Terry Reedy

On 6/10/2010 2:48 AM, Senthil Kumaran wrote:
> On Thu, Jun 10, 2010 at 6:40 AM, Alexandre Vassalotti wrote:
>> On Wed, Jun 9, 2010 at 1:23 PM, "Martin v. Löwis" wrote:
>>> Closing the backport requests is fine. For the feature requests, I'd only
>>> close them *after* the 2.7 release (after determining that they won't apply
>>> to 3.x, of course).
>>>
>>> There aren't that many backport requests, anyway, are there?
>>
>> There are only a few requests (about five)
>
> I get your point. It is the 'back-ports' that you have tagged.

Right, things already in 3.x.

> These were designed for 3.x and implemented in 3.x in the first place.
> I was concerned that there will be a policy drawn up or a practice that
> will close any/every existing Feature Request in Python 2.7.
> There are some cases (in stdlib) which can be debated on the lines of
> feature request vs bug-fix and those will get hurt in the process.


I have started going through old open issues tagged with 2.5. Many are 
unclassified. Those that are feature requests that are *plausible* for 
3.2 I am marking as such and retagging for 3.2, *not* closing. (I am 
also marking bug reports as such and asking the OP to test in 2.6/7 and 
maybe 3.1 if I cannot easily do so.)


Ideally, all core/stdlib feature requests should be classified as such 
and tagged for 3.2 (or even 3.3) only.


Terry Jan Reedy




Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs

2010-06-10 Thread Terry Reedy

On 6/10/2010 7:08 AM, M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>>> The PEP would also serve as a reference back to both this discussion and
>>> the previous one (which was long enough ago that I've forgotten most of it).
>>
>> I too think that a PEP is required here.
>
> Fair enough. I'll write a PEP.


Thank you from me.



>> Codecs support several types of error handling that don't make sense for
>> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
>> do? (In 2.6 it raises an assertion error, because errors *must* be strict).


I would expect either ValueError: errors arg must be 'strict' for 
transform or else TypeError: transform takes 1 arg, 2 given.
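
The two behaviours can be sketched side by side. Both function names below are hypothetical, and codecs.encode with the hex_codec name is used only as a stand-in for whatever the real methods would call:

```python
import codecs

def transform_strict_only(data, name, errors="strict"):
    """Variant 1: accept an errors argument, but reject everything except 'strict'."""
    if errors != "strict":
        raise ValueError("errors arg must be 'strict' for transform")
    return codecs.encode(data, name)

def transform_one_arg(data, name):
    """Variant 2: no errors parameter at all, so an extra argument is a TypeError."""
    return codecs.encode(data, name)

for call in (
    lambda: transform_strict_only(b"abc", "hex_codec", "replace"),
    lambda: transform_one_arg(b"abc", "hex_codec", "replace"),
):
    try:
        call()
    except (ValueError, TypeError) as exc:
        print(type(exc).__name__, exc)
```

The second variant needs no extra code: Python itself raises the TypeError when the signature has no slot for the argument.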



> That's not really an issue since codecs don't have to implement
> all error handling schemes.
>
> For starters, they will all only implement 'strict' mode.


Terry Jan Reedy

