Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
Antoine Pitrou writes:
> In which cases is this true? Hex is rarely used for ASCII-encoding of
> binary data, precisely because its efficiency is poor.
MIME quoted-printable, URL-quoting, and XBM come to mind.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
Victor Stinner wrote:
> I suppose that each codec will have a different list of accepted input and
> output types. Example:
>
> bz2: encode:bytes->bytes, decode:bytes->bytes
> rot13: encode:str->str, decode:str->str
> hex: encode:bytes->str, decode: str->bytes
A user point of view: please NO.
This might be more consistent with the semantics, but it forces users to scratch
their head each time to find out which types are involved. I'd rather all
methods take and return the same types, independent of codec, that is:
.encode : str->bytes
.decode : bytes->str
.(un)transform : same type, str->str or bytes->bytes
All other uses can be trivially done with .encode('ascii')/.decode('ascii').
Changing the type of *ascii* text is easy, understanding bytes vs str semantics
is not!
Cheers,
B.
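A rough sketch of the convention proposed above, using only calls that exist in Python 3 today. The .transform()/.untransform() methods under discussion are hypothetical here; codecs.encode()/codecs.decode() stand in for them, and the codec names "rot_13" and "hex_codec" are assumed to be the stdlib names.

```python
import codecs

# .encode: str -> bytes, .decode: bytes -> str (character encodings only).
raw = "héllo".encode("utf-8")
text = raw.decode("utf-8")

# Same-type transforms: str -> str (rot13) and bytes -> bytes (hex).
# codecs.encode()/codecs.decode() stand in for the proposed
# .transform()/.untransform() methods, which are hypothetical.
scrambled = codecs.encode("Hello", "rot_13")
hexed = codecs.encode(b"\x01\xff", "hex_codec")

# Wanting the hex digits as text is then one extra, explicit ASCII step.
hex_text = hexed.decode("ascii")
back = codecs.decode(hex_text.encode("ascii"), "hex_codec")
```

The point of the sketch is that the type of each result can be predicted from the method alone, without knowing which codec is involved.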
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
On 09.06.10 14:47, Nick Coghlan wrote:
> On 09/06/10 22:18, Victor Stinner wrote:
> >> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
> >>> No, .transform() and .untransform() will be the interface to same-type
>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>> .encode()/.decode() these helper methods also implement type safety
>>> of the return type.
>>
>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>> Should we use codecs.encode() / codecs.decode() for these types?
>
> There are probably enough subtleties that this is all worth specifying
> in a PEP:
>
> - which codecs from 2.x are to be restored
> - the domain each codec operates in (binary data or text)*
> - review behaviour of codecs.encode and codecs.decode
> - behaviour of the new str, bytes and bytearray (un)transform methods
> - whether to add helper methods for reverse codecs (like base64)
>
> The PEP would also serve as a reference back to both this discussion and
> the previous one (which was long enough ago that I've forgotten most of it).
I too think that a PEP is required here.
Codecs support several types of error handling that don't make sense for
transform()/untransform(). What should 'abc'.decode('hex', 'replace')
do? (In 2.6 it raises an assertion error, because errors *must* be strict).
I think we should take this opportunity to implement
transform/untransform without being burdened with features we inherited
from codecs which don't make sense for transform/untransform.
> [...]
Servus,
Walter
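On the quoted question about buffer-compatible objects, here is a small sketch of what the function-level API does with them. It assumes a recent Python 3 in which the binary codecs accept any bytes-like object, and it uses "hex_codec" as the codec name; neither is settled by this thread.

```python
import array
import codecs

payload = b"abc"

# Three buffer-compatible views of the same three bytes.
inputs = [bytearray(payload), memoryview(payload), array.array("B", payload)]

# codecs.encode() takes the object as its first argument, so it is not
# tied to the methods available on str, bytes or bytearray.
results = [codecs.encode(obj, "hex_codec") for obj in inputs]
```

Under those assumptions each call returns a plain bytes object regardless of the input type, which is one possible answer to the return-type question for such objects.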
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
Walter Dörwald wrote:
> On 09.06.10 14:47, Nick Coghlan wrote:
>
>> On 09/06/10 22:18, Victor Stinner wrote:
>>> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>>>> No, .transform() and .untransform() will be the interface to same-type
>>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>>> .encode()/.decode() these helper methods also implement type safety
>>>> of the return type.
>>>
>>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>>> Should we use codecs.encode() / codecs.decode() for these types?
>>
>> There are probably enough subtleties that this is all worth specifying
>> in a PEP:
>>
>> - which codecs from 2.x are to be restored
>> - the domain each codec operates in (binary data or text)*
>> - review behaviour of codecs.encode and codecs.decode
>> - behaviour of the new str, bytes and bytearray (un)transform methods
>> - whether to add helper methods for reverse codecs (like base64)
>>
>> The PEP would also serve as a reference back to both this discussion and
>> the previous one (which was long enough ago that I've forgotten most of it).
>
> I too think that a PEP is required here.
Fair enough. I'll write a PEP.
> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do? (In 2.6 it raises an assertion error, because errors *must* be strict).
That's not really an issue since codecs don't have to implement
all error handling schemes.
For starters, they will all only implement 'strict' mode.
> I think we should take this opportunity to implement
> transform/untransform without being burdened with features we inherited
> from codecs which don't make sense for transform/untransform.
Not sure what you mean here. Those methods are just helper methods
which interface to the codec system and provide return type safety.
Nothing more or less.
--
Marc-Andre Lemburg
eGenix.com
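A minimal sketch of what "helper methods which interface to the codec system and provide return type safety" could look like. The functions transform() and untransform() are hypothetical stand-ins written for illustration, not the proposed methods themselves; only codecs.encode() and codecs.decode() are real stdlib calls, and the codec names are assumed.

```python
import codecs


def transform(data, codec):
    """Hypothetical helper: run a codec's encoder, require a same-type result."""
    result = codecs.encode(data, codec)
    if type(result) is not type(data):
        raise TypeError(
            "%r returned %s, expected %s"
            % (codec, type(result).__name__, type(data).__name__)
        )
    return result


def untransform(data, codec):
    """Hypothetical helper: run a codec's decoder, require a same-type result."""
    result = codecs.decode(data, codec)
    if type(result) is not type(data):
        raise TypeError(
            "%r returned %s, expected %s"
            % (codec, type(result).__name__, type(data).__name__)
        )
    return result


rotated = transform("abc", "rot_13")                 # str -> str
packed = transform(b"spam" * 10, "bz2_codec")        # bytes -> bytes
unpacked = untransform(packed, "bz2_codec")          # bytes -> bytes
```

With this check, passing a character encoding such as "utf-8" to transform() on a str fails with TypeError, because the encoder returns bytes rather than str.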
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
On Thursday 10 June 2010 12:30:01, Walter Dörwald wrote:
> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do?
You mean 'abc'.transform('hex', 'replace'), right?
An error handler is useful for encoding codecs (where the input type differs
from the output type), but I don't see how it can be used with hex, rot13,
bz2, ... (we decided that .transform() and .untransform() will use the same
input and output types). Even if bz2+xmlcharref can be something funny :-)
.transform() and .untransform() should have only one argument.
(If you would really like to play with the error handler, you can still use
codecs.encode(obj, name, errors) and codecs.decode(obj, name, errors).)
.transform() and .untransform() have to be simple. If you want to control the
codec, why not use the real API directly? Examples:
- base64.b64encode() has an optional altchars argument
- bz2.compress() has an optional compresslevel argument
- etc.
I don't see how altchars or compresslevel can be added to .transform() /
.untransform(). (**kw would be something really ugly.)
> (In 2.6 it raises an assertion error, because errors *must* be strict)
hex, bz2, rot13, ... codecs should also raise an error if errors is not
"strict" (or None which means "strict") in Python 3.
--
Victor Stinner
http://www.haypocalc.com/
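For reference, the two "real API" calls named in this message, with the optional arguments that a one-argument .transform() could not carry. The sample data is made up for illustration.

```python
import base64
import bz2

data = b"\xfb\xff"

# altchars replaces the "+" and "/" characters of the standard alphabet.
standard = base64.b64encode(data)
url_safe = base64.b64encode(data, altchars=b"-_")

blob = b"spam" * 1000

# compresslevel ranges from 1 (fastest) to 9 (smallest output).
fast = bz2.compress(blob, compresslevel=1)
small = bz2.compress(blob, compresslevel=9)
```

Both compressed forms decompress to the same bytes with bz2.decompress(); the level only changes how much work the compressor does.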
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
On Thu, 10 Jun 2010 12:27:33 +0200, Baptiste Carvello wrote:
> Victor Stinner wrote:
>
> > I suppose that each codec will have a different list of accepted input and
> > output types. Example:
>
> > bz2: encode:bytes->bytes, decode:bytes->bytes
> > rot13: encode:str->str, decode:str->str
> > hex: encode:bytes->str, decode: str->bytes
>
> A user point of view: please NO.
>
> This might be more consistent with the semantics, but it forces users to
> scratch their head each time to find out which types are involved. I'd rather
> all methods take and return the same types, independent of codec, that is:
>
> .encode : str->bytes
> .decode : bytes->str
> .(un)transform : same type, str->str or bytes->bytes
>
> All other uses can be trivially done with .encode('ascii')/.decode('ascii').
>
> Changing the type of *ascii* text is easy, understanding bytes vs str
> semantics is not!
+1
Consistency in interface is more important in *this* context than the
sensibleness of any particular transform.
--
R. David Murray www.bitdance.com
Re: [Python-Dev] Future of 2.x.
On Jun 10, 2010, at 09:01 AM, Steve Holden wrote:
>The current stumbling block isn't the language itself, it's the lack of
>support from third-party libraries. GSoC is addressing some of these
>issues, but so far we (the PSF, the dev community, anybody else except
>R. David Murray) haven't really come to grips with intractable problems
>like the broken state of the email package, and we are not doing well at
>attracting funds to support it.
>
>So I think we need to address a larger issue than just the language. As
>a development community we decided to change the language. Now we have
>to do what we can to ensure that the changed language has appropriate
>support.
This is exactly my point - I totally agree. Let's take all that pent up
energy and apply it to porting important libraries to Python 3.
-Barry
Re: [Python-Dev] Future of 2.x.
On 6/10/2010 2:48 AM, Senthil Kumaran wrote:
On Thu, Jun 10, 2010 at 6:40 AM, Alexandre Vassalotti wrote:
On Wed, Jun 9, 2010 at 1:23 PM, "Martin v. Löwis" wrote:
Closing the backport requests is fine. For the feature requests, I'd only
close them *after* the 2.7 release (after determining that they won't apply
to 3.x, of course). There aren't that many backport requests, anyway, are
there?
There is only a few requests (about five)
I get your point. It is the 'back-ports' that you have tagged.
Right, things already in 3.x.
> These were designed for 3.x and implemented in 3.x in the first place.
I was concerned that there will be policy drawn or a practice that will
close any/every existing Feature Request in Python 2.7. There are some cases
(in stdlib) which can be debated on the lines of feature request vs bug-fix
and those will get hurt in the process.
I have started going through old open issues tagged with 2.5. Many are
unclassified. Those that are feature requests that are *plausible* for 3.2 I
am marking as such and retagging for 3.2, *not* closing. (I am also marking
bug reports as such and asking the OP to test in 2.6/7 and maybe 3.1 if I
cannot easily do so.) Ideally, all core/stdlib feature requests should be
classified as such and tagged for 3.2 (or even 3.3) only.
Terry Jan Reedy
Re: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13, ... codecs
On 6/10/2010 7:08 AM, M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>>> The PEP would also serve as a reference back to both this discussion and
>>> the previous one (which was long enough ago that I've forgotten most of it).
>> I too think that a PEP is required here.
> Fair enough. I'll write a PEP.
Thank you from me.
>> Codecs support several types of error handling that don't make sense for
>> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
>> do? (In 2.6 it raises an assertion error, because errors *must* be strict).
I would expect either "ValueError: errors arg must be 'strict' for
transform" or else "TypeError: transform takes 1 arg, 2 given".
> That's not really an issue since codecs don't have to implement
> all error handling schemes.
> For starters, they will all only implement 'strict' mode.
Terry Jan Reedy
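The two outcomes suggested above can be sketched side by side. Both functions are hypothetical illustrations (the names transform_strict and transform_one_arg are invented here), with codecs.encode() doing the actual work and "hex_codec" assumed as the codec name.

```python
import codecs


def transform_strict(data, codec, errors="strict"):
    """Hypothetical variant 1: accept an errors argument, allow only strict."""
    if errors not in (None, "strict"):
        raise ValueError("errors arg must be 'strict' for transform")
    return codecs.encode(data, codec)


def transform_one_arg(data, codec):
    """Hypothetical variant 2: no errors parameter at all."""
    return codecs.encode(data, codec)


try:
    transform_strict(b"abc", "hex_codec", "replace")
except ValueError as exc:
    strict_failure = exc

try:
    transform_one_arg(b"abc", "hex_codec", "replace")
except TypeError as exc:
    arity_failure = exc
```

In the second variant Python itself raises the TypeError for the extra positional argument, so no explicit check is needed.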
