[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2014-03-13 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d7950e916f20 by R David Murray in branch '3.3':
#7475: Remove references to '.transform' from transform codec docstrings.
http://hg.python.org/cpython/rev/d7950e916f20

New changeset 83d54ab5c696 by R David Murray in branch 'default':
Merge #7475: Remove references to '.transform' from transform codec docstrings.
http://hg.python.org/cpython/rev/83d54ab5c696

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7475
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2014-01-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Docstrings for new codecs mention bytes.transform() and bytes.untransform() 
which are nonexistent.

--
nosy: +serhiy.storchaka




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2014-01-02 Thread Jakub Wilk

Changes by Jakub Wilk jw...@jwilk.net:


--
nosy: +jwilk




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-22 Thread Nick Coghlan

Nick Coghlan added the comment:

The 3.4 portion of issue 19619 has been addressed, so removing it as a 
dependency again.

--
dependencies:  -Blacklist base64, hex, ... codecs from bytes.decode() and 
str.encode()




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-22 Thread Nick Coghlan

Nick Coghlan added the comment:

With issue 19619 resolved for Python 3.4 (the issue itself remains open 
awaiting a backport to 3.3), Victor has softened his stance on this topic and 
given the go-ahead to restore the codec aliases: 
http://bugs.python.org/issue19619#msg203897

I'll be committing this shortly, after adjusting the patch to account for the 
issue 19619 changes to the tests and What's New.

--
assignee:  -> ncoghlan
versions: +Python 3.4 -Python 3.5




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-22 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 5e960d2c2156 by Nick Coghlan in branch 'default':
Close #7475: Restore binary & text transform codecs
http://hg.python.org/cpython/rev/5e960d2c2156

--
nosy: +python-dev
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-22 Thread Nick Coghlan

Nick Coghlan added the comment:

Note that I still plan to do a documentation-only PEP for 3.4, proposing some 
adjustments to the way the codecs module is documented, making "binary 
transform" and "text transform" defined terms in the glossary, etc.

I'll probably aim for beta 2 for that.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-21 Thread Nick Coghlan

Changes by Nick Coghlan ncogh...@gmail.com:


--
dependencies: +Blacklist base64, hex, ... codecs from bytes.decode() and 
str.encode()




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-19 Thread Nick Coghlan

Nick Coghlan added the comment:

Victor is still -1, so to Python 3.5 it goes.

--
versions: +Python 3.5 -Python 3.4




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-16 Thread Nick Coghlan

Nick Coghlan added the comment:

Attached patch restores the aliases for the binary and text transforms, adds a 
test to ensure they exist and restores the Aliases column to the relevant 
tables in the documentation. It also updates the relevant section in the What's 
New document.
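
A minimal sketch of an alias check along these lines, assuming Python 3.4 or later where the aliases are present. The alias-to-codec pairs are the names discussed in this issue, and `check_aliases` is a hypothetical helper, not the test from the attached patch:

```python
import codecs

# Short alias -> full codec name, using the names discussed in this issue.
ALIASES = {
    "base64": "base64_codec",
    "bz2": "bz2_codec",
    "hex": "hex_codec",
    "zlib": "zlib_codec",
    "rot13": "rot_13",
}


def check_aliases(aliases=ALIASES):
    """Return the aliases that do not resolve to the same codec as the full name."""
    broken = []
    for alias, full_name in aliases.items():
        try:
            same = codecs.lookup(alias).name == codecs.lookup(full_name).name
        except LookupError:
            # Either the alias or the codec itself is unavailable.
            same = False
        if not same:
            broken.append(alias)
    return broken


print(check_aliases())
```

An empty list means every alias resolved to the same registered codec as its full name.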

I also tweaked the wording in the docs to use the phrases "binary transform" 
and "text transform" for the affected tables and version added/changed notices.

Given the discussions on python-dev, the main condition that needs to be met 
before I commit this is for Victor to change his current -1 to a -0 or higher.

--
Added file: 
http://bugs.python.org/file32663/issue7475_restore_codec_aliases_in_py34.diff




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-10 Thread Nick Coghlan

Nick Coghlan added the comment:

Issue 17823 is now closed, but not because it has been implemented. It turns 
out that the data-driven nature of the incompatibility means it isn't really 
amenable to being detected and fixed automatically via 2to3.

Issue 19543 is a replacement proposal for the introduction of some additional 
codec-related Py3k warnings in Python 2.7.7.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-06 Thread Nick Coghlan

Nick Coghlan added the comment:

Providing the 2to3 fixers in issue 17823 now depends on this issue rather than 
the other way around (since not having to translate the names simplifies the 
fixer a bit).

--
dependencies:  -2to3 fixers for missing codecs




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-11-04 Thread Nick Coghlan

Nick Coghlan added the comment:

For anyone interested, I have a patch up on issue 17828 that produces the 
following output for various codec usage errors:

>>> import codecs
>>> codecs.encode(b"hello", "bz2_codec").decode("bz2_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bz2_codec' decoder returned 'bytes' instead of 'str'; use codecs.decode to decode to arbitrary types

>>> "hello".encode("bz2_codec")
TypeError: 'str' does not support the buffer interface

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid input type for 'bz2_codec' codec (TypeError: 'str' does not support the buffer interface)

>>> "hello".encode("rot_13")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'rot_13' encoder returned 'str' instead of 'bytes'; use codecs.encode to encode to arbitrary types
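
A small sketch of how calling code might cope with this split between the two APIs. It assumes the bz2 module is available; `method_decode_error` is a hypothetical helper, and the exact exception type and message raised by the method API vary between Python versions, so both LookupError and TypeError are caught:

```python
import codecs

data = b"hello"

# The function API handles the bytes-to-bytes codec in both directions.
compressed = codecs.encode(data, "bz2_codec")
restored = codecs.decode(compressed, "bz2_codec")


def method_decode_error(payload, codec_name):
    """Return the exception raised by bytes.decode for codec_name, or None."""
    try:
        payload.decode(codec_name)
    except (LookupError, TypeError) as exc:
        # Which of the two is raised depends on the Python version.
        return exc
    return None


error = method_decode_error(compressed, "bz2_codec")
print(restored, type(error).__name__)
```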

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-10-02 Thread Nick Coghlan

Nick Coghlan added the comment:

With issue 17839 fixed, the error from invoking the base64 codec through the 
method API is now substantially more sensible:

>>> b"ZXhhbXBsZQ==\n".decode("base64_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: decoder did not return a str object (type=bytes)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-10-02 Thread Nick Coghlan

Nick Coghlan added the comment:

I just wanted to note something I realised in chatting to Armin Ronacher 
recently: in both Python 2.x and 3.x, the encode/decode method APIs are 
constrained by the text model; it's just that in 2.x that model was effectively 
basestring<->basestring, and thus still covered every codec in the standard 
library. This greatly limited the use cases for the codecs.encode/decode 
convenience functions, which is why the fact they were undocumented went 
unnoticed.

In 3.x, the changed text model meant the method API became limited to the 
Unicode codecs, making the function-based API more important.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-10-02 Thread Nick Coghlan

Nick Coghlan added the comment:

We should fix the docs for the earlier versions as well.

--
versions: +Python 2.7, Python 3.3




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-10-02 Thread Nick Coghlan

Changes by Nick Coghlan ncogh...@gmail.com:


--
Removed message: http://bugs.python.org/msg198847




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-10-02 Thread Nick Coghlan

Changes by Nick Coghlan ncogh...@gmail.com:


--
versions:  -Python 2.7, Python 3.3




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-05-02 Thread Martin Morrison

Changes by Martin Morrison m...@ensoft.co.uk:


--
nosy: +isoschiz




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-25 Thread Nick Coghlan

Nick Coghlan added the comment:

Also adding 17839 as a dependency, since part of the reason the base64 errors 
in particular are so cryptic is because the base64 module doesn't accept 
arbitrary PEP 3118 compliant objects as input.

--
dependencies: +base64 module should use memoryview




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-25 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
dependencies: +2to3 fixers for missing codecs




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-25 Thread Nick Coghlan

Nick Coghlan added the comment:

I also created issue 17841 to cover the fact that the 3.3 documentation 
incorrectly states that these aliases still exist, even though they were 
removed before 3.2 was released.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-25 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
dependencies: +Add link to alternatives for bytes-to-bytes codecs




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-25 Thread Guido van Rossum

Changes by Guido van Rossum gu...@python.org:


--
nosy:  -gvanrossum




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Martin v . Löwis

Martin v. Löwis added the comment:

I don't see any point in merely bringing the codecs back, without any 
convenience API to use them. If I need to do

  import codecs
  result = codecs.getencoder("base64").encode(data)

I don't think people would actually prefer this over

  import base64
  result = base64.encodebytes(data)

It's (IMO) only the convenience method (.encode) that made people love these 
codecs.
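
For reference, a runnable sketch of the two spellings being compared, assuming Python 3.4 or later. In current CPython, codecs.getencoder returns the encoder function itself, and calling it yields an (output, length consumed) tuple, so the codec route takes one more step than the snippet above suggests:

```python
import base64
import codecs

data = b"example"

# codecs.getencoder returns the encoder function; calling it yields
# an (output, length_consumed) tuple.
encoder = codecs.getencoder("base64_codec")
via_codec, consumed = encoder(data)

# The dedicated module call for the same transformation.
via_module = base64.encodebytes(data)

print(via_codec, consumed, via_module)
```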

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Ezio Melotti

Ezio Melotti added the comment:

IMHO it's also a documentation problem.  Once people figure out that they can't 
use encode/decode anymore, it's not immediately clear what they should do 
instead.  By reading the codecs docs[0] it's not obvious that it can be done 
with codecs.getencoder(...).encode/decode, so people waste time finding a 
solution, get annoyed, and blame Python 3 because it removed a simple way to 
use these codecs without making clear what should be used instead.
FWIW I don't care about having to do an extra import, but indeed something 
simpler than codecs.getencoder(...).encode/decode would be nice.

[0]: http://docs.python.org/3/library/codecs.html

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Nick Coghlan

Nick Coghlan added the comment:

It turns out MAL added the convenience API I'm looking for back in 2004, it 
just didn't get documented, and is hidden behind the "from _codecs import *" 
call in the codecs.py source code:

http://hg.python.org/cpython-fullhistory/rev/8ea2cb1ec598

So, all the way from 2.4 to 2.7 you can write:

  from codecs import encode
  result = encode(data, "base64")

It works in 3.x as well, you just need to add the "_codec" to the end to 
account for the missing aliases:

>>> encode(b"example", "base64_codec")
b'ZXhhbXBsZQ==\n'
>>> decode(b"ZXhhbXBsZQ==\n", "base64_codec")
b'example'

Note that the convenience functions omit the extra checks that are part of the 
methods (although I admit the specific error here is rather quirky):

>>> b"ZXhhbXBsZQ==\n".decode("base64_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.2/encodings/base64_codec.py", line 20, in base64_decode
    return (base64.decodebytes(input), len(input))
  File "/usr/lib64/python3.2/base64.py", line 359, in decodebytes
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not memoryview

I'm going to create some additional issues, so this one can return to just 
being about restoring the missing aliases.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Just copying some details here about codecs.encode() and
codecs.decode() from python-dev:


Just as reminder: we have the general purpose
encode()/decode() functions in the codecs module:

import codecs
r13 = codecs.encode('hello world', 'rot-13')

These interface directly to the codec interfaces, without
enforcing type restrictions. The codec defines the supported
input and output types.
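
A minimal sketch of what "the codec defines the supported input and output types" means in practice, assuming Python 3.4 or later. `rejects` is a hypothetical helper that only reports whether a codec refuses a given input type:

```python
import codecs

# str -> str: the rot_13 codec accepts and returns text.
text_result = codecs.encode("hello world", "rot_13")

# bytes -> bytes: the hex_codec codec accepts and returns binary data.
binary_result = codecs.encode(b"hi", "hex_codec")


def rejects(value, codec_name):
    """Return True if the codec refuses this input type with a TypeError."""
    try:
        codecs.encode(value, codec_name)
    except TypeError:
        return True
    return False


print(text_result, binary_result)
print(rejects("hi", "hex_codec"), rejects(b"hi", "rot_13"))
```

The functions themselves impose no type restriction; the TypeError comes from the individual codec.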


As Nick found, these aren't documented, which is a documentation
bug (I probably forgot to add documentation back then).
They have been in Python since 2004:

http://hg.python.org/cpython-fullhistory/rev/8ea2cb1ec598

These APIs are nice for general purpose codec work and
that's why I added them back in 2004.

For the codecs in question, it would still be nice to have
a more direct way to access them via methods on the types
that you typically use them with.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Ezio Melotti

Ezio Melotti added the comment:

> It works in 3.x as well, you just need to add the "_codec" to the end
> to account for the missing aliases:

FTR this is because of ff1261a14573 (see #10807).

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Nick Coghlan

Nick Coghlan added the comment:

Issue 17827 covers adding documentation for codecs.encode and codecs.decode

Issue 17828 covers adding exception handling improvements for all encoding and 
decoding operations

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-24 Thread Nick Coghlan

Nick Coghlan added the comment:

For me, the killer argument *against* a method based API is memoryview (and, 
equivalently, array.array). It should be possible to use those as inputs for 
the bytes-bytes codecs, and once you endorse codecs.encode and codecs.decode 
for that use case, it's hard to justify adding more exclusive methods to the 
already broad bytes and bytearray APIs (particularly given the problems with 
conveying direction of conversion unambiguously).
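
A short sketch of the memoryview and array.array use case, assuming Python 3.4 or later with the zlib module available. It relies on the zlib codec accepting any object that supports the buffer protocol:

```python
import array
import codecs

payload = b"spam and eggs " * 10

# A memoryview over existing bytes; creating the view does not copy the data.
view = memoryview(payload)
compressed = codecs.encode(view, "zlib_codec")

# array.array also supports the buffer protocol.
arr = array.array("B", payload)
compressed_from_array = codecs.encode(arr, "zlib_codec")

# Decoding accepts a memoryview as well and returns bytes.
restored = codecs.decode(memoryview(compressed), "zlib_codec")

print(len(payload), len(compressed), restored == payload)
```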

By contrast, I think "the codecs functions are generic while the str, bytes and 
bytearray methods are specific to text encodings" is something we can explain 
fairly easily, thus allowing the aliases mentioned in this issue to be restored 
for use with the codecs module functions. To avoid reintroducing the quirky 
errors described in issue 10807, the encoding and decoding error messages 
should first be improved as discussed in issue 17828.

--
dependencies: +More informative error handling when encoding and decoding




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Florent Xicluna

Florent Xicluna added the comment:

Another rant, because it matters to many of us:
http://lucumr.pocoo.org/2012/8/11/codec-confusion/

IMHO, the solution to restore str.decode and bytes.encode and raise TypeError 
for improper use is probably the most obvious for the average user.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Ezio Melotti

Ezio Melotti added the comment:

-1
I see encoding as the process to go from text to bytes, and decoding the 
process to go from bytes to text, so (ab)using these terms for other kinds of 
conversions is not an option IMHO.

Anyway I think someone should write a PEP and list the possible options and 
their pros and cons, and then a decision can be taken on python-dev.

FTR in Python 2 you can use decode for bytes->text, text->text, bytes->bytes, 
and even text->bytes:
>>> u'DEADBEEF'.decode('hex')
'\xde\xad\xbe\xef'
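
For comparison, a minimal sketch of the same hex conversion in Python 3, assuming Python 3.4 or later. The generic codec function and the dedicated alternatives shown here give the same bytes:

```python
import binascii
import codecs

hex_digits = b"DEADBEEF"

# Generic codec function: bytes in, bytes out.
via_codec = codecs.decode(hex_digits, "hex_codec")

# Dedicated alternatives that do not go through the codec registry.
via_binascii = binascii.unhexlify(hex_digits)
via_fromhex = bytes.fromhex("DEADBEEF")

print(via_codec, via_binascii, via_fromhex)
```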

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread R. David Murray

R. David Murray added the comment:

transform/untransform has approval-in-principle, adding encode/decode to the 
type that doesn't have them has been explicitly (and repeatedly :) rejected.

(I don't know about anybody else, but at this point I have written code that 
assumes that if an object has an 'encode' method, calling it will get me a 
bytes, and vice versa with 'decode'...an assumption I know is not safe, but 
that I feel is useful duck typing in the contexts in which I used it.)

Nick wants a PEP, other people have said a PEP isn't necessary.  What is 
certainly necessary is for someone to pick up the ball and run with it.

--
nosy: +r.david.murray




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Florent Xicluna

Florent Xicluna added the comment:

I am not a native English speaker, but it seems that the common usage of 
encode/decode is wider than the restricted definition applied for Python 3.3:

Some examples:

* RFC 4648 specifies Base16, Base32, and Base64 Data Encodings
  http://tools.ietf.org/html/rfc4648

* About rot13: the same code can be used for encoding and decoding
  http://www.catb.org/~esr/jargon/html/R/rot13.html

* The Huffman coding is an entropy encoding algorithm (used for DEFLATE)
  http://en.wikipedia.org/wiki/Huffman_coding

* RFC 2616 lists (zlib's) deflate or gzip as encoding transformations
  http://tools.ietf.org/html/rfc2616#section-3.5


However, I acknowledge that there are valid reasons to choose a different verb 
too.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Ezio Melotti

Ezio Melotti added the comment:

While not strictly necessary, a PEP would certainly be useful and would help 
reach a consensus.  The PEP should provide a summary of the available 
options (transform/untransform, reintroducing encode/decode for bytes/str, 
maybe others), their intended behavior (e.g. is type(x.transform()) == type(x) 
always true?), and possible issues (e.g.  Should some transformations be 
limited to str or bytes?  Should rot13 work with both transform and 
untransform?).
Even if we all agreed on a solution, such a document would still be useful IMHO.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Nick Coghlan

Nick Coghlan added the comment:

+1 for someone stepping up to write a PEP on this if they would like to see the 
situation improved in 3.4.

transform/untransform has at least one core developer with an explicit -1 on 
the proposal at the moment (me).

We *definitely* need a generic object-object convenience API in the codecs 
module (codecs.decode, codecs.encode). I even accept that those two functions 
could be worthy of elevation to be new builtin functions.

I'm *far* from convinced that awkwardly named methods that only handle 
str-object, bytes-object and bytearray-object are a good idea. Should 
memoryview gain transform/untransform methods as well?

transform/untransform as proposed aren't even inverse operations, since they 
don't swap the valid input and output types (that is, transform is 
str/bytes/bytearray to arbitrary objects, while untransform is *also* 
str/bytes/bytearray to arbitrary objects. Inverses can't have a domain/range 
mismatch like that).

Those names are also ambiguous about which one corresponds to encoding and 
which to decoding. encode() and decode(), whether as functions in the codecs 
module or as builtins, have no such issue.

Personally, the more I think about it, the more I'm in favour of adding encode 
and decode as builtin functions for 3.4. If you want arbitrary object-to-object 
conversions, use the builtins; if you want strict str->bytes or 
bytes/bytearray->str, use the methods. Python 3 has been around long enough now, 
and Python 3.2 and 3.3 are sufficiently well known that I think we can add the 
full power builtins without people getting confused.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread R. David Murray

R. David Murray added the comment:

I was visualizing transform/untransform as being restricted to 
buffertype->bytes and stringtype->string, which at least for binascii-type 
transforms is all the modules support.  After all, you don't get to choose what 
type of object you get back from encode or decode.

A more generalized transformation (encode/decode) utility is also interesting, 
but how many non-string non-bytes transformations do we actually support?

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Nick Coghlan

Nick Coghlan added the comment:

If transform is a method, how do you plan to accept arbitrary buffer supporting 
types as input?

This is why I mentioned memoryview: it doesn't provide decode(), but there's no 
good reason you should have to copy the data from the view before decoding it. 
Similarly, you shouldn't have to make an unaltered copy before creating a 
compressed (or decompressed) copy.

With codecs.encode and codecs.decode as functions, supporting memoryview as an 
input for bytes->str decoding, binary->bytes encoding (e.g. gzip compression) 
and binary->bytes decoding (e.g. gzip decompression) is trivial. Ditto for 
array.array and anything else that supports the buffer protocol.

With transform/untransform as methods? No such luck.

And once you're using functions rather than methods, it's best to define the 
API as object -> object, and leave any type constraints up to the individual 
codecs (with the error handling improved to provide more context and a more 
meaningful exception type, as I described earlier in the thread).

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread R. David Murray

R. David Murray added the comment:

I agree with you.  transform/untransform are parallel to encode/decode, and I 
wouldn't expect them to exist on any type that didn't support either encode or 
decode.  They are convenience methods, just as encode/decode are.

I am also probably not invested enough in it to write the PEP :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7475
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Guido van Rossum

Guido van Rossum added the comment:

str.decode() and bytes.encode() are not coming back.

Any proposal had better take into account the API design rule that the *type* 
of a method's return value should not depend on the *value* of one of the 
arguments.  (The Python 2 design failed this test, and that's why we changed 
it.)

It is however fine to let the return type depend on one of the argument 
*types*.  So e.g. bytes.transform(enc) -> bytes and str.transform(enc) -> str 
are fine.  And so are e.g. transform(bytes, enc) -> bytes and transform(str, 
enc) -> str.  But a transform() taking bytes that can return either str or 
bytes depending on the encoding name would be a problem.
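
One way this rule could be enforced, sketched as a hypothetical `transform` function built on codecs.encode (assuming Python 3.4 or later). This is an illustration of the constraint only, not an API that exists in the standard library: text input must produce text, and binary input must produce binary data, whatever the codec name is.

```python
import codecs


def transform(data, codec_name):
    """Hypothetical helper: the output kind is fixed by the input type alone."""
    result = codecs.encode(data, codec_name)
    if isinstance(data, str):
        expected = str
    else:
        expected = (bytes, bytearray)
    if not isinstance(result, expected):
        # The codec name (a value) tried to change the return type: refuse.
        raise TypeError(
            "%r returned %s for %s input"
            % (codec_name, type(result).__name__, type(data).__name__)
        )
    return result


print(transform("hello", "rot_13"))
print(transform(b"hi", "hex_codec"))
```

With this check, transform("hello", "utf-8") raises TypeError, because a text encoding would turn str input into bytes.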

Personally I don't think transformations are so important or ubiquitous so as 
to deserve being made new bytes/str methods.  I'd be happy with a convenience 
function, for example transform(input, codecname), that would have to be 
imported from somewhere (maybe the codecs module).

My guess is that in almost all cases where people are demanding to say e.g.

  x = y.transform('rot13')

the codec name is a fixed literal, and they are really after minimizing the 
number of imports.  Personally, disregarding the extra import line, I think

  x = rot13.transform(y)

looks better though.  Such custom APIs also give the API designer (of the 
transformation) more freedom to take additional optional parameters affecting 
the transformation, offer a set of variants, or a richer API.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Georg Brandl

Georg Brandl added the comment:

FWIW, I'm not interested in seeing this added anymore.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7475
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Gregory P. Smith

Gregory P. Smith added the comment:

Consensus here appears to be that this is a bad idea... don't do this.

--
nosy: +gregory.p.smith
priority: high -> normal
resolution:  -> wont fix
stage:  -> committed/rejected
status: open -> closed




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Nick Coghlan

Nick Coghlan added the comment:

No, transform/untransform as methods are a bad idea, but these *codecs*
should definitely come back.

The minimal change needed for that to be feasible is to give errors raised
during encoding and decoding more context information (at least the codec
name and error mode, and switching to the right kind of error).

MAL also stated on python-dev that codecs.encode and codecs.decode already
exist, so it should just be a matter of documenting them properly.
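
For reference, a short sketch of how those two functions can be used with the non-text codecs. This assumes Python 3.4 or later and uses the explicit module-style codec names:

```python
import codecs

payload = b"spam and eggs " * 4

# bytes -> bytes compression codec
packed = codecs.encode(payload, "zlib_codec")
assert codecs.decode(packed, "zlib_codec") == payload

# bytes -> bytes (ASCII-armored) codec
armored = codecs.encode(payload, "base64_codec")
assert codecs.decode(armored, "base64_codec") == payload

# str -> str codec
scrambled = codecs.encode("Hello", "rot_13")
assert codecs.decode(scrambled, "rot_13") == "Hello"
```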

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Gregory P. Smith

Gregory P. Smith added the comment:

Okay, but I don't personally find any of these to be good ideas as codecs, 
given they don't have anything to do with translating between bytes<->unicode.

--
resolution: wont fix -> 
stage: committed/rejected -> 
status: closed -> open




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-23 Thread Nick Coghlan

Nick Coghlan added the comment:

The codecs module is generic, text encodings are just the most common use
case (hence the associated method API).

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-22 Thread Phil Connell

Changes by Phil Connell pconn...@gmail.com:


--
nosy: +pconnell




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-01 Thread Florent Xicluna

Changes by Florent Xicluna florent.xicl...@gmail.com:


--
nosy:  -flox




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2013-04-01 Thread Florent Xicluna

Changes by Florent Xicluna florent.xicl...@gmail.com:


--
nosy: +flox




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-09-12 Thread Uzume

Uzume added the comment:

Many have chimed in on this topic but I thought I would lend my stance--for 
whatever it is worth.

I also believe most of these do not fit the concept of a character codec, and 
some sort of transforms would likely be useful; however, most are fairly 
specialized (e.g., there should probably be a generalized compression library 
interface a la hashlib):

rot13: an (albeit simplistic) text cipher (str to str; though bytes to bytes 
could be argued, since many crypto functions do that)

zlib, bz2, etc. (lzma/xz should also be here): all bytes-to-bytes compression 
transforms

hex(adecimal), uu, base64, etc.: these more or less fit the description of a 
character codec as they map between bytes and str; however, I am not sure they 
are really the same thing, as these are basically doing a radix transformation 
to character symbols, and the mapping is not strictly from bytes to a single 
character and back as a true character codec seems to imply. As evidenced by 
int(), format(), bytes.fromhex(), float.hex(), float.fromhex(), etc., these 
are more generalized conversions for serializing strings of bits into a textual 
representation (possibly for human consumption).

I personally feel any type/class.hex(), etc. method would be better off as a 
format() style formatter if they are to exist in such a space at all (i.e., not 
in some more generalized conversion library, which we have but which, since 
3.x, could probably use an update and cleanup).
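
The radix conversions mentioned above already exist as builtins and do not go through the codec registry. A brief sketch of the round trips, using only standard builtins:

```python
# int <-> hexadecimal text
n = 255
hex_text = format(n, "x")
assert int(hex_text, 16) == n

# bytes <-> hexadecimal text
raw = bytes.fromhex("deadbeef")
assert "".join(format(b, "02x") for b in raw) == "deadbeef"

# float <-> hexadecimal text
f = 0.5
assert float.fromhex(f.hex()) == f
```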

--
nosy: +uzume




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-09-12 Thread Uzume

Changes by Uzume uz...@users.sourceforge.net:


--
nosy:  -uzume




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-08-25 Thread Nick Coghlan

Changes by Nick Coghlan ncogh...@gmail.com:


--
priority: release blocker -> high




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-07-14 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

FWIW, I've been thinking further about this recently and I think 
implementing this feature as builtin methods is the wrong way to approach it.

Instead, I propose the addition of codecs.encode and codecs.decode functions 
that are type neutral (leaving any type checks entirely up to the codecs 
themselves), while the str.encode and bytes.decode methods retain their current 
strict text model related type restrictions.

Also, I now think my previous proposal for nice error messages was massively 
over-engineered. A much simpler approach is to just replace the status quo:

>>> "".encode("bz2_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ncoghlan/devel/py3k/Lib/encodings/bz2_codec.py", line 17, in bz2_encode
    return (bz2.compress(input), len(input))
  File "/home/ncoghlan/devel/py3k/Lib/bz2.py", line 443, in compress
    return comp.compress(data) + comp.flush()
TypeError: 'str' does not support the buffer interface

with a better error with more context like:

UnicodeEncodeError: encoding='bz2_codec', errors='strict', 
codec_error=TypeError: 'str' does not support the buffer interface

A similar change would be straightforward on the decoding side.

This would be a good use case for __cause__, but the codec error should still 
be included in the string representation.
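
A rough sketch of that idea, using exception chaining so the original error stays available as __cause__ while the message carries the codec name and error mode. CodecError and encode_with_context() are hypothetical names for this illustration (the comment above suggests UnicodeEncodeError; a plain wrapper class is used here to keep the sketch simple):

```python
import codecs

class CodecError(Exception):
    """Hypothetical wrapper exception; not part of the standard library."""

def encode_with_context(obj, encoding, errors="strict"):
    # Unknown codec names still raise a plain LookupError from here.
    codecs.lookup(encoding)
    try:
        return codecs.encode(obj, encoding, errors)
    except Exception as exc:
        message = "encoding=%r, errors=%r, codec_error=%s: %s" % (
            encoding, errors, type(exc).__name__, exc)
        # "from exc" stores the original exception in __cause__.
        raise CodecError(message) from exc

try:
    encode_with_context("abc", "hex_codec")   # str passed to a bytes codec
except CodecError as err:
    print(err)
    print(repr(err.__cause__))
```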

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-07-14 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-06-28 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

My current opinion is that this should be a PEP for 3.4, to make sure we flush 
out all the corner cases and other details correctly.

--
versions: +Python 3.4 -Python 3.3




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-06-28 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

For that matter, with the relevant codecs restored in 3.2, a transform() helper 
could probably be added to six (or a new project on PyPI) to prototype the 
approach.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-06-28 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Setting as a release blocker for 3.4 - this is important.

--
priority: normal -> release blocker
stage: commit review -> 




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-02-18 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-02-13 Thread Barry A. Warsaw

Changes by Barry A. Warsaw ba...@python.org:


--
nosy: +barry




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-02-13 Thread STINNER Victor

STINNER Victor victor.stin...@gmail.com added the comment:

What is the status of this issue? Is there still a fan of this issue motivated 
to write a PEP, a patch or something like that?

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2012-02-13 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

It's still on my radar to come back and have a look at it. Feedback from the 
web folks doing Python 3 migrations is that it would have helped them in quite 
a few cases.

I want to get a couple of other open PEPs out of the way first, though (mainly 
394 and 409)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-12-14 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

Issue 13600 has been marked as a duplicate of this issue.

FTR, +1 to the idea of adding encoded_format and decoded_format attributes to 
CodecInfo, and also to adding {str,bytes}.{transform,untransform} back.

--
nosy: +petri.lehtinen




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

They were removed because adding new methods to builtin types violated the 
language moratorium.

Now that the language moratorium is over, the transform/untransform convenience 
APIs should be added again for 3.3. It's an approved change, the original 
timing was just wrong.

--
assignee: lemburg -> 
nosy: +ncoghlan




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Sorry, I meant to state my rationale for the unassignment - I'm assuming this 
issue is covered by MAL's recent decision to step away from Unicode and codec 
maintenance issues. If that's incorrect, MAL can reclaim the issue, otherwise 
unassigning leaves it open for whoever wants to move it forward.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Some further comments after getting back up to speed with the actual status of 
this problem (i.e. that we had issues with the error checking and reporting in 
the original 3.2 commit).

1. I agree with the position that the codecs module itself is intended to be a 
type neutral codec registry. It encodes and decodes things, but shouldn't 
actually care about the types involved. If that is currently not the case in 
3.x, it needs to be fixed.

This type neutrality was blurred in 2.x by the fact that it only implemented 
str->str translations, and even further obscured by the coupling to the 
.encode() and .decode() convenience APIs. The fact that the type neutrality of 
the registry itself is currently broken in 3.x is a *regression*, not an 
improvement. (The convenience APIs, on the other hand, are definitely *not* 
type neutral, and aren't intended to be)

2. To assist in producing nice error messages, and to allow restrictions to be 
enforced on type-specific convenience APIs, the CodecInfo objects should grow 
additional state as MAL suggests. To avoid redundancy (and inaccurate 
overspecification), my suggested colour for that particular bikeshed is:

Character encoding codec:
  .decoded_format = 'text'
  .encoded_format = 'binary'

Binary transform codec:
  .decoded_format = 'binary'
  .encoded_format = 'binary'

Text transform codec:
  .decoded_format = 'text'
  .encoded_format = 'text'

I suggest using the fuzzy format labels mainly due to the existence of the 
buffer API - most codec operations that consume binary data will accept 
anything that implements the buffer API, so referring specifically to 'bytes' 
in error messages would be inaccurate.

The convenience APIs can then emit errors like:

  'a'.encode('rot_13') ==>
  CodecLookupError: text <-> binary codec expected ('rot_13' is text <-> text)

  'a'.decode('rot_13') ==>
  CodecLookupError: text <-> binary codec expected ('rot_13' is text <-> text)

  'a'.transform('bz2') ==>
  CodecLookupError: text <-> text codec expected ('bz2' is binary <-> binary)

  'a'.transform('ascii') ==>
  CodecLookupError: text <-> text codec expected ('ascii' is text <-> binary)

  b'a'.transform('ascii') ==>
  CodecLookupError: binary <-> binary codec expected ('ascii' is text <-> binary)

For backwards compatibility with 3.2, codecs that do not specify their formats 
should be treated as character encoding codecs (i.e. decoded format is 'text', 
encoded format is 'binary')

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Oops, typo in my second error example. The command should be:

  b'a'.decode('rot_13')

(Since str objects don't offer a decode() method any more)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

> *.encode('rot_13') ==> CodecLookupError

I like the idea of raising a lookup error on .encode/.decode if the codec is 
not a classic text codec (like ASCII or UTF-8).

> *.transform('ascii') ==> CodecLookupError

Same comment.

> str.transform('bz2') ==> CodecLookupError

A lookup error is surprising here. It may be a TypeError instead. The bz2 codec 
can be used with .transform, but not on str. So:

 - Lookup error if the codec cannot be used with encode/decode or 
   transform/untransform
 - Type error if the value type is invalid

(CodecLookupError doesn't exist; do you propose to define a new exception that 
inherits from LookupError?)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

On Thu, Oct 20, 2011 at 8:34 AM, STINNER Victor rep...@bugs.python.org wrote:
>> str.transform('bz2') ==> CodecLookupError
>
> A lookup error is surprising here. It may be a TypeError instead. The bz2 can 
> be used with .transform, but not on str. So:

No, it's the same concept as the other cases - we found a codec with
the requested name, but it's not the kind of codec we wanted in the
current context (i.e. str.transform). It may be that the problem is
the user has a str when they expected to have a bytearray or a bytes
object, but there's no way for the codec lookup process to know that.

>  - Lookup error if the codec cannot be used with encode/decode or 
>    transform/untransform
>  - Type error if the value type is invalid

There's no way for str.transform to tell the difference between "I
asked for the wrong codec" and "I expected to have a bytes object
here, not a str object". That's why I think we need to think in terms
of format checks rather than type checks.

> (CodecLookupError doesn't exist, you propose to define a new exception who 
> inherits from LookupError?)

Yeah, and I'd get that to handle the process of creating the nice
error messages. I think it may even make sense to build the filtering
options into codecs.lookup() itself:

  def lookup(encoding, decoded_format=None, encoded_format=None):
      info = _lookup(encoding)  # The existing codec lookup algorithm
      if ((decoded_format is not None and
           decoded_format != info.decoded_format) or
          (encoded_format is not None and
           encoded_format != info.encoded_format)):
          raise CodecLookupError(info, decoded_format, encoded_format)
      return info

Then the various encode, decode and transform methods can just pass
the appropriate arguments to 'codecs.lookup' without all having to
reimplement the format checking logic.
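
A runnable sketch of the filtered lookup described above. Real CodecInfo objects carry no decoded_format/encoded_format attributes, so this illustration keeps the tags in a hypothetical side table (ASSUMED_FORMATS, keyed by the exact codec name passed in); CodecLookupError is likewise the hypothetical exception from this thread, not a stdlib class. Untagged codecs default to text <-> binary, following the backwards compatibility rule suggested earlier:

```python
import codecs

# Hypothetical format tags: (decoded_format, encoded_format).
ASSUMED_FORMATS = {
    "rot_13": ("text", "text"),
    "hex_codec": ("binary", "binary"),
    "base64_codec": ("binary", "binary"),
    "zlib_codec": ("binary", "binary"),
}

class CodecLookupError(LookupError):
    """Hypothetical exception; not part of the standard library."""

def lookup(encoding, decoded_format=None, encoded_format=None):
    info = codecs.lookup(encoding)  # the existing codec lookup algorithm
    # Untagged codecs are treated as character encodings.
    decoded, encoded = ASSUMED_FORMATS.get(encoding, ("text", "binary"))
    if ((decoded_format is not None and decoded_format != decoded) or
            (encoded_format is not None and encoded_format != encoded)):
        raise CodecLookupError(
            "%s <-> %s codec expected (%r is %s <-> %s)"
            % (decoded_format or "any", encoded_format or "any",
               encoding, decoded, encoded))
    return info

try:
    lookup("rot_13", "text", "binary")   # what str.encode() would request
except CodecLookupError as err:
    print(err)
```

A str.transform()-style caller would pass ("text", "text") instead, so the same 'rot_13' request would succeed there.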

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

> I think it may even make sense to build the filtering
> options into codecs.lookup() itself:
>
>   def lookup(encoding, decoded_format=None, encoded_format=None):
>       info = _lookup(encoding)  # The existing codec lookup algorithm
>       if ((decoded_format is not None and
>            decoded_format != info.decoded_format) or
>           (encoded_format is not None and
>            encoded_format != info.encoded_format)):
>           raise CodecLookupError(info, decoded_format, encoded_format)

lookup('rot13') should fail with a lookup error to keep backward 
compatibility. You can just change the default values to:

def lookup(encoding, decoded_format='text',  encoded_format='binary'): ...

If you patch lookup, what about the following functions?

- getencoder()
- getdecoder()
- getincrementalencoder()
- getincrementaldecoder()
- getreader()
- getwriter()
- iterencode()

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-19 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

I'm fine with people needing to drop down to the lower level lookup() API if 
they want the filtering functionality in Python code. For most purposes, 
constraining the expected codec input and output formats really isn't a major 
issue - we just need it in the core in order to emit sane error messages when 
people misuse the convenience APIs based on things that used to work in 2.x 
(like 'a'.encode('base64')).

At the C level, I'd adjust _PyCodec_Lookup to accept the two extra arguments 
and add _PyCodec_EncodeText, _PyCodec_DecodeBinary, _PyCodec_TransformText and 
_PyCodec_TransformBinary to support the convenience APIs (rather than needing 
the individual objects to know about the details of the codec tagging 
mechanism).

Making new codecs available isn't a backwards compatibility problem - anyone 
relying on a particular key being absent from an extensible registry is clearly 
doing the wrong thing.

Regarding the particular formats, I'd suggest that hex, base64, quopri, uu, bz2 
and zlib all be flagged as binary transforms, but rot13 be implemented as a 
text transform (Florent's patch has rot13 as another binary transform, but it 
makes more sense in the text domain - this should just be a matter of adjusting 
some of the data types in the implementation from bytes to str)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-17 Thread Éric Araujo

Éric Araujo mer...@netwok.org added the comment:

> transform() and untransform() methods were also removed, I don't remember 
> why/how exactly,
I don’t remember either; maybe it was too late in the release process, or we 
lacked enough consensus.

> So we have rot13 & friends in Python 3.2 and 3.3, but they cannot be used 
> with the regular str.encode('rot13'), you have to write (for example): 
> codecs.getdecoder('rot_13')
Ah, great, I thought they were not available at all!

> The major issue with {bytes,str}.(un)transform() is that we have only one 
> registry for all codecs, and the registry was changed in Python 3 [...] 
> To implement str.transform(), we need another register. Marc-Andre 
> suggested (msg96374) to add tags to codecs
I’m confused: does the tags idea replace the idea of adding another registry?

> I'm still opposed to str->str (rot13) and bytes->bytes (hex, gzip, ...) 
> operations using the codecs API. Developers have to use the right module.
Well, here I disagree with you and agree with MAL: str.encode and bytes.decode 
are strict, but the codec API in general is not restricted to str→bytes and 
bytes→str directions.  Using the zlib or base64 modules vs. the codecs is a 
matter of style; sometimes you think it looks hacky, sometimes you think it’s 
very handy.  And rot13 only exists as a codec!

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-16 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

What is the status of this issue?

rot13 codecs & friends were added back to Python 3.2 with 
{bytes,str}.(un)transform() methods: commit 7e4833764c88. Codecs were disabled 
because of surprising error messages before the release of Python 3.2 final: 
issue #10807, commit ff1261a14573. transform() and untransform() methods were 
also removed, I don't remember why/how exactly, maybe because new codecs were 
disabled.

So we have rot13 & friends in Python 3.2 and 3.3, but they cannot be used with 
the regular str.encode('rot13'), you have to write (for example):

>>> codecs.getdecoder('rot_13')('rot13')
('ebg13', 5)
>>> codecs.getencoder('rot_13')('ebg13')
('rot13', 5)

The major issue with {bytes,str}.(un)transform() is that we have only one 
registry for all codecs, and the registry was changed in Python 3 to ensure:
 * encode: str->bytes
 * decode: bytes->str

To implement str.transform(), we need another registry. Marc-Andre suggested 
(msg96374) to add tags to codecs:

.encode_input_types = (str,)
.encode_output_types = (bytes,)
.decode_input_types = (bytes,)
.decode_output_types = (str,)


I'm still opposed to str->str (rot13) and bytes->bytes (hex, gzip, ...) 
operations using the codecs API. Developers have to use the right module. If 
the API of these modules is too complex, we should add helpers to these 
modules, but not to builtin types. Builtin types have to be and stay simple and 
well defined.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-10-09 Thread Éric Araujo

Éric Araujo mer...@netwok.org added the comment:

So.  This was reverted before 3.2 was out, right?  What is the status for 3.3?

--
components:  -2to3 (2.x to 3.0 conversion tool), Documentation




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-09-22 Thread Cherniavsky Beni

Changes by Cherniavsky Beni b...@google.com:


--
nosy: +cben




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-07-19 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
versions: +Python 3.3 -Python 3.2




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2011-01-02 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

See issue #10807: 'base64' can be used with bytes.decode() (and str.encode()), 
but it raises a confusing exception (TypeError: expected bytes, not memoryview).

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-09 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

With Georg's approval, I am reopening this issue until a decision is made on 
whether {str,bytes,bytearray}.{transform,untransform} methods should go into 
3.2.

I am adding Guido to nosy because the decision may turn on the interpretation 
of his post. [1]

I also started a python-dev thread on this issue. [2]

[1] http://mail.python.org/pipermail/python-dev/2010-December/106374.html
[2] http://mail.python.org/pipermail/python-dev/2010-December/106617.html

--
components: +Unicode
nosy: +gvanrossum
resolution: fixed -> 
stage:  -> commit review
status: closed -> open
type:  -> feature request




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Martin v. Löwis wrote:
> 
> Martin v. Löwis mar...@v.loewis.de added the comment:
> 
> As per 
> 
> http://mail.python.org/pipermail/python-dev/2010-December/106374.html
> 
> I think this checkin should be reverted, as it's breaking the language 
> moratorium.

I've asked Guido. We may have to revert the addition of the new
methods and then readd them for 3.3, but I don't really see
them as difficult to implement for the other Python implementations,
since they are just interfaces to the codec sub-system.

The readdition of the codecs and changes to support them in the
codec system do not fall under the moratorium, since they are
stdlib changes.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-05 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

As per 

http://mail.python.org/pipermail/python-dev/2010-December/106374.html

I think this checkin should be reverted, as it's breaking the language 
moratorium.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-05 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

I leave this to MAL, on whose behalf I finished this to be in time for beta.

--
assignee:  -> lemburg




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-03 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
> 
> Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
> 
> I am probably a bit late to this discussion, but why these things should be 
> called codecs and why should they share the registry with the encodings?  
> It looks like the proper term would be transformations or transforms.

.transform() is just the name of the method. The codecs are still just
that: codecs, i.e. objects that encode and decode data. The types they
support are defined by the codecs, not by the helper methods.

In Python3, the str and bytes methods .encode() and .decode() will
only support str->bytes->str conversions. The new
str and bytes .transform() method adds back str->str and
bytes->bytes.

The codec subsystem does not impose restrictions on the type combinations
a codec can support, and that's per design.
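
To illustrate the point about type combinations, here is a minimal sketch of
how such codecs look from the caller's side. It assumes a Python 3 release in
which these codecs are present in the registry, and it uses the module-level
codecs.encode()/codecs.decode() functions rather than the .transform()/
.untransform() methods discussed in this thread. The variable names are made
up for the example.

```python
import codecs

payload = b"hello"

# bytes -> bytes: the hex codec maps raw bytes to ASCII hex digits.
as_hex = codecs.encode(payload, "hex_codec")
from_hex = codecs.decode(as_hex, "hex_codec")

# bytes -> bytes: the zlib codec compresses and decompresses.
compressed = codecs.encode(payload, "zlib_codec")
decompressed = codecs.decode(compressed, "zlib_codec")

# str -> str: rot_13 never touches bytes at all.
rotated = codecs.encode("hello", "rot_13")

print(as_hex, from_hex, decompressed, rotated)
```

The same registry lookup serves all three calls; which input and output
types are accepted is decided by the codec itself.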

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-02 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

Codecs brought back and (un)transform implemented in r86934.

--
resolution:  -> fixed
status: open -> closed




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-12-02 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

I am probably a bit late to this discussion, but why should these things be
called codecs, and why should they share the registry with the encodings?  It
looks like the proper term would be "transformations" or "transforms".

--
nosy: +belopolsky




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

> I would like to know what happened to hex_codec and what the Python 3
> replacement for it is.

If you had read this bug report, you'd know that the codec was removed
in Python 3. Use binascii.hexlify/binascii.unhexlify instead (as you
should in 2.x, also).
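
A short sketch of the suggested replacement, using only the two binascii
functions named above (the sample data and variable names are illustrative):

```python
import binascii

data = b"\x00\xffspam"

# bytes -> ASCII hex digits (the result is still a bytes object)
hex_digits = binascii.hexlify(data)

# ASCII hex digits -> the original bytes
restored = binascii.unhexlify(hex_digits)

print(hex_digits)
print(restored == data)
```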

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Martin v. Löwis wrote:
> 
> Martin v. Löwis mar...@v.loewis.de added the comment:
> 
>> I would like to know what happened to hex_codec and what the Python 3
>> replacement for it is.
> 
> If you had read this bug report, you'd know that the codec was removed
> in Python 3. Use binascii.hexlify/binascii.unhexlify instead (as you
> should in 2.x, also).

... or wait for Python 3.2, which will re-add them :-)

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

... but don't wait too long!

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Georg Brandl

Changes by Georg Brandl ge...@python.org:


--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

... but don't wait too long to add them!

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Georg Brandl wrote:
> 
> Georg Brandl ge...@python.org added the comment:
> 
> ... but don't wait too long to add them!

I plan to work on that after EuroPython. Florent already provided
the patch for the codecs, so what's left is adding the .transform()
and .untransform() methods, and perhaps tweaking the codec input/output
types in a couple of cases.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Marc-Andre Lemburg

Changes by Marc-Andre Lemburg m...@egenix.com:


--
versions:  -Python 2.7, Python 3.1




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Éric Araujo

Éric Araujo mer...@netwok.org added the comment:

I am confused by MvL’s reply. From the first paragraph of the documentation
for binascii: “Normally, you will not use these functions directly but use
wrapper modules like uu, base64, or binhex instead. The binascii module
contains low-level functions written in C for greater speed that are used by
the higher-level modules.”

Is the doc not accurate?

Also, can someone who is sure about the status of this report edit the type,
stage, component and resolution? It would be helpful.

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-07-10 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

> I am confused by MvL’s reply. From the first paragraph of the documentation
> for binascii: “Normally, you will not use these functions directly
> but use wrapper modules like uu, base64, or binhex instead. The
> binascii module contains low-level functions written in C for greater
> speed that are used by the higher-level modules.”
> 
> Is the doc not accurate?

It is correct. So use base64.b16encode/b16decode then.
It's just that I personally prefer hexlify/unhexlify, because I can
memorize the function name better.
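
For comparison, a small sketch of both spellings side by side (sample data
and variable names are illustrative). The two are not byte-for-byte
interchangeable: base64.b16encode() emits uppercase digits, and
base64.b16decode() only accepts lowercase input when casefold=True is passed.

```python
import base64
import binascii

data = b"\xde\xad\xbe\xef"

# Higher-level wrapper module: uppercase hex digits.
b16 = base64.b16encode(data)

# Low-level C function: lowercase hex digits.
hexl = binascii.hexlify(data)

# Both decode back to the same bytes.
from_b16 = base64.b16decode(b16)
from_hexl = binascii.unhexlify(hexl)

# b16decode() needs casefold=True to accept lowercase digits.
from_lower = base64.b16decode(hexl, casefold=True)

print(b16, hexl)
```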

--




[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...

2010-06-14 Thread sorin

sorin sorin.sbar...@gmail.com added the comment:

I would like to know what happened to hex_codec and what the Python 3
replacement for it is.

Also, it would be really helpful to see DeprecationWarnings for all these
codecs in Python 2.x and to include a note in the Python 3 changelog.

The official Python documentation at
http://docs.python.org/library/codecs.html lists them as valid, without any
sign that they have been dropped or replaced.

--
nosy: +sorin
title: codecs missing: base64 bz2 hex zlib ... -> codecs missing: base64 bz2 hex zlib hex_codec ...
