[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Roundup Robot added the comment: New changeset d7950e916f20 by R David Murray in branch '3.3': #7475: Remove references to '.transform' from transform codec docstrings. http://hg.python.org/cpython/rev/d7950e916f20 New changeset 83d54ab5c696 by R David Murray in branch 'default': Merge #7475: Remove references to '.transform' from transform codec docstrings. http://hg.python.org/cpython/rev/83d54ab5c696 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7475 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Serhiy Storchaka added the comment: Docstrings for new codecs mention bytes.transform() and bytes.untransform() which are nonexistent. -- nosy: +serhiy.storchaka
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Jakub Wilk jw...@jwilk.net: -- nosy: +jwilk
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: The 3.4 portion of issue 19619 has been addressed, so removing it as a dependency again. -- dependencies: -Blacklist base64, hex, ... codecs from bytes.decode() and str.encode()
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: With issue 19619 resolved for Python 3.4 (the issue itself remains open awaiting a backport to 3.3), Victor has softened his stance on this topic and given the go-ahead to restore the codec aliases: http://bugs.python.org/issue19619#msg203897 I'll be committing this shortly, after adjusting the patch to account for the issue 19619 changes to the tests and What's New. -- assignee: -> ncoghlan versions: +Python 3.4 -Python 3.5
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Roundup Robot added the comment: New changeset 5e960d2c2156 by Nick Coghlan in branch 'default': Close #7475: Restore binary text transform codecs http://hg.python.org/cpython/rev/5e960d2c2156 -- nosy: +python-dev resolution: -> fixed stage: -> committed/rejected status: open -> closed
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Note that I still plan to do a documentation-only PEP for 3.4, proposing some adjustments to the way the codecs module is documented, making "binary transform" and "text transform" defined terms in the glossary, etc. I'll probably aim for beta 2 for that.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Nick Coghlan ncogh...@gmail.com: -- dependencies: +Blacklist base64, hex, ... codecs from bytes.decode() and str.encode()
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Victor is still -1, so to Python 3.5 it goes. -- versions: +Python 3.5 -Python 3.4
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Attached patch restores the aliases for the binary and text transforms, adds a test to ensure they exist and restores the Aliases column to the relevant tables in the documentation. It also updates the relevant section in the What's New document. I also tweaked the wording in the docs to use the phrases "binary transform" and "text transform" for the affected tables and version added/changed notices. Given the discussions on python-dev, the main condition that needs to be met before I commit this is for Victor to change his current -1 to a -0 or higher. -- Added file: http://bugs.python.org/file32663/issue7475_restore_codec_aliases_in_py34.diff
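The kind of check described above (a test that the aliases exist) might look roughly like the sketch below. It is not the test from the attached patch; the ALIASES mapping and the missing_aliases() helper are names made up for this illustration, and the alias list is only a sample of the codecs named in the issue title.

```python
import codecs

# Hypothetical sample: short alias -> full codec module name.
ALIASES = {
    "base64": "base64_codec",
    "bz2": "bz2_codec",
    "hex": "hex_codec",
    "zlib": "zlib_codec",
    "rot13": "rot_13",
}


def missing_aliases(aliases=ALIASES):
    """Return the aliases that do not resolve to the same codec as the full name."""
    missing = []
    for alias, full_name in aliases.items():
        try:
            same = codecs.lookup(alias).name == codecs.lookup(full_name).name
        except LookupError:
            # Either the alias or the codec itself is unavailable here.
            same = False
        if not same:
            missing.append(alias)
    return missing


print(missing_aliases())
```

On an interpreter where the aliases have been restored, the printed list should be empty, apart from any codec whose backing module (for example bz2) is not built.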
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Issue 17823 is now closed, but not because it has been implemented. It turns out that the data-driven nature of the incompatibility means it isn't really amenable to being detected and fixed automatically via 2to3. Issue 19543 is a replacement proposal for the introduction of some additional codec related Py3k warnings in Python 2.7.7.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Providing the 2to3 fixers in issue 17823 now depends on this issue rather than the other way around (since not having to translate the names simplifies the fixer a bit). -- dependencies: -2to3 fixers for missing codecs
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: For anyone interested, I have a patch up on issue 17828 that produces the following output for various codec usage errors:

    >>> import codecs
    >>> codecs.encode(b"hello", "bz2_codec").decode("bz2_codec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'bz2_codec' decoder returned 'bytes' instead of 'str'; use codecs.decode to decode to arbitrary types

    >>> "hello".encode("bz2_codec")
    TypeError: 'str' does not support the buffer interface

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: invalid input type for 'bz2_codec' codec (TypeError: 'str' does not support the buffer interface)

    >>> "hello".encode("rot_13")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'rot_13' encoder returned 'str' instead of 'bytes'; use codecs.encode to encode to arbitrary types
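To see how a given interpreter reports these misuses, a small probe such as the following can help. It is only a sketch: try_call() is a helper invented here, and the exception type and message depend on the Python version (releases that include the issue 19619 blacklist raise LookupError from the method calls instead of the TypeError shown in the message above), so the code accepts either.

```python
import codecs


def try_call(func):
    """Run func() and report either its result or the exception it raised."""
    try:
        return ("ok", func())
    except (TypeError, LookupError) as exc:
        return (type(exc).__name__, str(exc))


# Requires the bz2 module, as in the message above.
compressed = codecs.encode(b"hello", "bz2_codec")

attempts = {
    # Method API: restricted to text encodings, so these are expected to fail.
    "bytes.decode(bz2_codec)": try_call(lambda: compressed.decode("bz2_codec")),
    "str.encode(bz2_codec)": try_call(lambda: "hello".encode("bz2_codec")),
    "str.encode(rot_13)": try_call(lambda: "hello".encode("rot_13")),
    # Function API: leaves type checks to the codec itself.
    "codecs.decode(bz2_codec)": try_call(lambda: codecs.decode(compressed, "bz2_codec")),
}

for label, outcome in attempts.items():
    print(label, "->", outcome)
```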
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: With issue 17839 fixed, the error from invoking the base64 codec through the method API is now substantially more sensible:

    >>> b"ZXhhbXBsZQ==\n".decode("base64_codec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: decoder did not return a str object (type=bytes)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: I just wanted to note something I realised in chatting to Armin Ronacher recently: in both Python 2.x and 3.x, the encode/decode method APIs are constrained by the text model. It's just that in 2.x that model was effectively basestring-basestring, and thus still covered every codec in the standard library. This greatly limited the use cases for the codecs.encode/decode convenience functions, which is why the fact they were undocumented went unnoticed. In 3.x, the changed text model meant the method API became limited to the Unicode codecs, making the function based API more important.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: We should fix the docs for the earlier versions as well. -- versions: +Python 2.7, Python 3.3
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Nick Coghlan ncogh...@gmail.com: -- Removed message: http://bugs.python.org/msg198847
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Nick Coghlan ncogh...@gmail.com: -- versions: -Python 2.7, Python 3.3
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Martin Morrison m...@ensoft.co.uk: -- nosy: +isoschiz
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: Also adding 17839 as a dependency, since part of the reason the base64 errors in particular are so cryptic is because the base64 module doesn't accept arbitrary PEP 3118 compliant objects as input. -- dependencies: +base64 module should use memoryview
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Serhiy Storchaka storch...@gmail.com: -- dependencies: +2to3 fixers for missing codecs
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: I also created issue 17841 to cover the fact that the 3.3 documentation incorrectly states that these aliases still exist, even though they were removed before 3.2 was released.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Serhiy Storchaka storch...@gmail.com: -- dependencies: +Add link to alternatives for bytes-to-bytes codecs
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Guido van Rossum gu...@python.org: -- nosy: -gvanrossum
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Martin v. Löwis added the comment: I don't see any point in merely bringing the codecs back, without any convenience API to use them. If I need to do

    import codecs
    result = codecs.getencoder("base64").encode(data)

I don't think people would actually prefer this over

    import base64
    result = base64.encodebytes(data)

It's (IMO) only the convenience method (.encode) that made people love these codecs.
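For comparison, here is a runnable sketch of the two spellings. One detail differs from the snippet in the message above: as far as I can tell, codecs.getencoder() returns the codec's encoder function itself, which is called directly and returns an (output, length consumed) pair, so there is no further .encode attribute to call. The full name "base64_codec" is used because the short alias was missing at the time of this message.

```python
import base64
import codecs

data = b"example"

# Codec machinery: look up the stateless encoder function and call it.
encoder = codecs.getencoder("base64_codec")
via_codecs, consumed = encoder(data)

# Dedicated module: a single call that returns the encoded bytes.
via_module = base64.encodebytes(data)

print(via_codecs, consumed, via_module)
```

The extra tuple unpacking in the first form is part of why the message argues it is unlikely to be preferred over the base64 module.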
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Ezio Melotti added the comment: IMHO it's also a documentation problem. Once people figure out that they can't use encode/decode anymore, it's not immediately clear what they should do instead. By reading the codecs docs[0] it's not obvious that it can be done with codecs.getencoder(...).encode/decode, so people waste time finding a solution, get annoyed, and blame Python 3 because it removed a simple way to use these codecs without making clear what should be used instead. FWIW I don't care about having to do an extra import, but indeed something simpler than codecs.getencoder(...).encode/decode would be nice. [0]: http://docs.python.org/3/library/codecs.html
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: It turns out MAL added the convenience API I'm looking for back in 2004, it just didn't get documented, and is hidden behind the "from _codecs import *" call in the codecs.py source code: http://hg.python.org/cpython-fullhistory/rev/8ea2cb1ec598

So, all the way from 2.4 to 2.7 you can write:

    from codecs import encode
    result = encode(data, "base64")

It works in 3.x as well, you just need to add the "_codec" to the end to account for the missing aliases:

    >>> encode(b"example", "base64_codec")
    b'ZXhhbXBsZQ==\n'
    >>> decode(b"ZXhhbXBsZQ==\n", "base64_codec")
    b'example'

Note that the convenience functions omit the extra checks that are part of the methods (although I admit the specific error here is rather quirky):

    >>> b"ZXhhbXBsZQ==\n".decode("base64_codec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python3.2/encodings/base64_codec.py", line 20, in base64_decode
        return (base64.decodebytes(input), len(input))
      File "/usr/lib64/python3.2/base64.py", line 359, in decodebytes
        raise TypeError("expected bytes, not %s" % s.__class__.__name__)
    TypeError: expected bytes, not memoryview

I'm going to create some additional issues, so this one can return to just being about restoring the missing aliases.
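The same round trip as a standalone script, using only the full codec name so that it does not depend on whether the short aliases are present:

```python
from codecs import decode, encode

original = b"example"

# bytes -> bytes in both directions; the codec, not the caller, picks the types.
encoded = encode(original, "base64_codec")
decoded = decode(encoded, "base64_codec")

print(encoded, decoded)
```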
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Marc-Andre Lemburg added the comment: Just copying some details here about codecs.encode() and codecs.decode() from python-dev:

Just as reminder: we have the general purpose encode()/decode() functions in the codecs module:

    import codecs
    r13 = codecs.encode('hello world', 'rot-13')

These interface directly to the codec interfaces, without enforcing type restrictions. The codec defines the supported input and output types. As Nick found, these aren't documented, which is a documentation bug (I probably forgot to add documentation back then). They have been in Python since 2004: http://hg.python.org/cpython-fullhistory/rev/8ea2cb1ec598

These APIs are nice for general purpose codec work and that's why I added them back in 2004. For the codecs in question, it would still be nice to have a more direct way to access them via methods on the types that you typically use them with.
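A short sketch of the point that each codec defines its own input and output types. The three codecs chosen here are just examples: rot-13 maps str to str, hex_codec maps bytes to bytes, and utf-8 maps str to bytes.

```python
import codecs

samples = [
    ("hello world", "rot-13"),      # text transform: str in, str out
    (b"hello world", "hex_codec"),  # binary transform: bytes in, bytes out
    ("hello world", "utf-8"),       # text encoding: str in, bytes out
]

# Record (input type name, output type name) for each codec.
type_pairs = {}
for value, codec_name in samples:
    result = codecs.encode(value, codec_name)
    type_pairs[codec_name] = (type(value).__name__, type(result).__name__)

r13 = codecs.encode("hello world", "rot-13")
print(type_pairs, r13)
```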
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Ezio Melotti added the comment:
> It works in 3.x as well, you just need to add the "_codec" to the end to account for the missing aliases:
FTR this is because of ff1261a14573 (see #10807).
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment:
Issue 17827 covers adding documentation for codecs.encode and codecs.decode.
Issue 17828 covers adding exception handling improvements for all encoding and decoding operations.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: For me, the killer argument *against* a method based API is memoryview (and, equivalently, array.array). It should be possible to use those as inputs for the bytes-bytes codecs, and once you endorse codecs.encode and codecs.decode for that use case, it's hard to justify adding more exclusive methods to the already broad bytes and bytearray APIs (particularly given the problems with conveying direction of conversion unambiguously). By contrast, I think "the codecs functions are generic while the str, bytes and bytearray methods are specific to text encodings" is something we can explain fairly easily, thus allowing the aliases mentioned in this issue to be restored for use with the codecs module functions. To avoid reintroducing the quirky errors described in issue 10807, the encoding and decoding error messages should first be improved as discussed in issue 17828. -- dependencies: +More informative error handling when encoding and decoding
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Florent Xicluna added the comment: Another rant, because it matters to many of us: http://lucumr.pocoo.org/2012/8/11/codec-confusion/ IMHO, the solution to restore str.decode and bytes.encode and return TypeError for improper use is probably the most obvious for the average user.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Ezio Melotti added the comment: -1. I see encoding as the process to go from text to bytes, and decoding the process to go from bytes to text, so (ab)using these terms for other kinds of conversions is not an option IMHO. Anyway I think someone should write a PEP and list the possible options and their pros and cons, and then a decision can be taken on python-dev.

FTR in Python 2 you can use decode for bytes->text, text->text, bytes->bytes, and even text->bytes:

    >>> u'DEADBEEF'.decode('hex')
    '\xde\xad\xbe\xef'
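For readers porting the Python 2 snippet above, two Python 3 spellings that produce the same bytes are sketched below. The first goes through the codecs function API with the full codec name; the second uses the bytes.fromhex() constructor and involves no codec at all.

```python
import codecs

# Function API: bytes in, bytes out.
via_codec = codecs.decode(b"DEADBEEF", "hex_codec")

# Dedicated constructor: takes the hex digits as text.
via_fromhex = bytes.fromhex("DEADBEEF")

print(via_codec, via_fromhex)
```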
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
R. David Murray added the comment: transform/untransform has approval-in-principle, adding encode/decode to the type that doesn't have them has been explicitly (and repeatedly :) rejected. (I don't know about anybody else, but at this point I have written code that assumes that if an object has an 'encode' method, calling it will get me a bytes, and vice versa with 'decode'...an assumption I know is not safe, but that I feel is useful duck typing in the contexts in which I used it.) Nick wants a PEP, other people have said a PEP isn't necessary. What is certainly necessary is for someone to pick up the ball and run with it. -- nosy: +r.david.murray
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Florent Xicluna added the comment: I am not a native English speaker, but it seems that the common usage of encode/decode is wider than the restricted definition applied for Python 3.3. Some examples:

* RFC 4648 specifies Base16, Base32, and Base64 Data Encodings http://tools.ietf.org/html/rfc4648
* About rot13: the same code can be used for encoding and decoding http://www.catb.org/~esr/jargon/html/R/rot13.html
* The Huffman coding is an entropy encoding algorithm (used for DEFLATE) http://en.wikipedia.org/wiki/Huffman_coding
* RFC 2616 lists (zlib's) deflate or gzip as encoding transformations http://tools.ietf.org/html/rfc2616#section-3.5

However, I acknowledge that there are valid reasons to choose a different verb too.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Ezio Melotti added the comment: While not strictly necessary, a PEP would certainly be useful and would help in reaching a consensus. The PEP should provide a summary of the available options (transform/untransform, reintroducing encode/decode for bytes/str, maybe others), their intended behavior (e.g. is type(x.transform()) == type(x) always true?), and possible issues (e.g. Should some transformations be limited to str or bytes? Should rot13 work with both transform and untransform?). Even if we all agreed on a solution, such a document would still be useful IMHO.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: +1 for someone stepping up to write a PEP on this if they would like to see the situation improved in 3.4.

transform/untransform has at least one core developer with an explicit -1 on the proposal at the moment (me).

We *definitely* need a generic object->object convenience API in the codecs module (codecs.decode, codecs.encode). I even accept that those two functions could be worthy of elevation to be new builtin functions.

I'm *far* from convinced that awkwardly named methods that only handle str->object, bytes->object and bytearray->object are a good idea. Should memoryview gain transform/untransform methods as well?

transform/untransform as proposed aren't even inverse operations, since they don't swap the valid input and output types (that is, transform is str/bytes/bytearray to arbitrary objects, while untransform is *also* str/bytes/bytearray to arbitrary objects. Inverses can't have a domain/range mismatch like that). Those names are also ambiguous about which one corresponds to encoding and which to decoding. encode() and decode(), whether as functions in the codecs module or as builtins, have no such issue.

Personally, the more I think about it, the more I'm in favour of adding encode and decode as builtin functions for 3.4. If you want arbitrary object->object conversions, use the builtins, if you want strict str->bytes or bytes/bytearray->str use the methods. Python 3 has been around long enough now, and Python 3.2 and 3.3 are sufficiently well known that I think we can add the full power builtins without people getting confused.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
R. David Murray added the comment: I was visualizing transform/untransform as being restricted to buffertype->bytes and stringtype->string, which at least for binascii-type transforms is all the modules support. After all, you don't get to choose what type of object you get back from encode or decode. A more generalized transformation (encode/decode) utility is also interesting, but how many non-string non-bytes transformations do we actually support?
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: If transform is a method, how do you plan to accept arbitrary buffer supporting types as input? This is why I mentioned memoryview: it doesn't provide decode(), but there's no good reason you should have to copy the data from the view before decoding it. Similarly, you shouldn't have to make an unaltered copy before creating a compressed (or decompressed) copy. With codecs.encode and codecs.decode as functions, supporting memoryview as an input for bytes->str decoding, binary->bytes encoding (e.g. gzip compression) and binary->bytes decoding (e.g. gzip decompression) is trivial. Ditto for array.array and anything else that supports the buffer protocol. With transform/untransform as methods? No such luck. And once you're using functions rather than methods, it's best to define the API as object -> object, and leave any type constraints up to the individual codecs (with the error handling improved to provide more context and a more meaningful exception type, as I described earlier in the thread).
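A minimal sketch of the buffer-protocol argument, assuming an interpreter where the zlib codec accepts arbitrary buffer objects (the versions I am aware of from 3.4 on do). The payload and variable names are illustrative only.

```python
import array
import codecs

payload = b"spam and eggs " * 100

# A memoryview slice shares the underlying buffer, so no bytes copy is made
# before compression.
view = memoryview(payload)[:500]
compressed = codecs.encode(view, "zlib_codec")

# Decoding accepts a view as well and hands back a new bytes object.
restored = codecs.decode(memoryview(compressed), "zlib_codec")

# array.array also supports the buffer protocol.
numbers = array.array("B", range(256))
packed = codecs.encode(numbers, "zlib_codec")
unpacked = codecs.decode(packed, "zlib_codec")

print(len(compressed), len(restored), len(unpacked))
```

Neither memoryview nor array.array has encode or decode methods, which is the gap a method-only API would leave.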
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
R. David Murray added the comment: I agree with you. transform/untransform are parallel to encode/decode, and I wouldn't expect them to exist on any type that didn't support either encode or decode. They are convenience methods, just as encode/decode are. I am also probably not invested enough in it to write the PEP :)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Guido van Rossum added the comment: str.decode() and bytes.encode() are not coming back.

Any proposal had better take into account the API design rule that the *type* of a method's return value should not depend on the *value* of one of the arguments. (The Python 2 design failed this test, and that's why we changed it.) It is however fine to let the return type depend on one of the argument *types*. So e.g. bytes.transform(enc) -> bytes and str.transform(enc) -> str are fine. And so are e.g. transform(bytes, enc) -> bytes and transform(str, enc) -> str. But a transform() taking bytes that can return either str or bytes depending on the encoding name would be a problem.

Personally I don't think transformations are so important or ubiquitous so as to deserve being made new bytes/str methods. I'd be happy with a convenience function, for example transform(input, codecname), that would have to be imported from somewhere (maybe the codecs module).

My guess is that in almost all cases where people are demanding to say e.g. x = y.transform('rot13') the codec name is a fixed literal, and they are really after minimizing the number of imports. Personally, disregarding the extra import line, I think x = rot13.transform(y) looks better though. Such custom APIs also give the API designer (of the transformation) more freedom to take additional optional parameters affecting the transformation, offer a set of variants, or a richer API.
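One way to picture the design rule is a convenience function whose return type follows the input type and never the codec name. The transform() below is a hypothetical helper written for this illustration on top of codecs.encode(); it is not an API that exists in the standard library, and it only distinguishes "text" from "not text".

```python
import codecs


def transform(data, codec_name):
    """Hypothetical helper: apply a codec, requiring text in -> text out
    and binary in -> binary out, whatever the codec name is."""
    result = codecs.encode(data, codec_name)
    if isinstance(data, str) != isinstance(result, str):
        raise TypeError(
            "%r maps %s to %s, which would make the return type depend on "
            "the codec name" % (codec_name, type(data).__name__,
                                type(result).__name__)
        )
    return result


text_out = transform("hello", "rot_13")             # str -> str
binary_out = transform(b"example", "base64_codec")  # bytes -> bytes

try:
    transform("hello", "utf-8")                     # str -> bytes: rejected
    rejected = False
except TypeError:
    rejected = True

print(text_out, binary_out, rejected)
```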
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Georg Brandl added the comment: FWIW, I'm not interested in seeing this added anymore.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Gregory P. Smith added the comment: consensus here appears to be "bad idea... don't do this." -- nosy: +gregory.p.smith priority: high -> normal resolution: -> wont fix stage: -> committed/rejected status: open -> closed
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: No, transform/untransform as methods are a bad idea, but these *codecs* should definitely come back. The minimal change needed for that to be feasible is to give errors raised during encoding and decoding more context information (at least the codec name and error mode, and switching to the right kind of error). MAL also stated on python-dev that codecs.encode and codecs.decode already exist, so it should just be a matter of documenting them properly.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Gregory P. Smith added the comment: okay, but i don't personally find any of these to be good ideas as codecs given they don't have anything to do with translating between bytes-unicode. -- resolution: wont fix -> stage: committed/rejected -> status: closed -> open
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan added the comment: The codecs module is generic, text encodings are just the most common use case (hence the associated method API).
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Phil Connell pconn...@gmail.com: -- nosy: +pconnell
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Florent Xicluna florent.xicl...@gmail.com: -- nosy: -flox
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Florent Xicluna florent.xicl...@gmail.com: -- nosy: +flox
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Uzume added the comment:

Many have chimed in on this topic, but I thought I would lend my stance, for whatever it is worth. I also believe most of these do not fit the concept of a character codec, and some sort of transforms would likely be useful; however, most are sort of specialized (e.g., there should probably be a generalized compression library interface à la hashlib):

rot13: an (albeit simplistic) text cipher (str to str; though bytes to bytes could be argued, since many crypto functions do that)

zlib, bz2, etc. (lzma/xz should also be here): all bytes to bytes compression transforms

hex(adecimal), uu, base64, etc.: these more or less fit the description of a character codec as they map between bytes and str; however, I am not sure they are really the same thing, as these are basically doing a radix transformation to character symbols and the mapping is not strictly from bytes to a single character and back, as a true character codec seems to imply. As evidenced by int(), format(), bytes.fromhex(), float.hex(), float.fromhex(), etc., these are more generalized conversions for serializing strings of bits into a textual representation (possibly for human consumption).

I personally feel any type/class.hex(), etc. method would be better off as a format() style formatter if they are to exist in such a space at all (i.e., not some more generalized conversion library, which we have but which, since 3.x, could probably use some updating and cleaning up).

-- nosy: +uzume
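As a small illustration of the existing radix conversions listed in the comment above (these are all builtins; the values are chosen only for the example):

```python
# Hex text <-> bytes
raw = bytes.fromhex("6869")

# Hex text <-> int, via int() with an explicit base and format()
as_int = int("ff", 16)
as_text = format(255, "x")

# Hex text <-> float
float_hex = (1.5).hex()
back_to_float = float.fromhex(float_hex)

print(raw, as_int, as_text, float_hex, back_to_float)
```

Each of these maps a string of bits to a textual representation and back, without going through the codec registry at all.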
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Uzume uz...@users.sourceforge.net: -- nosy: -uzume
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Nick Coghlan ncogh...@gmail.com: -- priority: release blocker -> high
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment:

FWIW, I've been thinking further about this recently and I think implementing this feature as builtin methods is the wrong way to approach it. Instead, I propose the addition of codecs.encode and codecs.decode methods that are type neutral (leaving any type checks entirely up to the codecs themselves), while the str.encode and bytes.decode methods retain their current strict text-model-related type restrictions.

Also, I now think my previous proposal for nice error messages was massively over-engineered. A much simpler approach is to just replace the status quo:

    >>> "".encode("bz2_codec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ncoghlan/devel/py3k/Lib/encodings/bz2_codec.py", line 17, in bz2_encode
        return (bz2.compress(input), len(input))
      File "/home/ncoghlan/devel/py3k/Lib/bz2.py", line 443, in compress
        return comp.compress(data) + comp.flush()
    TypeError: 'str' does not support the buffer interface

with a better error with more context like:

    UnicodeEncodeError: encoding='bz2_codec', errors='strict', codec_error=TypeError: 'str' does not support the buffer interface

A similar change would be straightforward on the decoding side. This would be a good use case for __cause__, but the codec error should still be included in the string representation.
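The error-wrapping idea in this comment can be sketched with ordinary exception chaining. This is only an illustration, not the change that was eventually committed: CodecOperationError and encode_with_context are hypothetical names invented for the example, and the message format merely imitates the one proposed above.

```python
import codecs


class CodecOperationError(Exception):
    """Hypothetical wrapper that adds codec context to a low-level error."""


def encode_with_context(obj, encoding, errors="strict"):
    """Encode via the codec registry, re-raising failures with context.

    The original exception is kept as __cause__ and is also repeated in
    the message, as the comment above suggests.
    """
    try:
        return codecs.encode(obj, encoding, errors)
    except (TypeError, ValueError) as exc:
        message = "encoding={!r}, errors={!r}, codec_error={}: {}".format(
            encoding, errors, type(exc).__name__, exc
        )
        raise CodecOperationError(message) from exc


try:
    # bz2_codec is a bytes-to-bytes codec, so a str input is rejected.
    encode_with_context("text", "bz2_codec")
except CodecOperationError as err:
    print(err)
    print(type(err.__cause__).__name__)
```

A lookup failure for an unknown codec name is deliberately left alone here; only errors raised while the codec is running are wrapped.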
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: My current opinion is that this should be a PEP for 3.4, to make sure we flush out all the corner cases and other details correctly. -- versions: +Python 3.4 -Python 3.3
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: For that matter, with the relevant codecs restored in 3.2, a transform() helper could probably be added to six (or a new project on PyPI) to prototype the approach.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: Setting as a release blocker for 3.4 - this is important. -- priority: normal -> release blocker stage: commit review ->
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Barry A. Warsaw ba...@python.org: -- nosy: +barry
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
STINNER Victor victor.stin...@gmail.com added the comment: What is the status of this issue? Is there still a fan of this issue motivated to write a PEP, a patch or something like that?
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: It's still on my radar to come back and have a look at it. Feedback from the web folks doing Python 3 migrations is that it would have helped them in quite a few cases. I want to get a couple of other open PEPs out of the way first, though (mainly 394 and 409)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Petri Lehtinen pe...@digip.org added the comment: Issue 13600 has been marked as a duplicate of this issue. FTR, +1 to the idea of adding encoded_format and decoded_format attributes to CodecInfo, and also to adding {str,bytes}.{transform,untransform} back. -- nosy: +petri.lehtinen
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: They were removed because adding new methods to builtin types violated the language moratorium. Now that the language moratorium is over, the transform/untransform convenience APIs should be added again for 3.3. It's an approved change, the original timing was just wrong. -- assignee: lemburg -> nosy: +ncoghlan
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: Sorry, I meant to state my rationale for the unassignment - I'm assuming this issue is covered by MAL's recent decision to step away from Unicode and codec maintenance issues. If that's incorrect, MAL can reclaim the issue, otherwise unassigning leaves it open for whoever wants to move it forward.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment:

Some further comments after getting back up to speed with the actual status of this problem (i.e. that we had issues with the error checking and reporting in the original 3.2 commit).

1. I agree with the position that the codecs module itself is intended to be a type neutral codec registry. It encodes and decodes things, but shouldn't actually care about the types involved. If that is currently not the case in 3.x, it needs to be fixed.

This type neutrality was blurred in 2.x by the fact that it only implemented str->str translations, and even further obscured by the coupling to the .encode() and .decode() convenience APIs. The fact that the type neutrality of the registry itself is currently broken in 3.x is a *regression*, not an improvement. (The convenience APIs, on the other hand, are definitely *not* type neutral, and aren't intended to be)

2. To assist in producing nice error messages, and to allow restrictions to be enforced on type-specific convenience APIs, the CodecInfo objects should grow additional state as MAL suggests. To avoid redundancy (and inaccurate overspecification), my suggested colour for that particular bikeshed is:

    Character encoding codec:
      .decoded_format = 'text'
      .encoded_format = 'binary'

    Binary transform codec:
      .decoded_format = 'binary'
      .encoded_format = 'binary'

    Text transform codec:
      .decoded_format = 'text'
      .encoded_format = 'text'

I suggest using the fuzzy format labels mainly due to the existence of the buffer API - most codec operations that consume binary data will accept anything that implements the buffer API, so referring specifically to 'bytes' in error messages would be inaccurate.

The convenience APIs can then emit errors like:

    'a'.encode('rot_13') ==>
    CodecLookupError: text <-> binary codec expected ('rot_13' is text <-> text)

    'a'.decode('rot_13') ==>
    CodecLookupError: text <-> binary codec expected ('rot_13' is text <-> text)

    'a'.transform('bz2') ==>
    CodecLookupError: text <-> text codec expected ('bz2' is binary <-> binary)

    'a'.transform('ascii') ==>
    CodecLookupError: text <-> text codec expected ('ascii' is text <-> binary)

    b'a'.transform('ascii') ==>
    CodecLookupError: binary <-> binary codec expected ('ascii' is text <-> binary)

For backwards compatibility with 3.2, codecs that do not specify their formats should be treated as character encoding codecs (i.e. decoded format is 'text', encoded format is 'binary')
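To make the format-label proposal concrete, here is a rough sketch of how the labels could drive those error messages. Everything in it is hypothetical: CodecFormats, KNOWN_FORMATS and check_formats are names invented for the example, the table is hand-written rather than read from real CodecInfo objects (which do not carry decoded_format or encoded_format attributes), and LookupError stands in for the proposed CodecLookupError.

```python
from collections import namedtuple

# Hypothetical stand-in for the proposed CodecInfo attributes.
CodecFormats = namedtuple("CodecFormats", ["decoded_format", "encoded_format"])

TEXT_ENCODING = CodecFormats("text", "binary")
BINARY_TRANSFORM = CodecFormats("binary", "binary")
TEXT_TRANSFORM = CodecFormats("text", "text")

# Hand-written table for the example only.
KNOWN_FORMATS = {
    "ascii": TEXT_ENCODING,
    "bz2": BINARY_TRANSFORM,
    "rot_13": TEXT_TRANSFORM,
}


def describe(formats):
    """Render a format pair such as 'text <-> binary'."""
    return "{} <-> {}".format(formats.decoded_format, formats.encoded_format)


def check_formats(name, expected):
    """Raise LookupError if the named codec has the wrong format pair.

    Codecs missing from the table are treated as character encodings,
    mirroring the backwards-compatibility rule in the comment above.
    """
    actual = KNOWN_FORMATS.get(name, TEXT_ENCODING)
    if actual != expected:
        raise LookupError(
            "{} codec expected ({!r} is {})".format(
                describe(expected), name, describe(actual)
            )
        )
    return actual


try:
    # What a str.encode('rot_13') convenience call would check.
    check_formats("rot_13", TEXT_ENCODING)
except LookupError as err:
    print(err)
```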
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: Oops, typo in my second error example. The command should be: b'a'.decode('rot_13') (Since str objects don't offer a decode() method any more)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
STINNER Victor victor.stin...@haypocalc.com added the comment:

> *.encode('rot_13') ==> CodecLookupError

I like the idea of raising a lookup error on .encode/.decode if the codec is not a classic text codec (like ASCII or UTF-8).

> *.transform('ascii') ==> CodecLookupError

Same comment.

> str.transform('bz2') ==> CodecLookupError

A lookup error is surprising here. It may be a TypeError instead. The bz2 codec can be used with .transform, but not on str. So:

- Lookup error if the codec cannot be used with encode/decode or transform/untransform
- Type error if the value type is invalid

(CodecLookupError doesn't exist; are you proposing to define a new exception that inherits from LookupError?)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment:

On Thu, Oct 20, 2011 at 8:34 AM, STINNER Victor rep...@bugs.python.org wrote:
>> str.transform('bz2') ==> CodecLookupError
>
> A lookup error is surprising here. It may be a TypeError instead. The bz2 can be used with .transform, but not on str. So:

No, it's the same concept as the other cases - we found a codec with the requested name, but it's not the kind of codec we wanted in the current context (i.e. str.transform). It may be that the problem is the user has a str when they expected to have a bytearray or a bytes object, but there's no way for the codec lookup process to know that.

> - Lookup error if the codec cannot be used with encode/decode or transform/untransform
> - Type error if the value type is invalid

There's no way for str.transform to tell the difference between "I asked for the wrong codec" and "I expected to have a bytes object here, not a str object". That's why I think we need to think in terms of format checks rather than type checks.

> (CodecLookupError doesn't exist, you propose to define a new exception who inherits from LookupError?)

Yeah, and I'd get that to handle the process of creating the nice error messages. I think it may even make sense to build the filtering options into codecs.lookup() itself:

    def lookup(encoding, decoded_format=None, encoded_format=None):
        info = _lookup(encoding)  # The existing codec lookup algorithm
        if ((decoded_format is not None and decoded_format != info.decoded_format) or
            (encoded_format is not None and encoded_format != info.encoded_format)):
            raise CodecLookupError(info, decoded_format, encoded_format)

Then the various encode, decode and transform methods can just pass the appropriate arguments to 'codecs.lookup' without all having to reimplement the format checking logic.
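A runnable approximation of that filtering lookup, layered on top of the real codecs.lookup, might look as follows. It is a sketch under several assumptions: filtered_lookup, CodecLookupError and the _ASSUMED_FORMATS table are hypothetical names invented for the example (real CodecInfo objects have no decoded_format or encoded_format attributes), the table covers only a handful of codec names, and unlike the snippet in the comment above the function explicitly returns the CodecInfo on success.

```python
import codecs


class CodecLookupError(LookupError):
    """Hypothetical exception: the codec exists but has the wrong formats."""


# Assumed (decoded_format, encoded_format) labels, keyed by normalised name.
# Anything not listed is treated as a text encoding.
_ASSUMED_FORMATS = {
    "rot_13": ("text", "text"),
    "bz2_codec": ("binary", "binary"),
    "zlib_codec": ("binary", "binary"),
    "hex_codec": ("binary", "binary"),
    "base64_codec": ("binary", "binary"),
}


def filtered_lookup(encoding, decoded_format=None, encoded_format=None):
    """Look up a codec, optionally insisting on particular format labels."""
    info = codecs.lookup(encoding)  # the existing lookup; raises LookupError
    key = encoding.lower().replace("-", "_")
    actual_decoded, actual_encoded = _ASSUMED_FORMATS.get(key, ("text", "binary"))
    if ((decoded_format is not None and decoded_format != actual_decoded) or
            (encoded_format is not None and encoded_format != actual_encoded)):
        raise CodecLookupError(
            "{!r} is {} <-> {}".format(encoding, actual_decoded, actual_encoded)
        )
    return info


# A str.encode-style caller would insist on a text <-> binary codec.
utf8_info = filtered_lookup("utf-8", decoded_format="text", encoded_format="binary")
print(utf8_info.name)
```

With no format arguments the function behaves like a plain lookup, which matches the None defaults in the snippet quoted above.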
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
STINNER Victor victor.stin...@haypocalc.com added the comment:

> I think it may even make sense to build the filtering options into codecs.lookup() itself:
>
>     def lookup(encoding, decoded_format=None, encoded_format=None):
>         info = _lookup(encoding)  # The existing codec lookup algorithm
>         if ((decoded_format is not None and decoded_format != info.decoded_format) or
>             (encoded_format is not None and encoded_format != info.encoded_format)):
>             raise CodecLookupError(info, decoded_format, encoded_format)

lookup('rot13') should fail with a lookup error to keep backward compatibility. You can just change the default values to:

    def lookup(encoding, decoded_format='text', encoded_format='binary'):
        ...

If you patch lookup, what about the following functions?

- getencoder()
- getdecoder()
- getincrementalencoder()
- getincrementaldecoder()
- getreader()
- getwriter()
- iterencode()
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Nick Coghlan ncogh...@gmail.com added the comment: I'm fine with people needing to drop down to the lower level lookup() API if they want the filtering functionality in Python code. For most purposes, constraining the expected codec input and output formats really isn't a major issue - we just need it in the core in order to emit sane error messages when people misuse the convenience APIs based on things that used to work in 2.x (like 'a'.encode('base64')). At the C level, I'd adjust _PyCodec_Lookup to accept the two extra arguments and add _PyCodec_EncodeText, _PyCodec_DecodeBinary, _PyCodec_TransformText and _PyCodec_TransformBinary to support the convenience APIs (rather than needing the individual objects to know about the details of the codec tagging mechanism). Making new codecs available isn't a backwards compatibility problem - anyone relying on a particular key being absent from an extensible registry is clearly doing the wrong thing. Regarding the particular formats, I'd suggest that hex, base64, quopri, uu, bz2 and zlib all be flagged as binary transforms, but rot13 be implemented as a text transform (Florent's patch has rot13 as another binary transform, but it makes more sense in the text domain - this should just be a matter of adjusting some of the data types in the implementation from bytes to str)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Éric Araujo mer...@netwok.org added the comment:

> transform() and untransform() methods were also removed, I don't remember why/how exactly,

I don’t remember either; maybe it was too late in the release process, or we lacked enough consensus.

> So we have rot13 & friends in Python 3.2 and 3.3, but they cannot be used with the regular str.encode('rot13'), you have to write (for example): codecs.getdecoder('rot_13')

Ah, great, I thought they were not available at all!

> The major issue with {bytes,str}.(un)transform() is that we have only one registry for all codecs, and the registry was changed in Python 3 [...] To implement str.transform(), we need another register. Marc-Andre suggested (msg96374) to add tags to codecs

I’m confused: does the tags idea replace the idea of adding another registry?

> I'm still opposed to str->str (rot13) and bytes->bytes (hex, gzip, ...) operations using the codecs API. Developers have to use the right module.

Well, here I disagree with you and agree with MAL: str.encode and bytes.decode are strict, but the codec API in general is not restricted to str→bytes and bytes→str directions. Using the zlib or base64 modules vs. the codecs is a matter of style; sometimes you think it looks hacky, sometimes you think it’s very handy. And rot13 only exists as a codec!
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
STINNER Victor victor.stin...@haypocalc.com added the comment:

> What is the status of this issue?

rot13 codecs & friends were added back to Python 3.2 with {bytes,str}.(un)transform() methods: commit 7e4833764c88.

Codecs were disabled because of surprising error messages before the release of Python 3.2 final: issue #10807, commit ff1261a14573.

transform() and untransform() methods were also removed, I don't remember why/how exactly, maybe because new codecs were disabled.

So we have rot13 & friends in Python 3.2 and 3.3, but they cannot be used with the regular str.encode('rot13'), you have to write (for example):

    >>> codecs.getdecoder('rot_13')('rot13')
    ('ebg13', 5)
    >>> codecs.getencoder('rot_13')('ebg13')
    ('rot13', 5)

The major issue with {bytes,str}.(un)transform() is that we have only one registry for all codecs, and the registry was changed in Python 3 to ensure:

* encode: str->bytes
* decode: bytes->str

To implement str.transform(), we need another registry. Marc-Andre suggested (msg96374) to add tags to codecs:

    .encode_input_types = (str,)
    .encode_output_types = (bytes,)
    .decode_input_types = (bytes,)
    .decode_output_types = (str,)

I'm still opposed to str->str (rot13) and bytes->bytes (hex, gzip, ...) operations using the codecs API. Developers have to use the right module. If the API of these modules is too complex, we should add helpers to these modules, but not to builtin types. Builtin types have to be and stay simple and well defined.
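A slightly expanded version of the getencoder/getdecoder workaround described in this comment is sketched below. The stateless functions returned by the registry produce an (output, length_consumed) tuple, so callers normally unpack it. The sample string is illustrative, and the last part assumes an interpreter where str.encode() refuses non-text codecs (it catches both LookupError and TypeError because the exact exception has varied between versions).

```python
import codecs

rot13_encode = codecs.getencoder("rot_13")
rot13_decode = codecs.getdecoder("rot_13")

# Each stateless function returns (result, number_of_input_items_consumed).
scrambled, consumed = rot13_encode("Hello")
restored, _ = rot13_decode(scrambled)

print(scrambled, consumed, restored)

# The str.encode() convenience method is restricted to text encodings,
# so the same codec name is rejected there.
try:
    "Hello".encode("rot_13")
except (LookupError, TypeError) as exc:
    print("str.encode refused rot_13:", type(exc).__name__)
```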
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Éric Araujo mer...@netwok.org added the comment: So. This was reverted before 3.2 was out, right? What is the status for 3.3? -- components: -2to3 (2.x to 3.0 conversion tool), Documentation
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Cherniavsky Beni b...@google.com: -- nosy: +cben
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Éric Araujo mer...@netwok.org: -- versions: +Python 3.3 -Python 3.2
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
STINNER Victor victor.stin...@haypocalc.com added the comment: See issue #10807: 'base64' can be used with bytes.decode() (and str.encode()), but it raises a confusing exception (TypeError: expected bytes, not memoryview).
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

With Georg's approval, I am reopening this issue until a decision is made on whether {str,bytes,bytearray}.{transform,untransform} methods should go into 3.2. I am adding Guido to nosy because the decision may turn on the interpretation of his post. [1] I also started a python-dev thread on this issue. [2]

[1] http://mail.python.org/pipermail/python-dev/2010-December/106374.html
[2] http://mail.python.org/pipermail/python-dev/2010-December/106617.html

-- components: +Unicode nosy: +gvanrossum resolution: fixed -> stage: -> commit review status: closed -> open type: -> feature request
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Marc-Andre Lemburg m...@egenix.com added the comment:

Martin v. Löwis wrote:
> Martin v. Löwis mar...@v.loewis.de added the comment:
>
> As per http://mail.python.org/pipermail/python-dev/2010-December/106374.html I think this checkin should be reverted, as it's breaking the language moratorium.

I've asked Guido. We may have to revert the addition of the new methods and then readd them for 3.3, but I don't really see them as difficult to implement for the other Python implementations, since they are just interfaces to the codec sub-system.

The readdition of the codecs and changes to support them in the codec system do not fall under the moratorium, since they are stdlib changes.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Martin v. Löwis mar...@v.loewis.de added the comment: As per http://mail.python.org/pipermail/python-dev/2010-December/106374.html I think this checkin should be reverted, as it's breaking the language moratorium.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Georg Brandl ge...@python.org added the comment: I leave this to MAL, on whose behalf I finished this to be in time for beta. -- assignee: -> lemburg
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
> Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
>
> I am probably a bit late to this discussion, but why these things should be called codecs and why should they share the registry with the encodings? It looks like the proper term would be transformations or transforms.

.transform() is just the name of the method. The codecs are still just that: codecs, i.e. objects that encode and decode data. The types they support are defined by the codecs, not by the helper methods.

In Python3, the str and bytes methods .encode() and .decode() will only support str->bytes->str conversions. The new str and bytes .transform() method adds back str->str and bytes->bytes.

The codec subsystem does not impose restrictions on the type combinations a codec can support, and that's per design.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Georg Brandl ge...@python.org added the comment: Codecs brought back and (un)transform implemented in r86934. -- resolution: -> fixed status: open -> closed
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: I am probably a bit late to this discussion, but why these things should be called codecs and why should they share the registry with the encodings? It looks like the proper term would be transformations or transforms. -- nosy: +belopolsky
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Martin v. Löwis mar...@v.loewis.de added the comment:

> I would like to know what happened with hex_codec and what is the new py3 for this.

If you had read this bug report, you'd know that the codec was removed in Python 3. Use binascii.hexlify/binascii.unhexlify instead (as you should in 2.x, also).
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Marc-Andre Lemburg m...@egenix.com added the comment:

Martin v. Löwis wrote:
> Martin v. Löwis mar...@v.loewis.de added the comment:
>
>> I would like to know what happened with hex_codec and what is the new py3 for this.
>
> If you had read this bug report, you'd know that the codec was removed in Python 3. Use binascii.hexlify/binascii.unhexlify instead (as you should in 2.x, also).

... or wait for Python 3.2 which will readd them :-)
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Georg Brandl ge...@python.org added the comment: ... but don't wait too long!
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Georg Brandl ge...@python.org:
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Georg Brandl ge...@python.org added the comment: ... but don't wait too long to add them!
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Marc-Andre Lemburg m...@egenix.com added the comment:

Georg Brandl wrote:
> Georg Brandl ge...@python.org added the comment:
>
> ... but don't wait too long to add them!

I plan to work on that after EuroPython. Florent already provided the patch for the codecs, so what's left is adding the .transform()/.untransform() methods, and perhaps tweaking the codec input/output types in a couple of cases.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Changes by Marc-Andre Lemburg m...@egenix.com: -- versions: -Python 2.7, Python 3.1
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Éric Araujo mer...@netwok.org added the comment: I am confused by MvL’s reply. From the first paragraph of the documentation for binascii: “Normally, you will not use these functions directly but use wrapper modules like uu, base64, or binhex instead. The binascii module contains low-level functions written in C for greater speed that are used by the higher-level modules.” Is the doc not accurate? Also, can someone who is sure about the status of this report edit the type, stage, component and resolution? It would be helpful.
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
Martin v. Löwis mar...@v.loewis.de added the comment: "I am confused by MvL’s reply. From the first paragraph documentation for binascii: “Normally, you will not use these functions directly but use wrapper modules like uu, base64, or binhex instead. The binascii module contains low-level functions written in C for greater speed that are used by the higher-level modules.” Is the doc not accurate?" It is correct. So use base64.b16encode/b16decode then. It's just that I personally prefer hexlify/unhexlify, because I can memorize the function names better.
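A short sketch comparing the two spellings mentioned in this message. The sample value is illustrative; the one practical difference shown is that base64.b16encode emits uppercase digits, and base64.b16decode rejects lowercase input unless casefold=True is passed.

```python
import base64
import binascii

data = b"hello"

upper_hex = base64.b16encode(data)    # uppercase hex digits
lower_hex = binascii.hexlify(data)    # lowercase hex digits

# b16decode accepts lowercase input only with casefold=True
decoded = base64.b16decode(lower_hex, casefold=True)

print(upper_hex)   # b'68656C6C6F'
print(lower_hex)   # b'68656c6c6f'
print(decoded)     # b'hello'
```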
[issue7475] codecs missing: base64 bz2 hex zlib hex_codec ...
sorin sorin.sbar...@gmail.com added the comment: I would like to know what happened with hex_codec and what is the new py3 for this. Also, it would be really helpful to see DeprecationWarnings for all these codecs in py2x and to include a note in the py3 changelist. The official Python documentation at http://docs.python.org/library/codecs.html lists them as valid, without any sign of them being dropped or replaced. -- nosy: +sorin title: codecs missing: base64 bz2 hex zlib ... -> codecs missing: base64 bz2 hex zlib hex_codec ...