Aahz wrote:
> On Sat, Feb 18, 2006, Ron Adam wrote:
>> I like the bytes.recode() idea a lot. +1
>>
>> It seems to me it's a far more useful idea than encoding and decoding by
>> overloading and could do both and more.  It has a lot of potential to be
>> an intermediate step for encoding as well as being used for many other
>> translations to byte data.
>>
>> I think I would prefer that encode and decode be just functions with
>> well defined names and arguments instead of being methods or arguments
>> to string and Unicode types.
>>
>> I'm not sure on exactly how this would work. Maybe it would need two
>> sets of encodings, ie.. decoders, and encoders.  An exception would be
>> given if it wasn't found for the direction one was going in.
>
> Here's an idea I don't think I've seen before:
>
>     bytes.recode(b, src_encoding, dest_encoding)
>
> This requires the user to state up-front what the source encoding is.
> One of the big problems that I see with the whole encoding mess is that
> so much of it contains implicit assumptions about the source encoding;
> this gets away from that.
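
To make the quoted suggestion concrete, here is a minimal sketch of what such a call might do, written as a plain function. The name recode() and its behaviour are assumptions for illustration only: no such method exists on bytes, and this version assumes both arguments name text codecs that Python already knows.

```python
def recode(b, src_encoding, dest_encoding):
    """Hypothetical helper: re-encode bytes from one text encoding to another."""
    if not isinstance(b, (bytes, bytearray)):
        raise TypeError("recode() expects bytes, got %s" % type(b).__name__)
    # Decode with the stated source encoding, then encode to the destination.
    text = bytes(b).decode(src_encoding)
    return text.encode(dest_encoding)


# The caller has to state both encodings up front.
latin1_data = "caf\u00e9".encode("latin-1")
utf8_data = recode(latin1_data, "latin-1", "utf-8")
```

Only the two encodings are spelled out here. The call says nothing about what kind of object the bytes came from or what they will be turned into afterwards.
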
Yes, but it's not just the encodings that are implicit, it is also the types.

    s.encode(codec)                    # explicit source type, ? dest type
    s.decode(codec)                    # explicit source type, ? dest type

    encodings.tostr(obj, codec)        # implicit *known* source type
                                       # explicit dest type
    encodings.tounicode(obj, codec)    # implicit *known* source type
                                       # explicit dest type

In this case the source is implicit, but there can be a well defined check
to validate the source type against the codec being used.  It's my feeling
the user *knows* what he already has, and so it's more important that the
resulting object type is explicit.

In your suggestion...

    bytes.recode(b, src_encoding, dest_encoding)

Here the encodings are both explicit, but both the source and the
destination of the bytes are not.  Since it is working on bytes, they could
have come from anywhere, and after the translation they would then be cast
to the type the user *thinks* it should result in.  That is a source of
errors that would likely pass silently.

The way I see it, the bytes type should be a lower level object that
doesn't care what byte transformation it does.  I.e., they are all one-way
byte-to-byte transformations determined by context.  And it should have the
capability to read from and write to types without translating in the same
step.  Keep it simple.  Then it could be used as a lower level byte
translator to implement encodings and other translations in encoding
methods or functions instead of trying to make it replace the higher level
functionality.

Cheers,
   Ron