On Jun 21, 2010, at 10:20 PM, Nick Coghlan wrote:

>Something that may make sense to ease the porting process is for some
>of these "on the boundary" I/O related string manipulation functions
>(such as os.path.join) to grow "encoding" keyword-only arguments. The
>recommended approach would be to provide all strings, but bytes could
>also be accepted if an encoding was specified. (If you want to mix
>encodings - tough, do the decoding yourself).

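A minimal sketch of what the quoted suggestion might look like, assuming a hypothetical helper named join_path (os.path.join itself takes no encoding argument). Strings pass through untouched, and bytes are only accepted when an encoding is given:

```python
import posixpath


def join_path(first, *rest, encoding=None):
    """Join path components, decoding bytes components with `encoding`.

    Hypothetical illustration only; this is not an existing stdlib API.
    """
    def as_text(part):
        if isinstance(part, bytes):
            if encoding is None:
                # No encoding supplied: refuse to guess.
                raise TypeError("bytes component given without an encoding")
            return part.decode(encoding)
        return part

    # posixpath is used so the result does not depend on the platform.
    return posixpath.join(as_text(first), *(as_text(p) for p in rest))


print(join_path("home", b"caf\xc3\xa9", encoding="utf-8"))
```

Note that a single encoding applies to every bytes component, which matches the "if you want to mix encodings, do the decoding yourself" rule in the quote.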
This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
for it.

Would it make sense to have "encoding-carrying" bytes and str types?
Basically, I'm thinking of types (maybe even the current ones) that carry
around a .encoding attribute so that they can be automatically encoded and
decoded where necessary. This at least would simplify APIs that need to do
the conversion.

By default, the .encoding attribute would be some marker to indicate "I have
no idea, do it explicitly", and if you combine ebytes or estrs that have
incompatible encodings, you'd either throw an exception or reset the
.encoding to IAmConfuzzled.

But say you had an email header like:

=?euc-jp?b?pc+l7aG8pe+hvKXrpcmhqg==?=

And code like the following (made less crappy):

-----snip snip-----
class ebytes(bytes):
    encoding = 'ascii'

    def __str__(self):
        s = estr(self.decode(self.encoding))
        s.encoding = self.encoding
        return s


class estr(str):
    encoding = 'ascii'


s = str(b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa', 'euc-jp')
b = bytes(s, 'euc-jp')

eb = ebytes(b)
eb.encoding = 'euc-jp'
es = str(eb)
print(repr(eb), es, es.encoding)
-----snip snip-----

Running this you get:

b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa' ハローワールド! euc-jp

Would it be feasible? Dunno. Would it help ease the bytes/str confusion?
Dunno. But I think it would help make APIs easier to design and use because
it would cut down on the encoding-keyword function signature infection.

-Barry
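
The snippet in the message does not cover the combining rule (raise on incompatible encodings). A minimal sketch of that rule, under some assumptions of mine: None stands in for the "I have no idea" marker, the EncodingMismatch exception and the to_text method are hypothetical names, and the "throw an exception" option is chosen over resetting the encoding:

```python
class EncodingMismatch(ValueError):
    """Raised when two values with different encodings are combined."""


class ebytes(bytes):
    # None is an assumed stand-in for "I have no idea, do it explicitly".
    encoding = None

    def __new__(cls, data=b"", encoding=None):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Plain bytes carry no .encoding, so they count as "unknown".
        other_encoding = getattr(other, "encoding", None)
        if self.encoding != other_encoding:
            raise EncodingMismatch(
                "cannot combine %r with %r" % (self.encoding, other_encoding)
            )
        return ebytes(bytes(self) + bytes(other), self.encoding)

    def to_text(self):
        # Decode only when the encoding is actually known.
        if self.encoding is None:
            raise EncodingMismatch("encoding unknown, decode explicitly")
        return bytes(self).decode(self.encoding)


head = ebytes(b"\xa5\xcf", "euc-jp")
tail = ebytes(b"\xa5\xed", "euc-jp")
combined = head + tail
print(combined.encoding, combined.to_text())
```

Combining an 'ascii' ebytes with a 'utf-8' one raises EncodingMismatch here, even though the byte values might be compatible; a real design would have to decide whether such cases should be allowed.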
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com