On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull <step...@xemacs.org> wrote:
> Michael Urman writes:
>
>  > It is somewhat troublesome that there doesn't appear to be an obvious
>  > built-in idempotent-when-possible function that gives back the
>  > provided bytes/str,
>
> If you want something idempotent, it's already the case that
> bytes(b'abc') => b'abc'.  What might be desirable is to make
> bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> (or maybe ISO 8859/1).

By idempotent-when-possible, I mean a to_bytes(str_or_bytes, encoding, errors) that would pass an instance of bytes through, or encode an instance of str. And of course a to_str that performs similarly, passing str through and decoding bytes. While bytes(b'abc') will give me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') gets me the b'abc' I want to see.

These are trivial functions; I just don't fully understand why the capability isn't baked in. A one-argument call is idempotent-capable; a two-argument call isn't, as it only converts.

It's not a completely made-up requirement either. A cross-platform piece of software may need to present to a user items that are sometimes str and sometimes bytes - particularly filenames.

> Unfortunately, str(b'abc') already does work, but
>
> st...@uwakimon ~ $ python3.1
> Python 3.1.2 (release31-maint, May 12 2010, 20:15:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> str(b'abc')
> "b'abc'"
> >>>
>
> Oops.  You can see why that probably "should" be the case

Sure, and I love having this there for debugging. But this is hardly good enough for presenting to a user once you leave ASCII.

>>> u = '日本語'
>>> sjis = bytes(u, 'shift-jis')
>>> utf8 = bytes(u, 'utf-8')
>>> str(sjis), str(utf8)
("b'\\x93\\xfa\\x96{\\x8c\\xea'", "b'\\xe6\\x97\\xa5\\xe6\\x9c\\xac\\xe8\\xaa\\x9e'")

When I happen to know the encoding, I can reverse it much more cleanly.

>>> str(sjis, 'shift-jis'), str(utf8, 'utf-8')
('日本語', '日本語')

But I can't mix this approach with str instances without writing a different invocation.

>>> str(u, 'argh')
TypeError: decoding str is not supported

-- 
Michael Urman
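
For reference, the two "trivial functions" described in the message could be sketched roughly as follows. This is only an illustration of the pass-through-or-convert behaviour being asked for, not anything in the standard library: the names to_bytes and to_str come from the message, while the default values for encoding and errors and the TypeError raised for other types are assumptions made here.

```python
def to_bytes(value, encoding="utf-8", errors="strict"):
    """Pass bytes through unchanged; encode str with the given encoding.

    The defaults for encoding and errors are assumptions for this sketch.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def to_str(value, encoding="utf-8", errors="strict"):
    """Pass str through unchanged; decode bytes with the given encoding.

    The defaults for encoding and errors are assumptions for this sketch.
    """
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


# The same invocation now works whichever type comes in.
u = '日本語'
sjis = to_bytes(u, 'shift-jis')
assert to_bytes(sjis, 'shift-jis') is sjis   # bytes pass through untouched
assert to_str(sjis, 'shift-jis') == u        # bytes are decoded
assert to_str(u, 'shift-jis') is u           # str passes through untouched
```

Note that the encoding argument is simply ignored when the value is already of the target type, which is what makes a two-argument call idempotent here.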