Guido van Rossum wrote:
> Currently (in 3.0), "".join(<seq>) automatically applies str() to the
> items of <seq>, *except* if the item is a bytes instance -- then it
> raises a TypeError. Is that proper behavior? The alternative is to
> uniformly apply str(), which for bytes returns a string of the form
> "b'...'" or "buffer(b'...')" (depending on whether the bytes are
> immutable or not). Given that we killed the exception for "" == b""
> earlier, I'm tempted to remove the exception. Any opinions to the
> contrary?

-1

In Python 2.x the implicit encoding of a string with
sys.getdefaultencoding() caused me more than one headache. I fear the
implicit conversion of a byte sequence to its representation may cause
similar problems. If we take one step down that road we can't go back
again.

''.join() could grow an encoding argument, but that's ugly, too.

    ''.join(s if isinstance(s, str) else str(s, 'utf-8') for s in seq)

works for me. :)

However, I would like b''.join, buffer().join and the other methods to
accept both buffers and bytes. I don't see a reason why the methods
shouldn't accept them.

>>> b"".join((b'1', b'2'))
b'12'
>>> b"".join((buffer(b'1'), buffer(b'2')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected string, buffer found
>>> buffer().join((buffer(b'1'), buffer(b'2')))
buffer(b'12')
>>> buffer().join((b'1', b'2'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only join an iterable of bytes (item 0 has type 'bytes')

Christian
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
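
The explicit-decoding join suggested in the reply above can be sketched as
a small helper. This is only an illustration, not code from the thread:
the name join_text and its sep and encoding parameters are hypothetical,
and it assumes a current Python 3 interpreter, using bytearray to stand in
for the mutable byte type that the message calls buffer.

```python
def join_text(items, sep="", encoding="utf-8"):
    """Join items into one str, decoding byte sequences explicitly.

    Hypothetical helper: str items pass through unchanged, bytes-like
    items are decoded with the given encoding, and anything else is
    rejected instead of being converted through str().
    """
    parts = []
    for index, item in enumerate(items):
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, (bytes, bytearray)):
            # Decode on purpose, so the result never contains a
            # "b'...'" representation of the raw bytes.
            parts.append(str(item, encoding))
        else:
            raise TypeError(
                "item %d: expected str or bytes-like, got %s"
                % (index, type(item).__name__)
            )
    return sep.join(parts)


if __name__ == "__main__":
    print(join_text(["spam", b"eggs", bytearray(b"ham")], sep=" "))
```

The encoding has to be chosen by the caller at the point of the join,
which is the opposite of the implicit conversion the reply argues against.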
