Martin v. Löwis wrote:
> Users do
>
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
>
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.
>
> What it *should* tell them is
>
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'str' object has no attribute 'encode'

I think it would be even better if they got "ValueError: utf8 can only encode unicode objects". AttributeError is not much clearer than the UnicodeDecodeError.

That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64'), and if the zlib encoding raised an error when it was passed a unicode object (instead of implicitly encoding the string with the ascii encoding) that would be fine.

The pipe-like nature of .encode and .decode works very nicely for certain transformations, applicable to both unicode and byte objects. Let's not throw the baby out with the bathwater.

-- 
Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
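
To make the points in the message above concrete, here is a minimal sketch. It is written for Python 3, where str is the unicode type and bytes is the byte string, so it only emulates the Python 2 behaviour under discussion. The helper names (py2_style_encode, strict_utf8_encode, pipe_encode, pipe_decode) are hypothetical and exist only for this illustration; the error text in strict_utf8_encode is the one suggested in the message, not something any Python release is known to emit.

```python
import codecs


def py2_style_encode(byte_string, encoding):
    # Assumption: emulates what Python 2's str.encode did for unicode
    # codecs, i.e. an implicit ASCII decode followed by the real encode.
    return byte_string.decode("ascii").encode(encoding)


def strict_utf8_encode(obj):
    # Hypothetical stricter behaviour: refuse byte strings outright
    # instead of decoding them implicitly.
    if not isinstance(obj, str):
        raise ValueError("utf8 can only encode unicode objects")
    return obj.encode("utf-8")


def pipe_encode(data, *codec_names):
    # Apply bytes-to-bytes codecs left to right, like chained .encode calls.
    for name in codec_names:
        data = codecs.encode(data, name)
    return data


def pipe_decode(data, *codec_names):
    # Undo a pipeline; pass the codec names in reverse order.
    for name in codec_names:
        data = codecs.decode(data, name)
    return data


# The name as a Latin-1 byte string, so that the o-umlaut is byte 0xf6.
name_bytes = "Martin v. L\u00f6wis".encode("latin-1")

implicit_error = None
try:
    py2_style_encode(name_bytes, "utf-8")
except UnicodeDecodeError as exc:
    # The hidden ASCII decode fails, not the requested UTF-8 encode.
    implicit_error = exc

# Rough equivalent of s.encode('zlib').encode('base64') and its inverse.
packed = pipe_encode(name_bytes, "zlib_codec", "base64_codec")
restored = pipe_decode(packed, "base64_codec", "zlib_codec")
```

The failure in py2_style_encode comes from the hidden ASCII decode step, which is why the user sees a decode error after asking for an encode. The zlib and base64 steps never need such a step, because they map bytes to bytes.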