On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
> Phillip J. Eby wrote:
> [snip..]
> >
> > In fact, the 'encoding' argument seems useless in the case of str objects,
> > and it seems it should default to latin-1 for unicode objects. The only
> >
>
> -1 for having an implicit encode that behaves differently to other
> implicit encodes/decodes that happen in Python. Life is confusing enough
> already.
But adding an encoding doesn't help. The str.encode() method always
assumes that the string itself is ASCII-encoded, and that's not good
enough:

>>> "abc".encode("latin-1")
'abc'
>>> "abc".decode("latin-1")
u'abc'
>>> "abc\xf0".decode("latin-1")
u'abc\xf0'
>>> "abc\xf0".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)
>>>

The right way to look at this is, as Phillip says, to consider
conversion between str and bytes as not an encoding but a data type
change *only*.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
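To make the "data type change only" idea concrete, here is a minimal sketch
(illustrative only, written against today's Python 3 str/bytes types rather
than anything proposed in this thread; the helper names text_to_bytes and
bytes_to_text are made up): the conversion is just a one-to-one mapping
between code points 0-255 and byte values, which is exactly what the
Latin-1 codec happens to do, with no implicit ASCII step anywhere.

    # Illustrative sketch: str <-> bytes as a plain "data type change".
    # Each code point below 256 maps to the byte of the same value
    # (the Latin-1 identity mapping); nothing is implicitly decoded as ASCII.

    def text_to_bytes(s):
        # bytes() raises ValueError for any code point >= 256.
        return bytes(ord(ch) for ch in s)

    def bytes_to_text(b):
        return ''.join(chr(byte) for byte in b)

    s = "abc\xf0"
    b = text_to_bytes(s)          # b'abc\xf0' -- no UnicodeDecodeError here
    assert bytes_to_text(b) == s  # round-trips exactly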