On Mon, Sep 29, 2008 at 5:14 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Adam Olsen wrote:
>> There's no solution except to not
>> decode, and 8859-1 is the way to do that.
>
> I think you need to elaborate that. What does ISO-8859-1 have to do
> with a Python datatype in this context: which datatype, and what
> algorithm on it are you specifically referring to?
>
> When I do (in 2.x)
>
> py> "foo".decode("iso-8859-1")
> u'foo'
>
> ISTM that 8859-1 is all about decoding, so I don't understand why
> you say it is a way not to decode.

8859-1 has no invalid bytes and is a 1-to-1 mapping: each byte value
0-255 decodes to the code point with the same number.  If you have an
API that always returns unicode but accepts an encoding, you can pass
it 8859-1, then reencode the result using 8859-1 to get back the
original bytes.
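
A minimal sketch of that round trip, written for Python 3 (where
bytes and str are separate types).  The filename bytes are a made-up
example, not taken from any real API:

```python
# Every byte value 0..255 is valid ISO-8859-1 and decodes to the code
# point with the same number, so decode followed by encode is lossless.
raw = bytes(range(256))

text = raw.decode("iso-8859-1")
assert [ord(ch) for ch in text] == list(range(256))

round_tripped = text.encode("iso-8859-1")
assert round_tripped == raw

# Hypothetical filename bytes that are not valid UTF-8.
name = b"caf\xe9.txt"
smuggled = name.decode("iso-8859-1")       # a str, but only a carrier
recovered = smuggled.encode("iso-8859-1")  # the original bytes again
assert recovered == name
```

The intermediate str is only a container for the bytes; whether its
characters mean anything depends on what the real encoding was.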

It's an ugly hack, but more correct than UTF-8b or any similar attempt
to do "unicode but not quite unicode": such a scheme is either lossy,
or its output is not unicode.  There's no in between.
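
A rough illustration of the two failure modes, again for Python 3.
It assumes the "surrogateescape" error handler (added later, in
Python 3.1) is a fair stand-in for the UTF-8b idea; the input bytes
are a made-up example:

```python
raw = b"caf\xe9.txt"  # hypothetical bytes, not valid UTF-8

# Option 1: stay within unicode, but lose information.
replaced = raw.decode("utf-8", errors="replace")
lossy_back = replaced.encode("utf-8")
assert lossy_back != raw  # the 0xE9 byte became U+FFFD and is gone

# Option 2: keep the information, but leave unicode.
# (Assumption: surrogateescape approximates UTF-8b here.)
escaped = raw.decode("utf-8", errors="surrogateescape")
assert escaped.encode("utf-8", errors="surrogateescape") == raw

# The result holds a lone surrogate, so a strict encoder rejects it.
try:
    escaped.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
assert not strict_ok
```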


-- 
Adam Olsen, aka Rhamphoryncus
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000