Stefan Rank wrote:
> I suggest to add (after 2.5 I assume) one of the following to the
> beginning of urllib.quote to either fail early and consistently on
> unicode arguments and improve the error message::
>
> if isinstance(s, unicode):
> raise TypeError("quote needs a byte string argument, not unicode,"
> " use `argument.encode('utf-8')` first.")
>
> or to do The Right Thing (tm), which is utf-8 encoding::
The right thing to do is IRIs. This is more complicated than encoding
the Unicode string as UTF-8, though: for the host part of the URL, you
have to encode it with IDNA (and there are additional complicated rules
in place, e.g. when the Unicode string already contains %).
Contributions are welcome, as long as they fix this entire issue "for
good" (i.e. in all URL-processing code, and considering all relevant
RFCs).
Regards,
Martin
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com