Re: [Python-Dev] urllib.quote and unicode bug resuscitation attempt

Anthony Baxter Tue, 11 Jul 2006 19:43:25 -0700

On Wednesday 12 July 2006 07:16, Martin v. Löwis wrote:
> Stefan Rank wrote:
> > I suggest adding (after 2.5, I assume) one of the following to the
> > beginning of urllib.quote, either to fail early and consistently
> > on unicode arguments with a better error message::
> >
> >    if isinstance(s, unicode):
> >        raise TypeError("quote needs a byte string argument, not unicode,"
> >                        " use `argument.encode('utf-8')` first.")
> >
> > or to do The Right Thing (tm), which is utf-8 encoding::
>
> The right thing to do is IRIs. This is more complicated than
> encoding the Unicode string as UTF-8, though: for the host part of
> the URL, you have to encode it with IDNA (and there are additional
> complicated rules in place, e.g. when the Unicode string already
> contains %).
>
> Contributions are welcome, as long as they fix this entire issue
> "for good" (i.e. in all URL-processing code, and considering all
> relevant RFCs).
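
To make the two encodings mentioned above concrete, here is a rough sketch. The helper name iri_parts_to_ascii is made up for illustration, host and path are assumed to arrive already split, and the sketch deliberately ignores the rule about strings that already contain % as well as everything else the RFCs require. It only shows IDNA for the host and UTF-8 plus %-escapes for the path:

```python
try:
    from urllib import quote          # Python 2.x
except ImportError:
    from urllib.parse import quote    # fallback so the sketch runs on Python 3


def iri_parts_to_ascii(host, path):
    """Hypothetical helper, not part of urllib.

    Encodes a unicode host with the stdlib "idna" codec and a unicode
    path as UTF-8 followed by percent-escaping. Existing '%' characters
    in the path are NOT treated specially here.
    """
    ascii_host = host.encode("idna").decode("ascii")
    ascii_path = quote(path.encode("utf-8"))
    return ascii_host, ascii_path


print(iri_parts_to_ascii(u"b\xfccher.example", u"/caf\xe9"))
```

The host and the path end up with different encodings, which is why a single .encode('utf-8') over the whole URL is not enough.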


For 2.5, should we at least detect that it's unicode and raise a 
useful error?
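
Something along these lines is what I have in mind, as a sketch only. The wrapper name checked_quote is hypothetical (the real change would go at the top of urllib.quote itself), and the import fallback is there only so the snippet also runs on interpreters without a unicode builtin:

```python
try:
    from urllib import quote          # Python 2.x, the urllib under discussion
    text_type = unicode
except ImportError:
    from urllib.parse import quote    # fallback so the sketch runs on Python 3
    text_type = str


def checked_quote(s, safe="/"):
    """Hypothetical wrapper: reject text strings before quoting."""
    if isinstance(s, text_type):
        raise TypeError("quote needs a byte string argument, not unicode,"
                        " use `argument.encode('utf-8')` first.")
    return quote(s, safe)


# Byte strings pass through to quote() unchanged.
print(checked_quote(u"caf\xe9".encode("utf-8")))
```

Passing u"caf\xe9" directly raises the TypeError right away, instead of failing somewhere inside quote with a less helpful message.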


-- 
Anthony Baxter     <[EMAIL PROTECTED]>
It's never too late to have a happy childhood.