On Tue, Sep 22, 2009 at 11:26:15PM -0400, P.J. Eby wrote:
> +1, if you mean the strings have the same content, 
> character-for-character on Python 2.3.  That is, a \x80 byte in a 
> Python 2 'str' is matched by an \x80 character in the Python 3 
> 'str'.  (I presume that's what we mean by "native", but I want to be sure.)

It is the case (Python 3 code):

    >>> ord(b'\x80'.decode('latin1')) == b'\x80'[0]
    True

Also I'd like to point out that the "Cookie problem" could be more
general than we think. HTTP_COOKIE is the only header we have identified
so far with a weird encoding scheme. But I am pretty sure some idiots
have created, or will create, other weird headers with strange encoding
schemes ("let's mix UTF-8 & latin1 just for the fun of it").

Defaulting to latin-1, which maps every byte to exactly one character
and so never fails to decode, ensures that WSGI is solid enough to face
these weird situations.
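
To illustrate the point, here is a minimal Python 3 sketch. The helper
names (to_native, recode) and the sample header values are made up for
this example and are not part of any WSGI proposal. It shows that a
latin-1 decode accepts any byte sequence, including a value that mixes
UTF-8 and latin-1, and that the original bytes can always be recovered
and decoded again by whoever knows the real encoding:

```python
# Hypothetical helpers for illustration only; not part of any WSGI spec.

def to_native(raw: bytes) -> str:
    """Decode raw header bytes as latin-1 (never raises, loses nothing)."""
    return raw.decode("latin-1")


def recode(native: str, encoding: str) -> str:
    """Recover the original bytes, then decode them with the real encoding."""
    return native.encode("latin-1").decode(encoding)


# Every possible byte value survives the round trip unchanged.
all_bytes = bytes(range(256))
assert to_native(all_bytes).encode("latin-1") == all_bytes

# A header value that is really UTF-8 (assumed sample value).
raw_cookie = "name=café".encode("utf-8")
native = to_native(raw_cookie)
print(recode(native, "utf-8"))  # name=café

# A "weird" value mixing UTF-8 and latin-1 in one header.
mixed = "café".encode("utf-8") + b";" + "café".encode("latin-1")
try:
    mixed.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# A strict UTF-8 decode rejects it, while latin-1 still hands the
# application a string from which the exact bytes can be recovered.
assert utf8_ok is False
assert to_native(mixed).encode("latin-1") == mixed
```

With this approach the server never has to guess; only the application,
which may know how a given header was encoded, pays the cost of
re-decoding it.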

I strongly back the use of a single encoding. The proposed
wsgi.uri_encoding method doesn't seem to add anything compared to

Ian's proposal seems to be fairly complete and addresses all the issues we
had, with the exception of the outstanding issues he pointed out at the
end of his mail.

  Henry Prêcheur
Web-SIG mailing list
Web SIG: http://www.python.org/sigs/web-sig
