Armin Ronacher wrote:

> The middleware can never know.

It's much more likely to know than the server, though!

> WSGI will demand UTF-8 URLs and only
> provide iso-XXX support for backwards compatibility.

It doesn't sound much like backwards compatibility to me if non-UTF-8 URLs break as soon as they coincidentally happen to be UTF-8 byte sequences. I'm as much an advocate of "UTF-8 for everything everywhere!" as anyone else, but unfortunately today there are still dark places where you need non-UTF-8 URLs.
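To make the failure concrete, here is a minimal sketch. The function name decode_utf8_first and the "try UTF-8, then fall back" policy are assumptions made for illustration, not anything a particular server or the proposal specifies. It shows iso-8859-1 path bytes that happen to form a valid UTF-8 sequence and so get decoded to the wrong characters without any error.

```python
def decode_utf8_first(raw_path):
    """Hypothetical server policy: try UTF-8, fall back to iso-8859-1."""
    try:
        return raw_path.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw_path.decode("iso-8859-1"), "iso-8859-1"


# A legacy client sends the two characters "Ã©" encoded as iso-8859-1.
# Those two bytes are also a valid UTF-8 sequence (for "é").
raw = b"/caf\xc3\xa9"

intended = raw.decode("iso-8859-1")         # what the client meant
guessed, encoding = decode_utf8_first(raw)  # what the fallback policy picks

# The guess succeeds as UTF-8, so the fallback never triggers and the
# application sees a different string from the one that was intended.
print(repr(intended), repr(guessed), encoding)
```

Bytes that are not valid UTF-8, such as b"/caf\xe9", do reach the fallback, which is why the breakage only shows up for some URLs.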

Incidentally, if wsgi.uri_encoding is going to be the way to signal that the server has decoded bytes to characters using a known encoding, it should be stressed that this should only be set when that encoding is certain.

That is, wsgi.uri_encoding should be omitted (or None?) in cases where another party has already decoded (and maybe mangled) the bytes using an unknown encoding. In particular, CGI.
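A rough sketch of how an application or middleware might consume such a key, assuming the semantics suggested above (wsgi.uri_encoding is only a proposal in this thread, and the helper name path_bytes is made up for the example):

```python
def path_bytes(environ):
    """Return the original path bytes, or None if they cannot be recovered.

    Assumes the proposed semantics: 'wsgi.uri_encoding' is present only
    when the server itself decoded the bytes with a known encoding.
    """
    encoding = environ.get("wsgi.uri_encoding")  # missing or None: unknown
    if not encoding:
        return None
    # Re-encoding with the same codec reverses the server's decode step.
    return environ.get("PATH_INFO", "").encode(encoding)


known = {"PATH_INFO": "/caf\u00e9", "wsgi.uri_encoding": "iso-8859-1"}
unknown = {"PATH_INFO": "/caf\u00e9"}  # e.g. a CGI gateway

print(path_bytes(known))    # original bytes can be recovered
print(path_bytes(unknown))  # None: caller must not trust non-ASCII characters
```

With the bytes back in hand, the application can apply whatever encoding it knows its own URLs use; with None it at least knows not to guess.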

(In the case of Windows CGI the server will have decoded URI bytes into Unicode characters, using a charset which it is impossible to find out. In Apache it's iso-8859-1; in IIS it's UTF-8 as long as it was a valid UTF-8 sequence, otherwise it's the system codepage. This problem affects the non-CGI implementation isapi_wsgi, too. Then the variables are read as environment variables, which for Python 2 means another encode/decode step on Windows using the system codepage, mangling non-codepage characters. Python 3 has the opposite problem reading byte envvars using UTF-8, which won't be how Apache put them there.)
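The codepage step can be imitated in a few lines. This is only a simulation: cp1252 is an assumed system codepage, the helper name through_codepage is invented, and the "replace" error handler stands in for whatever substitution the platform actually performs.

```python
def through_codepage(text, codepage="cp1252"):
    """Simulate one encode/decode hop through a system codepage.

    cp1252 is an assumption; errors="replace" approximates the lossy
    substitution applied to characters the codepage cannot represent.
    """
    return text.encode(codepage, errors="replace").decode(codepage)


print(through_codepage("/caf\u00e9"))       # representable in cp1252: survives
print(through_codepage("/\u65e5\u672c"))    # not representable: characters are lost
```

Once the characters have been replaced there is nothing left to recover, whatever encoding the application assumes afterwards.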

If wsgi.uri_encoding is obligatory then in reality it will often be wrong, leaving us in the same pathetic predicament as with WSGI 1.0, where non-ASCII URIs don't work reliably at all.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/

_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig