> A middleware might re-decode the values if the `wsgi.uri_encoding` is
> `iso-8859-1` and only then.
This seems like a mistake. If the middleware knows iso-8859-7 is in use, it
would need to transcode the charset regardless of whether the
initially submitted bytes were a valid UTF-8 sequence or not. Otherwise
the application would break when fed with, e.g., Greek words that happened
to encode to valid UTF-8 bytes.
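
To make the failure concrete, here is a minimal sketch. The sniffer and
both middleware functions are hypothetical stand-ins, not code from the
draft, and the two-letter string is simply an ISO-8859-7 byte sequence
that happens to be well-formed UTF-8.

```python
def sniff_decode(raw):
    """Hypothetical sniffer: try UTF-8 first, fall back to ISO-8859-1."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1"), "iso-8859-1"


def redecode_if_latin1(text, uri_encoding, real_charset):
    """The quoted rule: transcode only when the sniffer fell back."""
    if uri_encoding == "iso-8859-1":
        return text.encode("iso-8859-1").decode(real_charset)
    return text


def redecode_always(text, uri_encoding, real_charset):
    """Transcode whatever the sniffer guessed, as argued above."""
    return text.encode(uri_encoding).decode(real_charset)


# Capital beta followed by capital alpha with tonos.
greek = "\u0392\u0386"
raw = greek.encode("iso-8859-7")        # b'\xc2\xb6', also valid UTF-8

text, uri_encoding = sniff_decode(raw)  # sniffer wrongly reports UTF-8

broken = redecode_if_latin1(text, uri_encoding, "iso-8859-7")
fixed = redecode_always(text, uri_encoding, "iso-8859-7")
```

Here `broken` stays the single character U+00B6 that the sniffer
produced, while `fixed` gets the two Greek letters back.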
> The application MUST use this value to decode the ``'QUERY_STRING'``
> as well.
This will break all use of non-UTF-8 encodings in QUERY_STRING whenever
the path part of the URL contains no non-UTF-8 sequences. That
includes the very common case where the path part contains only ASCII.
http://greek.example.com/myscript.cgi?x=%C2
will fail, as the given UTF-8 sniffer looks only at the path part to
determine what encoding to use for both the path part and the query
string. I don't think WSGI should mandate any particular decoding of the
QUERY_STRING.
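
A rough sketch of that case follows. The `sniff_path_encoding` helper is
a hypothetical stand-in for the draft's sniffer, and reading the value
as iso-8859-7 is an assumption about what the example site intends.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def sniff_path_encoding(path):
    """Hypothetical sniffer that inspects only the path part."""
    raw = unquote_to_bytes(path)
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "iso-8859-1"


parts = urlsplit("http://greek.example.com/myscript.cgi?x=%C2")

# The path is pure ASCII, so the sniffer reports UTF-8.
encoding = sniff_path_encoding(parts.path)

# The query value is the single byte 0xC2, which is not valid UTF-8.
value = unquote_to_bytes(parts.query.split("=", 1)[1])

try:
    value.decode(encoding)
    query_ok = True
except UnicodeDecodeError:
    query_ok = False

# What the site presumably meant (assumed charset): capital beta.
intended = value.decode("iso-8859-7")
```

The path gives the sniffer no evidence either way, so the mandated
decoding of the query string fails on a value the application could
have decoded correctly itself.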
To be honest, I'm still uncomfortable with any use of Unicode strings in
WSGI. But if we're going to do it, I'd go for consistency. Treating the
decoding of the URL specially is a nasty hack that is only there because
the CGI spec stupidly requires %-decoding to be done on PATH_INFO and
SCRIPT_NAME.
So why not go with (the long-ago suggested) optional variables like
'wsgi.real_path_info' that, if present, are the original strings before
%-decoding? Then it doesn't greatly matter what string types and
encodings we pick, because everything will be ASCII anyway. It also
solves the %2F problem, since an encoded slash stays distinct from a
real path separator.
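
As a sketch of how an application might consume such a variable: the
key 'wsgi.real_path_info' is the name suggested above, while the
`path_segments` helper and its `encoding` parameter are hypothetical.
The point is only that splitting happens before %-decoding, and that
the application picks the charset.

```python
from urllib.parse import unquote


def path_segments(environ, encoding="utf-8"):
    """Return decoded path segments, preferring the undecoded path."""
    raw = environ.get("wsgi.real_path_info")
    if raw is not None:
        # Split on literal slashes first, then %-decode each segment,
        # so an encoded %2F stays inside its segment.
        return [unquote(seg, encoding=encoding)
                for seg in raw.split("/") if seg]
    # Fallback: PATH_INFO is already decoded, so %2F is indistinguishable
    # from a real separator.
    return [seg for seg in environ.get("PATH_INFO", "").split("/") if seg]


with_raw = {"wsgi.real_path_info": "/files/a%2Fb", "PATH_INFO": "/files/a/b"}
cgi_only = {"PATH_INFO": "/files/a/b"}

segments_raw = path_segments(with_raw)   # ['files', 'a/b']
segments_cgi = path_segments(cgi_only)   # ['files', 'a', 'b']
```

Because the raw value is ASCII, the same helper works for a Greek site
by passing `encoding="iso-8859-7"`, with no sniffing involved.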
If those variables are not present (typically for CGI environments that
cannot provide them), the application/framework *may* try to recover
non-ASCII characters from PATH_INFO/QUERY_STRING, with undefined
results. This is the broken-but-sometimes-rescuable status quo for CGI:
by the time Python reads non-ASCII characters out of the environment
they may already have been mangled by up to two conversion processes.
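
A best-effort recovery might look like the sketch below. It assumes the
gateway handed over each byte as the ISO-8859-1 character with the same
code point, which is only one of several ways the value could have been
converted; the `try_recover` helper and its `candidates` parameter are
hypothetical.

```python
def try_recover(value, candidates=("utf-8",)):
    """Guess the original text; return (text, encoding or None)."""
    try:
        # Undo the assumed byte-to-character mapping.
        raw = value.encode("iso-8859-1")
    except UnicodeEncodeError:
        # Characters above U+00FF: some other conversion happened,
        # so there is nothing reliable to undo.
        return value, None
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    return value, None


# UTF-8 bytes for "/caf\u00e9" as they would appear after the assumed
# ISO-8859-1 round trip.
recovered, guessed = try_recover("/caf\u00c3\u00a9")

# A value that was never UTF-8 is returned unchanged.
untouched, no_guess = try_recover("/caf\u00e9")
```

A `None` encoding means the guess failed, and a successful guess can
still be wrong when the original bytes merely happened to be valid in
the candidate encoding.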
--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig