Guido van Rossum <gu...@python.org> wrote:
> On Wed, Apr 1, 2009 at 5:18 AM, Robert Brewer <fuman...@aminus.org> wrote:
> > Good timing. We had been thinking to make everything strings except for
> > SCRIPT_NAME, PATH_INFO, and QUERY_STRING, since these few are pulled
> > from the Request-URI, which may be in any encoding. It was thought that
> > the app would be best-qualified to decode those three.
>
> Argh. The *meaning* of these fields is clearly text.

I wouldn't read too much into those names -- they were chosen when the CGI spec was just gestating, long before the usage patterns solidified, and don't necessarily reflect the usage of the data bound to them. I believe this work was done before the formal IETF definition of a URL, for instance. I think the controlling reference here is RFC 3875.

It's not at all clear to me what the SCRIPT_NAME is. Is it a pathname, involving the local file system's filenames, which recent discussions seem to indicate may or may not correspond to human-notional strings, or a URI path? I'm OK with calling it text, with a proviso that there may be cases where it's not.

I've never actually seen a CGI call with PATH_INFO set; I think it's obsolete usage (but pretty clearly a string). RFC 3875 says, "Similarly, treatment of non US-ASCII characters in the path is system-defined."

QUERY_STRING -- should always be an ASCII string. May indeed encode non-Unicode strings or purely binary data, but when passed to the CGI script, it's still encoded as it was in the URI.

Bill
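
P.S. To illustrate the QUERY_STRING point, here is a minimal sketch, assuming Python 3 and its standard urllib.parse module. The helper name decode_query_value and the sample query string are made up for illustration; nothing here is taken from the WSGI or CGI specs. It shows that the value handed to the script is plain ASCII, and that the percent-decoded bytes may or may not turn out to be text in a given encoding:

```python
from urllib.parse import unquote_to_bytes


def decode_query_value(raw_value, encoding="utf-8"):
    """Hypothetical helper: percent-decode one query value, then try to read it as text.

    Returns (text, encoding) when the bytes decode cleanly,
    or (raw_bytes, None) when they do not, e.g. for binary data.
    """
    raw_bytes = unquote_to_bytes(raw_value)
    try:
        return raw_bytes.decode(encoding), encoding
    except UnicodeDecodeError:
        return raw_bytes, None


# Made-up query string: still pure ASCII, exactly as it appeared in the URI.
query_string = "name=caf%C3%A9&blob=%FF%FE"
query_string.encode("ascii")  # would raise UnicodeEncodeError if it were not ASCII

# Naive split for illustration only; '+' and repeated keys are not handled.
pairs = dict(item.split("=", 1) for item in query_string.split("&"))

name = decode_query_value(pairs["name"])  # percent-encoded UTF-8 text
blob = decode_query_value(pairs["blob"])  # percent-encoded bytes that are not UTF-8
```

Under these assumptions the application, not the server, decides which encoding to try, which is the argument for leaving the decoding to the app.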