Re: [Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-18 Thread Andrew Clover
ctypes.windll.kernel32.GetEnvironmentVariableW(u'PATH_INFO', ...) Hmm... it turns out: no. IIS appears to be mangling characters that are not in mbcs even *before* it puts the decoded value into the envvars. The same is true with isapi_wsgi, which is the only other WSGI adapter I know of

Re: [Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-17 Thread Andrew Clover
Mark Hammond wrote: I don't think Python explicitly converts it - the CRT's ANSI version of environ is used Yes, it would be the CRT on Python 2.x. (Python 3.0 on non-NT does a conversion always using UTF-8, if I'm reading convertenviron right.) so the resulting strings should be encoded

Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-17 Thread Andrew Clover
Ian Bicking wrote: To resolve this, let's just not pass it over this time? +1

Re: [Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-14 Thread Andrew Clover
Ian Bicking wrote: As it is (in Python 2), you should do something like environ['PATH_INFO'].decode('utf8') and it should work. See the test cases in my original post: this doesn't work universally. On WinNT platforms PATH_INFO has already gone through a decode/encode cycle which almost
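A minimal Python 3 sketch of why such a decode/encode cycle can make a later `.decode('utf8')` fail. It only simulates the effect: `cp1252` is an assumed stand-in for the Windows ANSI ("mbcs") code page, the example path is made up, and the exact behaviour of any given server may differ.

```python
# Simulation only: cp1252 is an assumed stand-in for the ANSI ("mbcs") code page.
original_path = "/caf\u00e9/\u65e5\u672c"   # hypothetical request path, as Unicode

# Step 1: the value is stored as Unicode, then handed back through the ANSI
# environment, i.e. encoded to the ANSI code page. Characters that the code
# page cannot represent are replaced with '?'.
ansi_bytes = original_path.encode("cp1252", errors="replace")

# Step 2: the application assumes the bytes are UTF-8 and tries to decode them.
try:
    ansi_bytes.decode("utf-8")
    roundtrip_ok = True
except UnicodeDecodeError:
    roundtrip_ok = False

print(ansi_bytes)      # the CJK characters are already lost at this point
print(roundtrip_ok)
```

In this simulation the accented character survives only as a single ANSI byte, which is not valid UTF-8, and the two CJK characters are gone before the application ever sees the value.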

Re: [Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-14 Thread Andrew Clover
Ian Bicking wrote: This is something messed up with CGI on NT, and whatever server you are using, and perhaps the CGI adapter (maybe there's a way to get the raw environment without any encoding, for example?) Python decodes the environ to its own copy (wrapped in os.environ) at interpreter

[Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-12 Thread Andrew Clover
It would be lovely if we could allow WSGI applications to reliably accept Unicode paths. That is to say, allow WSGI apps to have beautiful URLs like Wikipedia's, without requiring URL-rewriting magic. (Which is so highly server-specific, potentially unavailable to non-admin webmasters, and

Re: [Web-SIG] problem with wsgiref.util.request_uri and decoded uri

2008-09-10 Thread Andrew Clover
Manlio Perillo wrote: On the other hand, if the WSGI gateway *do* decode the uri, I can no more handle '/' in uri. Correct. CGI requires that '%2F' is decoded, and hence indistinguishable from '/' when it gets to the application. And WSGI inherits CGI's flaws for compatibility.
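A short Python 3 illustration of the point about '%2F', using the standard library's `urllib.parse.unquote`. The two request paths are made-up examples; the sketch only shows that once percent-decoding has happened, the two are the same string.

```python
from urllib.parse import unquote

# Two different raw request paths (hypothetical examples).
raw_with_escaped_slash = "/files/a%2Fb"   # one path segment named "a/b"
raw_with_real_slash = "/files/a/b"        # two path segments, "a" and "b"

# After percent-decoding, as CGI requires for PATH_INFO, both look identical.
decoded_escaped = unquote(raw_with_escaped_slash)
decoded_real = unquote(raw_with_real_slash)

print(decoded_escaped == decoded_real)
```

An application that receives only the decoded value has no way to tell which of the two the client actually sent.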

Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-07 Thread Andrew Clover
James Y Knight wrote: In addition, I know of nobody who actually implements RFC 2047 decoding of http header values...nothing really uses it. (of course I don't know of all implementations out there.) Certainly no browser supports it, which makes the point moot for WSGI. Most browsers, when
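For readers unfamiliar with RFC 2047, this is roughly what an encoded-word looks like and how it decodes, sketched with Python 3's standard `email.header` module. The header value is a made-up example, and nothing here implies that any HTTP server or browser performs this step.

```python
from email.header import decode_header, make_header

# A hypothetical header value containing an RFC 2047 encoded-word
# (charset utf-8, "Q" encoding).
encoded_value = "=?utf-8?q?caf=C3=A9?="

# decode_header returns a list of (bytes, charset) pairs.
parts = decode_header(encoded_value)

# make_header reassembles the parts into a Header; str() yields the text.
decoded_text = str(make_header(parts))

print(parts)
print(decoded_text)
```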

Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-07 Thread Andrew Clover
Adam Atlas wrote: I'd say it would be best to only accept `bytes` objects +1. HTTP is inherently byte-based. Any translation between bytes and unicode characters should be done at a higher level, by whatever web framework is living above WSGI.