2009/4/2 Robert Brewer <fuman...@aminus.org>:
> Graham Dumpleton wrote:
>> 2009/4/2 Robert Brewer <fuman...@aminus.org>:
>> > Alan Kennedy wrote:
>> >> Hi Graham,
>> >>
>> >> I think yours is a good solution to the problem.
>> >>
>> >> [Graham]
>> >> > In other words, leave all the existing CGI variables to come through
>> >> > as latin-1 decode
>> >>
>> >> As latin-1 or rfc-2047 decoded, to unicode.
>> >>
>> >> > and do anything new in 'wsgi' variable namespace,
>> >>
>> >> So the server provides
>> >>
>> >> "wsgi.server_decoded_SCRIPT_NAME" == u"whatever"
>> >> "wsgi.server_decoded_PATH_INFO" == u"whatever"
>> >> "wsgi.server_decode_charset" == u"utf-8"
>> >
>> > I think everyone at the sprint today acquiesced to having
>> > SCRIPT_NAME/PATH_INFO/QUERY_STRING be set in the environ as unicode.
>> > The server can decide (probably subject to configuration). I've
>> > implemented this in the python3 branch of CherryPy and it seems to
>> > work brilliantly. Assuming the server *is* configurable, deployers
>> > should be able to choose Latin-1 if they need to recover the original
>> > bytes, without having to support a separate set of encoded-byte
>> > entries.
>>
>> Seems to me that you can't have it be configurable and it must always
>> be latin-1 interpretation. The problem is where you are composing
>> multiple WSGI applications. If they each have different expectations
>> or requirements as to how it is handled, aren't you going to have a
>> problem. Or am I missing something in the way you are explaining it?
>
> I would not expect multiple middlewares to want to decode the same URI
> differently.

I was not thinking about multiple middlewares, but about multiple
distinct WSGI applications (end consumers, not middleware) composited
together by something like Paste cascade, Pylons configuration, or even
something like a routes-based dispatcher.

In the case of something like cascade, they aren't necessarily on
different URLs. For the latter they would be. Even so, I am just making
sure that having different URLs with different encodings isn't going to
be an issue with respect to the mapping middleware.

So long as code/config files are always UTF-8 encoded and capable of
representing any possible decoding of a URL, then it is probably okay.

Graham
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
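
For illustration only, here is a minimal sketch of the Latin-1 round trip discussed in this thread. It assumes the server always decodes the raw path bytes as Latin-1, so that each composited application can recover the original bytes and apply its own charset. The helper names `recover_bytes` and `redecode` and the sample paths are hypothetical and do not come from any poster's code or from any WSGI server.

```python
def recover_bytes(value):
    """Undo an assumed Latin-1 decode to get back the bytes sent on the wire.

    Latin-1 maps each byte 0-255 to the code point with the same number,
    so encoding a Latin-1-decoded string is lossless.
    """
    return value.encode("latin-1")


def redecode(environ, key, charset="utf-8"):
    """Re-decode one environ entry using the charset this application expects."""
    return recover_bytes(environ.get(key, "")).decode(charset)


# Simulated request 1: the client sent a UTF-8 encoded path.
utf8_wire = "/caf\u00e9".encode("utf-8")
environ_a = {"PATH_INFO": utf8_wire.decode("latin-1")}  # what the server stores
path_a = redecode(environ_a, "PATH_INFO", charset="utf-8")

# Simulated request 2: a different application expects Latin-1 encoded paths.
latin1_wire = "/caf\u00e9".encode("latin-1")
environ_b = {"PATH_INFO": latin1_wire.decode("latin-1")}
path_b = redecode(environ_b, "PATH_INFO", charset="latin-1")
```

Because the server-side decode is the same for both requests, two applications with different charset expectations can sit behind one dispatcher without the server needing to know which charset either of them wants.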