2009/8/12 Ian Bicking <i...@colorstudy.com>:
> On Tue, Aug 11, 2009 at 6:25 PM, Graham Dumpleton
> <graham.dumple...@gmail.com> wrote:
>>
>> 2009/8/12 Henry Precheur <he...@precheur.org>:
>> > Using bytes for all `environ` values is easy to understand on the
>> > application side as long as you are aware of the encoding problem. The
>> > cost is inconvenience, but that's probably OK. It's also simpler to
>> > implement on the gateway/server side.
>>
>> Use of bytes everywhere can be inconvenient on the gateway/server
>> side, at least as far as end result for user.
>>
>> The specific problem is that WSGI environment is used to hold
>> information about the original request, as CGI variables, but also can
>> hold user specified custom variables.
>>
>> In the case of anything hosted via Apache, such as through mod_wsgi,
>> mod_fastcgi, mod_fcgid, mod_scgi and mod_cgi(d), users can set such
>> custom variables using the SetEnv directive. Thus one might say:
>>
>> SetEnv trac.env_path /usr/local/trac/site-1
>
> Just to clarify, there specifically is no type restrictions on extension
> variables, which is any variable with a "." in it. The type restrictions
> are solely for ALL_CAPS keys. You can put ints or unicode or whatever in
> other variables. (Probably this doesn't make things any easier for
> mod_wsgi, though; at least for this example)

If you want to change what the specification says from:

"""Finally, the environ dictionary may also contain server-defined variables. These variables should be named using only lower-case letters, numbers, dots, and underscores, and should be prefixed with a name that is unique to the defining server or gateway."""

to:

"""Finally, the environ dictionary may also contain server-defined variables. These variables MUST be named using only lower-case letters, numbers, dots, and underscores, and should be prefixed with a name that is unique to the defining server or gateway."""

then that goes part of the way, as at least one is drawing a line between what is construed as a CGI variable, and so would be bytes, and adapter/application variables, which would be converted to strings in whatever encoding makes sense for the server configuration system, which in the case of Apache would be UTF-8.

The above description would also have to be changed, though, in as much as at the moment it says:

"""should be prefixed with a name that is unique to the defining server or gateway"""

This isn't really correct in practice, as the server configuration is just providing the mechanism for setting them, and they may not necessarily be server or gateway variables, but variables a user is setting to customise the behaviour of the application. The way I read that line, strictly speaking, even though set as:

  SetEnv trac.env_path /usr/local/trac/site-1

it should be passed through as:

  mod_wsgi.trac.env_path

which would be rather silly. Thus the description needs to cater for the fact that application variables may be settable from the server configuration and passed through as is.

Anyway, if the rule is that anything in upper case is treated as CGI and passed as bytes, and anything in lower case isn't and is passed as a string, appropriately decoded, then that would eliminate one point of confusion as far as expectations go.
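
As a rough sketch of that rule, assuming the server hands over each variable as a name plus raw bytes (build_environ and its arguments are made-up names for illustration, not anything in mod_wsgi or the specification, and UTF-8 is just the Apache case mentioned above):

```python
def build_environ(raw_items, config_encoding="utf-8"):
    """Apply the proposed split to raw server variables.

    raw_items is an iterable of (name, value) pairs where name is a str
    and value is bytes. Both the function and this input shape are
    hypothetical; a real gateway would get its data some other way.
    """
    environ = {}
    for name, value in raw_items:
        if name.isupper():
            # Upper-case key such as REQUEST_METHOD or HTTP_HOST:
            # treat as a CGI variable and leave the value as bytes.
            environ[name] = value
        else:
            # Anything else (e.g. trac.env_path set via SetEnv) is
            # treated as an adapter/application variable and decoded
            # using the server configuration's encoding. Treating
            # mixed-case names this way is an assumption of the sketch.
            environ[name] = value.decode(config_encoding)
    return environ


environ = build_environ([
    ("REQUEST_METHOD", b"GET"),
    ("trac.env_path", b"/usr/local/trac/site-1"),
])
```

Here REQUEST_METHOD stays as bytes, while trac.env_path is passed through under its own name as a decoded string, with no server prefix added.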

It may not make it any easier for CGI under Python 3.0 though, where values would all be strings anyway.

Now, is anyone willing to address the problem pointed out by others, where being able to return either bytes or strings (latin-1) for response headers is a pain for WSGI middleware to deal with?

Graham
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig