On 14 July 2010 14:43, Ian Bicking <i...@colorstudy.com> wrote:
> So... there's been some discussion of WSGI on Python 3 lately. I'm not
> feeling as pessimistic as some people, I feel like we were close but just
> didn't *quite* get there.

What I took from the discussion wasn't that one couldn't specify a WSGI
interface; as you say, we more or less have one now. The issue is more
about how practical that interface is, from a usability perspective, for
those who have to code stuff on top of it.

The concern seems to be that the specification may be easy to work with
for those who immediately wrap it, at the lowest layer, in a higher level
abstraction that normalises everything into a form that is then used
consistently. For those who use lower level raw WSGI right through the
stack, especially in the context of stackable WSGI middleware, the
repetitive task of having to deal with the bytes/unicode issues at every
point is just a big PITA.

That said, my job in writing the WSGI adapter is really easy, as I don't
have to worry about these issues. This is why I don't seem to really
appreciate the concerns people are expressing.

The above is how I read things though.

> Here's my thoughts:
>
> * Everyone agrees keys in the environ should be native strings
> * Bodies should stay bytes
> * Can we make all "standard" values that are str on Python 2, str on Python
> 3 with a Latin1 encoding? This is basically what wsgiref did. This means
> HTTP_*, SERVER_NAME, etc. Everything CGIish, and everything with an
> all-caps key. There's only a couple tricky keys: SCRIPT_NAME, PATH_INFO,
> and HTTP_COOKIE.
> * I propose we let libraries handle HTTP_COOKIE however they want; don't
> bother transcoding *into* the environ, just do so when you parse the cookie
> (if you so choose). Happy developers will just urlencode all their cookie
> values to keep their cookies ASCII-clean. Unhappy developers who have to
> handle legacy cookies will just run environ['HTTP_COOKIE'].decode('latin1')
> and then do whatever sad magic they are forced to do.
> * I (re)propose we eliminate SCRIPT_NAME and PATH_INFO and replace them
> exclusively with encoded versions (that represent the original request
> URI). We use Latin1 encoding, but it should be ASCII anyway, like most of
> the headers.
> * I'm terrible at naming, but let's say these new values are RAW_SCRIPT_NAME
> and RAW_PATH_INFO.

My prior suggestion on that, since upper case keys for now effectively
derive from CGI, was to make them wsgi.script_name and wsgi.path_info,
i.e., push them into the wsgi namespace.

> Does this solve everything? There's broken stuff in the stdlib, but we
> shouldn't bother ourselves with that -- if we need working code we should
> just write it and ignore the stdlib or submit our stuff as patches to the
> stdlib.

The quick summary of what I suggested before is at:

http://code.google.com/p/modwsgi/wiki/SupportForPython3X

I believe the only difference I see is the raw SCRIPT_NAME and PATH_INFO,
which got discussed to death previously with no consensus.

> Some environments will have a hard time constructing RAW_SCRIPT_NAME and
> RAW_PATH_INFO, but in my opinion they can just encode SCRIPT_NAME and
> PATH_INFO and be done with it; it's not as accurate, but it's no less
> accurate than what we have now.
>
> Actual transcoding in the environ is not supported or encouraged in this
> scheme. If you want to adjust an encoding you should do it in your
> application/library code.
>
> There's some other topics, like chunked responses, unknown request body
> lengths, start_response, and maybe some other things, but these aren't
> Python 3 issues, they are just... generic issues. app_iter.close() might be
> worth thinking about given new iterator semantics introduced since WSGI was
> written.

Graham
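
For illustration only, here is a minimal sketch of what the scheme
discussed above would ask of application or library code on Python 3. It
assumes environ values arrive as native str decoded as Latin1, so the
original bytes are recovered with .encode('latin-1') (a Python 3 str has
no .decode()). The helper names recode() and raw_path() are hypothetical,
and the RAW_* keys are only the proposal from this thread, not part of any
published WSGI specification.

```python
from urllib.parse import quote


def recode(value, encoding="utf-8"):
    """Undo the Latin1 decoding and re-decode with the intended encoding.

    Latin1 maps every byte to exactly one code point, so the encode step
    recovers the original bytes without loss.
    """
    return value.encode("latin-1").decode(encoding)


def raw_path(environ, key):
    """Return the proposed RAW_<key> value, or approximate it.

    Hypothetical fallback: if the server did not supply RAW_<key>,
    percent-encode the already-decoded <key> value instead. This is less
    accurate than the original request URI, as noted in the thread.
    """
    raw = environ.get("RAW_" + key)
    if raw is not None:
        return raw
    return quote(environ.get(key, "").encode("latin-1"))


# A request for /app/café with UTF-8 bytes, as a Latin1-decoded environ.
environ = {
    "SCRIPT_NAME": "/app",
    "PATH_INFO": "/caf\xc3\xa9",
    "HTTP_COOKIE": "name=J\xc3\xbcrgen",
}

path = recode(environ["PATH_INFO"])        # application picks UTF-8
cookie = recode(environ["HTTP_COOKIE"])    # transcoded at parse time only
raw = raw_path(environ, "PATH_INFO")       # fallback percent-encoding
```

Nothing here writes back into the environ; the transcoding stays in
application code, which is the division of labour the proposal argues for.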