Hi,

Ian Bicking wrote:
> I propose we switch primarily to "native" strings: str on both Python 2 and
> 3.
I'm starting to think that this is the best idea.
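To make sure we mean the same thing by "native", here is a rough sketch of how I read it. The names NATIVE_IS_BYTES and to_native are made up for illustration and are not part of any proposal; the latin1 default is an assumption based on the transcoding idea discussed in this thread.

```python
# Feature test instead of a version check: on Python 2 the native str
# type is the byte string type, on Python 3 it is the text type.
NATIVE_IS_BYTES = str is bytes  # hypothetical name, for illustration only


def to_native(value, encoding='latin1'):
    """Return value as the native str type of the running interpreter.

    Hypothetical helper; the latin1 default is an assumption.
    """
    if isinstance(value, str):
        return value
    if NATIVE_IS_BYTES:
        # Python 2: value is unicode here, so encode it to a byte str.
        return value.encode(encoding)
    # Python 3: value is bytes here, so decode it to a text str.
    return value.decode(encoding)


# On Python 3 an application can undo the latin1 decoding to recover the
# original bytes and then decode them as UTF-8 itself.
raw_path = b'/caf\xc3\xa9'                     # UTF-8 bytes from the wire
path_info = to_native(raw_path)                # native str on either version
if not NATIVE_IS_BYTES:
    real_path = path_info.encode('latin1').decode('utf8')
```

Because latin1 maps every byte to exactly one code point, the round trip on Python 3 is lossless, which is what makes this kind of transcoding possible at all.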
> I then propose that we eliminate SCRIPT_NAME and PATH_INFO. Instead we
> have:
IMO they should stick around for compatibility with older applications and
be latin1 encoded on Python 3. But their use should be discouraged.

> Again, it would be better to do;
>
> parse_cookie(urllib.unquote(environ['HTTP_COOKIE']).decode('utf8'))
That will only work in Python 2; in Python 3 unquote (which lives in
urllib.parse there) already yields unicode strings and assumes a UTF-8
quoted string.

> Other variables like environ['wsgi.url_scheme'], environ['CONTENT_TYPE'],
> etc, will be native strings. A Python 3 hello world app will then look like:
>
> def hello_world(environ):
>     return ('200 OK', [('Content-type', 'text/html; charset=utf8')],
>             ['Hello World!'.encode('utf8')])
>
> start_response and changes to wsgi.input are incidental to what I'm
> proposing here (except that wsgi.input will be bytes); we can decide about
> them separately.
If we go about dropping start_response, can we move the app iter to the
beginning? That would be consistent with the signature of common response
objects, making it possible to do this:

    response = Response(*hello_world(environ))

In general I think doing too many changes at once is harmful, so I'm happy
to stick with start_response for another iteration of WSGI.

> Well, the biggie: is it right to use native strings for the environ values,
> and response status/headers? Specifically, tricks like the latin1
> transcoding won't work in Python 2, but will in Python 3. Is this weird?
> Or just something you have to think about when using the two Python
> versions?
The WSGI PEP should standardize a way for the application to figure out the
environment it runs in. And I think that should *not* be checking
sys.version_info but rather comparing string features.

> What happens if you give unicode text in the response headers that cannot be
> encoded as Latin1?
Undefined behavior; the example server should raise an assertion error.

> Should some things specifically be ASCII? E.g., status.
No, HTTP specifies the status as TEXT, and TEXT is specified as any 8-bit
sequence of data except US-ASCII control characters, but including CR, LF,
space and tabs.

> Should some things be unicode on Python 2?
I don't think so.

> Is there a common case here that would be inefficient?
Don't think so.

Regards,
Armin
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
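
P.S. On the Latin1 question above, this is roughly the check I would expect an example server to make on Python 3 before writing the status line and headers. It is only a sketch: check_latin1 is a made-up name, and it deliberately does not look at control characters.

```python
def check_latin1(value):
    """Assert that a status or header value can be encoded as Latin-1.

    Hypothetical helper for an example server on Python 3; it returns the
    value unchanged so it can be used inline.
    """
    try:
        value.encode('latin1')
    except UnicodeEncodeError:
        # Anything outside Latin-1 is undefined behavior, so fail loudly.
        raise AssertionError('not encodable as Latin-1: %r' % (value,))
    return value


status = check_latin1('200 OK')
headers = [(check_latin1(name), check_latin1(val))
           for name, val in [('Content-type', 'text/html; charset=utf8')]]
```

A real server would probably report the error differently; the point is only that a value such as '\u20ac' should be rejected up front instead of being silently mangled.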