James Y Knight wrote:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1]
> [...]
>
> [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
Hi.

Just a few questions.

It is true that HTTP headers can be decoded assuming latin-1, and that they can also be decoded using PEP 383. However, what about URIs (that is, PATH_INFO and the like)?

For URIs (if I remember correctly) the suggested encoding is UTF-8, so URLs should be decoded using

    url.decode('utf-8', 'surrogateescape')

Is this correct?

Now another question. Let's consider the `wsgiref.util.application_uri` function:

    def application_uri(environ):
        url = environ['wsgi.url_scheme']+'://'
        from urllib.parse import quote

        if environ.get('HTTP_HOST'):
            url += environ['HTTP_HOST']
        else:
            url += environ['SERVER_NAME']

            if environ['wsgi.url_scheme'] == 'https':
                if environ['SERVER_PORT'] != '443':
                    url += ':' + environ['SERVER_PORT']
            else:
                if environ['SERVER_PORT'] != '80':
                    url += ':' + environ['SERVER_PORT']

        url += quote(environ.get('SCRIPT_NAME') or '/')
        return url

There is a potential problem here with the quote function. This function does the following:

    def quote(string, safe='/', encoding=None, errors=None):
        if isinstance(string, str):
            if encoding is None:
                encoding = 'utf-8'
            if errors is None:
                errors = 'strict'
            string = string.encode(encoding, errors)

This means that if we use surrogateescape, the information about the original bytes is lost here.

This can be easily fixed by changing the application_uri function, but it also means that a WSGI application will not work with Python 3.1.x.

Finally, a question about cookies. Cookie data SHOULD be transparent to the server/gateway; however, WSGI is going to assume that the data is encoded in latin-1. I don't know what the HTTP/Cookie spec says about this.
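
To make the latin-1 assumption concrete, here is a minimal sketch of what the server side would do with a cookie whose value is UTF-8 text. The cookie name and value are made up for illustration, and the decoding step is only my assumption about how a gateway following the proposal would behave; it is not code taken from mod_wsgi.

```python
# Hypothetical raw Cookie header value as sent by the client (UTF-8 text).
original_text = "name=Perill\u00f2"
raw_cookie = original_text.encode("utf-8")

# Assumed gateway behaviour: header bytes are exposed to the application
# as a str decoded with latin-1, which maps each byte to one code point.
environ_value = raw_cookie.decode("latin-1")

# The str seen by the application is not the original text...
assert environ_value != original_text

# ...but no byte is lost: encoding back to latin-1 restores the raw bytes.
assert environ_value.encode("latin-1") == raw_cookie
```

So the value in the environ is not directly usable as text, but the original bytes remain recoverable.
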
However, from a WSGI application's point of view, the cookie data can contain, for example, text encoded in UTF-8; this means that the application must first encode the data:

    cookie_bytes = cookie.encode('latin-1', 'surrogateescape')

and then decode it using UTF-8:

    my_cookie_data = cookie_bytes.decode('utf-8')

This is a bit unreasonable, but I don't know whether storing UTF-8 text in cookies is common practice (I do this myself, to give one example).


Manlio Perillo
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig