Henry Precheur wrote:
> On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
>> There is something that I don't understand.
>>
>> Some HTTP headers, like Accept-Language, contain data described as
>> `token`, where:
>>
>>     token = 1*<any CHAR except CTLs or separators>
>>
>> So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded.
>> In Python 3.x it SHOULD be a byte string.
>
> I think this is more an issue that frameworks should deal with. By
> decoding every header value to latin-1:
>
> * It keeps WSGI simple. Simple is good.
>

It is just as simple as using byte strings, IMHO.
It is not simple; it is convenient, because of (if I understand
correctly) how code is converted by 2to3.

> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
>   says. WSGI is about HTTP, but that doesn't necessarily include all
>   other standards extending HTTP.
>

HTTP never says to consider whole headers as latin-1 text, IMHO.

> * It's possible to convert latin-1 strings to bytes without losing data.
>

Yes, but it is quite stupid to first convert to Unicode and then convert
again to a byte string.

It is true, however, that this does not happen often, but only for:

- WSGI applications that implement an HTTP proxy
- WSGI applications that need to support HTTP Digest Authentication
- WSGI applications that store encoded data in cookies


Regards  Manlio
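
P.S. For reference, a minimal sketch of the latin-1 round trip discussed above, in Python 3. It assumes a server that decodes header values as latin-1, as proposed in this thread; the header value itself is made up for illustration and does not come from any real request.

```python
# Hypothetical raw header value as it arrived on the wire (UTF-8 bytes).
raw = "caf\u00e9".encode("utf-8")        # b'caf\xc3\xa9'

# What a latin-1-decoding WSGI server would hand to the application:
# one code point per byte, so the text looks garbled but loses nothing.
as_text = raw.decode("latin-1")

# The application gets the original bytes back with the inverse step,
# then decodes them with whatever encoding it actually expects.
recovered = as_text.encode("latin-1")
value = recovered.decode("utf-8")

# Every possible byte value survives the round trip.
all_bytes = bytes(range(256))
round_tripped = all_bytes.decode("latin-1").encode("latin-1")
```

The extra encode step is the double conversion I am complaining about: it is lossless, but an application that wants the opaque bytes (proxy, Digest Authentication, encoded cookies) has to undo the decoding the server just performed.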