Henry Precheur wrote:
> On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
>> There is something that I don't understand.
>>
>> Some HTTP headers, like Accept-Language, contain data described as
>> `token`, where:
>>
>> token          = 1*<any CHAR except CTLs or separators>
>>
>> So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded.
>> In Python 3.x it SHOULD be a byte string.
> 
> I think this is more an issue that frameworks should deal with. By
> decoding every header value to latin-1:
> 
> * It keeps WSGI simple. Simple is good.
> 

Using byte strings is just as simple, IMHO.
Decoding to latin-1 is not simpler; it is convenient because of (if I
understand correctly) how code is converted by 2to3.

> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
>   says. WSGI is about HTTP, but that doesn't necessarily include all
>   other standards extending HTTP.
> 

HTTP never says to consider whole headers as latin-1 text, IMHO.

> * It's possible to convert latin-1 strings to bytes without losing data.
> 

Yes, but it is quite stupid to first convert to Unicode and then convert
back to a byte string.
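
To make the round trip concrete, here is a minimal sketch in Python 3. It
assumes only that the server decodes raw header bytes with the latin-1
codec; the variable names are illustrative and not part of any WSGI API.

```python
# Every possible byte value, 0x00 through 0xFF.
raw = bytes(range(256))

# What a server that decodes headers as latin-1 would hand to the application.
as_text = raw.decode("latin-1")

# What the application must do to get the original bytes back.
round_tripped = as_text.encode("latin-1")

# latin-1 maps each byte to exactly one code point, so nothing is lost.
assert round_tripped == raw
assert len(as_text) == len(raw)
```

The conversion is lossless, but the application still pays for two
conversions whenever it actually wanted the bytes.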

It is true, however, that this does not happen often; only for:

- WSGI applications that implement an HTTP proxy
- WSGI applications that need to support HTTP Digest Authentication
- WSGI applications that store encoded data in cookies
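
As a sketch of the cookie case, assuming a server that has decoded the
Cookie header as latin-1 and a client that sent UTF-8 encoded data: the
helper header_bytes and the environ dict below are hypothetical, made up
for illustration only.

```python
def header_bytes(environ, key):
    """Return the raw bytes of a header value that was decoded as latin-1.

    Hypothetical helper, not part of WSGI.
    """
    return environ[key].encode("latin-1")


# Simulate the server side: raw UTF-8 bytes decoded as latin-1.
raw_cookie = "name=Müller".encode("utf-8")
environ = {"HTTP_COOKIE": raw_cookie.decode("latin-1")}

# The latin-1 text is not the value the client meant to send.
assert environ["HTTP_COOKIE"] != "name=Müller"

# The application has to go back to bytes and decode with the real encoding.
recovered = header_bytes(environ, "HTTP_COOKIE")
cookie_text = recovered.decode("utf-8")
assert cookie_text == "name=Müller"
```

With byte strings the application would decode once, with the right
encoding, instead of encoding back first.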


Regards  Manlio
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
