And Clover wrote:
> Manlio Perillo wrote:
> 
>> Words of *TEXT MAY contain characters from character sets other than
>> ISO-8859-1 [22] only when encoded according to the rules of RFC 2047
> 
> Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to
> RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 itself
> specifically denies that an encoded-word can go in a quoted-string.
> 
> RFC2047 encoded-words are not on-topic in an HTTP header(*); this has
> been confirmed by newer development work on HTTPbis by Reschke et al.
> (http://tools.ietf.org/wg/httpbis/).
> 

Thanks.
HTTPbis seems to fix all these problems:

"Historically, HTTP has allowed field content with text in the ISO-
8859-1 [ISO-8859-1] character encoding and supported other character
sets only through use of [RFC2047] encoding.  In practice, most HTTP
header field values use only a subset of the US-ASCII character
encoding [USASCII].  Newly defined header fields SHOULD limit their
field values to US-ASCII characters.  Recipients SHOULD treat other
(obs-text) octets in field content as opaque data."


This is the new rule for `quoted-string`:

quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
               ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
obs-text       = %x80-FF

quoted-pair    = "\" ( WSP / VCHAR / obs-text )
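To make the grammar concrete, here is a minimal sketch of a parser for it, working directly on bytes. The function name `parse_quoted_string` is hypothetical, and OWS is simplified to SP / HTAB (no line folding); that simplification is my assumption, not something the draft text above states.

```python
def parse_quoted_string(data: bytes) -> bytes:
    """Return the unescaped content of a quoted-string, as raw octets.

    Sketch only: OWS is assumed to be SP / HTAB, without obs-fold.
    """
    if len(data) < 2 or data[0:1] != b'"' or data[-1:] != b'"':
        raise ValueError("not a quoted-string")
    out = bytearray()
    i, end = 1, len(data) - 1  # scan between the two DQUOTEs
    while i < end:
        c = data[i]
        if c == 0x5C:  # backslash opens a quoted-pair
            i += 1
            if i >= end:
                raise ValueError("dangling backslash")
            nxt = data[i]
            # quoted-pair = "\" ( WSP / VCHAR / obs-text )
            if not (nxt in (0x09, 0x20) or 0x21 <= nxt <= 0x7E or nxt >= 0x80):
                raise ValueError("invalid octet in quoted-pair")
            out.append(nxt)
        elif (c in (0x09, 0x20) or c == 0x21 or 0x23 <= c <= 0x5B
              or 0x5D <= c <= 0x7E or c >= 0x80):
            # qdtext = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
            out.append(c)
        else:
            raise ValueError("invalid octet in qdtext")
        i += 1
    return bytes(out)
```

Note that obs-text octets pass through untouched, so the caller still has to decide how (or whether) to decode them.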


> The "correct" way of escaping header parameters in an RFC*822-family
> protocol would be RFC2231's complex encoding scheme, but HTTP is
> explicitly not an 822-family protocol despite sharing many of the same
> constructs. See
> http://tools.ietf.org/html/draft-reschke-rfc2231-in-http-06 for a
> strategy for how 2231 should interact with HTTP, but note that for now
> RFC2231-in-HTTP simply does not exist in any deployed tools.
> 

It seems reasonable.
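
For illustration, this is roughly what the RFC 2231 extended parameter value (charset'language'percent-encoded-octets) looks like when produced in Python. It is only a sketch: the helper names are hypothetical, and the set of characters left unescaped is my reading of the attr-char rule, so check it against the draft before relying on it.

```python
from urllib.parse import quote, unquote_to_bytes

# Punctuation assumed to be allowed unescaped (attr-char minus ALPHA / DIGIT).
ATTR_CHARS = "!#$&+-.^_`|~"


def encode_ext_value(text, charset="UTF-8", language=""):
    """Build charset'language'value with percent-encoded octets."""
    encoded = quote(text, safe=ATTR_CHARS, encoding=charset)
    return "%s'%s'%s" % (charset, language, encoded)


def decode_ext_value(value):
    """Inverse of encode_ext_value; the language tag is ignored."""
    charset, _language, encoded = value.split("'", 2)
    return unquote_to_bytes(encoded).decode(charset)
```

A parameter would then be sent as, for example, `title*=UTF-8''%E2%82%AC%20rates`, where the trailing `*` on the parameter name marks the extended form.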

> So for now there is basically nothing useful WSGI can do other than
> provide direct, byte-oriented (even if wrapped in 8859-1 unicode
> strings) access to headers.
> 

Yes, that is my view as well.
I have some doubts about wrapping the headers in ISO-8859-1 unicode
strings, but luckily there is the surrogateescape error handler (PEP 383).
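
To show the difference between the two approaches, here is a small sketch. The helper names are hypothetical, and the sample header value is made up; surrogateescape requires Python 3.1 or later.

```python
def wrap_latin1(raw):
    """Wrap header octets 1:1 as code points U+0000..U+00FF."""
    return raw.decode("iso-8859-1")


def unwrap_latin1(text):
    """Recover the original octets from a latin-1 wrapped string."""
    return text.encode("iso-8859-1")


def wrap_surrogate(raw):
    """Decode as ASCII; non-ASCII octets become lone surrogates U+DC80..U+DCFF."""
    return raw.decode("ascii", "surrogateescape")


def unwrap_surrogate(text):
    """Recover the original octets from a surrogateescape-decoded string."""
    return text.encode("ascii", "surrogateescape")


raw = b"caf\xc3\xa9"  # made-up header value: UTF-8 octets for "caf\u00e9"

# Both wrappings round-trip the octets without loss.
assert unwrap_latin1(wrap_latin1(raw)) == raw
assert unwrap_surrogate(wrap_surrogate(raw)) == raw

# The latin-1 string looks like valid text but is mojibake ("caf\u00c3\u00a9"),
# while the surrogateescape string visibly carries undecoded octets.
assert wrap_latin1(raw) == "caf\u00c3\u00a9"
assert wrap_surrogate(raw) == "caf\udcc3\udca9"
```

With the latin-1 wrapping, an application cannot tell real ISO-8859-1 text from wrapped opaque octets; with surrogateescape, the lone surrogates make the undecoded octets explicit.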



Regards  Manlio