Re[2]: [Quixote-users] Unicode improvements

Alexander J. Kozlovsky Tue, 30 Aug 2005 17:28:55 -0700

Hello again!

>     The "charset" parameter is used with some media types to define
>     the character set (section 3.4) of the data. When no explicit
>     charset parameter is provided by the sender, media subtypes of
>     the "text" type are defined to have a default charset value of
>     "ISO-8859-1" when received via HTTP. Data in character sets
>     other than "ISO-8859-1" or its subsets MUST be labeled with an
>     appropriate charset value. See section 3.4.1 for compatibility
>     problems.


I think this paragraph means the following:

1. If the user agent doesn't receive a charset parameter (in the
   Content-Type header) with the response, then the user agent
   SHOULD treat the response encoding as "ISO-8859-1".

2. If the HTTP server sends a response encoded as "ISO-8859-1", the
   charset parameter MAY be omitted; for any other encoding it is
   REQUIRED.

That's all
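
To make the receiving side of this reading concrete, here is a minimal
sketch in present-day Python 3. The function name effective_charset is
hypothetical and is not part of Quixote; the sketch only assumes that
the charset parameter travels in the Content-Type header, and it uses
the standard library's header parser.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset a receiver would assume for a Content-Type value.

    Hypothetical helper, not Quixote code. Follows the quoted paragraph:
    an explicit charset wins; otherwise "text" subtypes default to
    ISO-8859-1; other media types get no default here.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset parameter,
    # or None when the header carries no charset.
    charset = msg.get_content_charset()
    if charset is not None:
        return charset
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"
    return None
```

For example, effective_charset("text/html") falls back to the default,
while effective_charset("text/html; charset=UTF-8") uses the label the
sender supplied.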

If the response carries a charset parameter in its Content-Type
header, Quixote may use any appropriate encoding. If the response
text is unicode, then UTF-8 is an appropriate encoding. If Quixote
sends "charset=UTF-8" in the Content-Type header, it conforms to the
HTTP standard.
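
The sending side could look roughly like the sketch below. It is
written in present-day Python 3 and is not Quixote's actual code: the
function name build_text_response and the "text/html" default are
assumptions made for illustration. The point is only that the body is
encoded with the same charset that the Content-Type header announces.

```python
def build_text_response(text, charset="utf-8", media_type="text/html"):
    """Encode a unicode body and label it with a matching charset.

    Hypothetical helper, not part of Quixote. Returns (headers, body)
    where body is bytes and headers is a plain dict.
    """
    body = text.encode(charset)
    headers = {
        # The label must name the encoding actually used for the body.
        "Content-Type": "%s; charset=%s" % (media_type, charset),
        # Length is counted in bytes after encoding, not in characters.
        "Content-Length": str(len(body)),
    }
    return headers, body
```

A receiver that honours the charset label can then decode the body
back to the original text, whatever characters it contains.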

That is, Hamish Lawson is correct:

> When charset has not been specified, Quixote is in fact faced
> with two different questions with respect to str and unicode
> objects... For unicode objects Quixote is deciding which encoding
> it *will* choose to use; ISO-8859-1 is a poor choice as it will be
> unable to encode the majority of Unicode character points;
> instead UTF-8 is a much more natural choice for unicode objects.
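
The difference between the two encodings is easy to check. This is a
small sketch in present-day Python 3; can_encode is a hypothetical
helper name, and the sample strings are only illustrations.

```python
def can_encode(text, charset):
    """Return True if every character of text is representable in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# ISO-8859-1 covers only code points U+0000..U+00FF.
latin_ok = can_encode("café", "iso-8859-1")
cyrillic_latin1 = can_encode("Привет", "iso-8859-1")
cyrillic_utf8 = can_encode("Привет", "utf-8")
```

Western European text fits in ISO-8859-1, Cyrillic text does not, and
UTF-8 handles both.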


Best regards,
 Alexander

