Philippe Verdy scripsit:

> Not bogous: the HTTP header is less important than an explicit
> declaration in the XML document.

You've misread me or RFC 3023 or both.  The charset parameter in the
MIME header *overrides* the encoding declaration in the XML content.
If the header says "ISO 8859-1", then the character encoding of the
contents is ISO 8859-1, no matter what the encoding declaration says
or doesn't say.

What is even worse is that if the media type is text/xml (as opposed
to application/xml), and the charset parameter is not specified, the
character encoding of the contents is US-ASCII, again no matter what
the encoding declaration says or doesn't say.

> The default UTF-8/UTF-16 only applies to the case where there is
> *neither* a XML declaration, *nor* an external meta-data declaration
> such as HTTP headers.

Correct.

> However the BOM may be omitted from the "UTF-16" encoding scheme,
> and in that case it MUST be decoded only as UTF-16BE.

Actually, RFC 2781 says "SHOULD" in that case, not "MUST".  I agree
that this should (or even must) be strengthened in future.

-- 
John Cowan  [EMAIL PROTECTED]  www.ccil.org/~cowan  www.reutershealth.com
I must confess that I have very little notion of what [s. 4 of the British
Trade Marks Act, 1938] is intended to convey, and particularly the sentence
of 253 words, as I make them, which constitutes sub-section 1.  I doubt if
the entire statute book could be successfully searched for a sentence of
equal length which is of more fuliginous obscurity.  --MacKinnon LJ, 1940
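
For illustration, the precedence described in the reply above can be sketched in a few lines of Python.  This is only a rough sketch: `effective_encoding` is a hypothetical helper, not part of any library, and the handling of application/xml with no charset parameter (fall back to a BOM, then the encoding declaration, then UTF-8) is an assumption about how the document's own rules would apply.  BOM-less UTF-16 is not detected at all.

```python
import re

# Hypothetical helper for illustration only; not taken from any library.
# Matches the encoding name in an XML declaration at the start of an
# ASCII-compatible document.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def effective_encoding(media_type, charset, body):
    """Return the encoding a receiver should assume for an XML entity.

    media_type: e.g. "text/xml" or "application/xml"
    charset:    value of the MIME charset parameter, or None if absent
    body:       the raw bytes of the entity
    """
    if charset:
        # The MIME charset parameter overrides the encoding declaration.
        return charset.lower()
    if media_type.lower() == "text/xml":
        # text/xml with no charset parameter: US-ASCII, no matter what
        # the encoding declaration says or doesn't say.
        return "us-ascii"
    # Assumption: with no external metadata, look at the document itself.
    if body.startswith((b"\xfe\xff", b"\xff\xfe")):
        return "utf-16"
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # Neither a declaration nor external metadata: default to UTF-8.
    return "utf-8"


doc = b'<?xml version="1.0" encoding="UTF-8"?><a/>'
print(effective_encoding("application/xml", "ISO-8859-1", doc))
print(effective_encoding("text/xml", None, doc))
print(effective_encoding("application/xml", None, doc))
```

The first two calls show the header winning over the declaration, including the text/xml default that ignores the declaration entirely; only the third call consults the document.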

