Hi,

Thanks for taking the trouble to explain. I'll look into this some more. What I
already tried that didn't really work:
- make sure there is an <?xml ..> declaration at the start of the file
- make sure there is a <meta content-type UTF-8> as the first item in
  <html><head>

In the latter case: IE simply adds its own <meta content-type iso-8859-1>
above mine. :-(
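
To make the symptom concrete: character #160 is the no-break space (U+00A0),
which takes two bytes in UTF-8. A browser that reads those bytes as ISO-8859-1
would typically show a stray "Â" in front of each one. A minimal sketch of
that misreading, in Python (chosen only for illustration; nothing in this
thread uses it):

```python
# U+00A0 is the no-break space, i.e. &#160; in HTML.
nbsp = "\u00a0"

# UTF-8 needs two bytes for it: 0xC2 0xA0.
utf8_bytes = nbsp.encode("utf-8")

# A consumer that wrongly assumes ISO-8859-1 decodes each byte on its own,
# so the single character turns into two: "Â" followed by a no-break space.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)     # b'\xc2\xa0'
print(repr(misread))  # 'Â\xa0'
```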

Thanks and bye, Helma


> -----Original Message-----
> From: J.Pietschmann [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, 08 January 2004 21:07
> To: [EMAIL PROTECTED]
> Subject: Re: Sudden difference in interpretation of #160 - update
> 
> 
> [EMAIL PROTECTED] wrote:
> > Internet Explorer is on "encoding: auto-select" which
> > defaults to ISO 8859-1 with or without <metatag content-type UTF-8> in the
> > page header. Same goes for Opera.
> 
> There are multiple ways for a browser to get hints for the encoding:
> 1. The HTTP content-type header, which optionally specifies a character
>   set. Unfortunately, the default character set is iso-8859-1. See, for
>   example
>    http://www.jorendorff.com/articles/unicode/unicode-http.html
> 2. An XML declaration, in case the content type is application/xhtml+xml
>   (or XML or another XML based content). Unfortunately, if there is an
>   XML declaration without an encoding, the default is either UTF-8 or
>   UTF-8/UTF-16 autodetection, at the discretion of the consumer.
> 3. A META-element in the (X)HTML head defining an encoding. There is no
>   default in case there is no such element.
> IEx adds its own guess, and all browsers add a manual override by the user.
> 
> If you send UTF-8 encoded content, and both IEx and Opera detect it as
> ISO-8859-1, most likely there is no character set defined in the HTTP
> content type header, no XML decl (applies only to XHTML) and no META
> element declaring an encoding. The HTTP header should be set by the
> serializer, but there may be a servlet container override in place.
> 
> I'd sniff the HTTP headers in order to check whether the content type
> header properly declares a UTF-8 encoding. Absent this, I'd either
> - track down why this isn't set and fix it, or
> - add a META element declaring a UTF-8 encoding, which probably works
>   only if the content-type doesn't define a character set.
> 
> J.Pietschmann
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 

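The header check suggested in the quoted message can be sketched roughly as
follows. This is only an illustration in Python using the standard library.
The function names are made up for this example, and the URL in the usage
comment is a placeholder, not an address from this thread.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() lower-cases the name and returns None if absent.
    return msg.get_content_charset()


def sniff(url):
    """Fetch url and report the declared content type and charset.

    Needs network access to the server being checked.
    """
    with urlopen(url) as resp:
        content_type = resp.headers.get("Content-Type", "")
    return content_type, charset_from_content_type(content_type)


# Usage (placeholder URL, substitute the real page):
#   print(sniff("http://localhost:8080/page.html"))
```

If the second value comes back as None, the server sends no character set in
the content-type header and the browser falls back to its own default or
guess.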
