See inline.

----- Original Message -----
From: "Tony LaPaso" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, November 10, 2003 8:15 PM
Subject: TC 5.0.14 Breaks UTF-8 Content via HTTP Header

> Hi everyone,
>
> It seems a change to TC v5.0.14 may have broken the serving of UTF-8
> documents. Specifically, one of the HTTP headers seems wrong. I'd like to
> describe what I'm seeing TC v5.0.14 compared with v5.0.12.
>
> For both v5.0.12 and v5.0.14 I'm running TC as it comes "out of the box"
> i.e., with no changes to the default configurations.
>
> In both cases I tested with four browsers (IE 5, IE 6, Netscape 7.1 and
> Firebird 0.7), all on Win 2K.
>
>
> Here's What I Did
> -----------------
> In both versions of TC, I added an "em dash" character to the
> "/tomcat-docs/cgi-howto.html" documents that come with the TC documentation.
> The UTF-8 representation for the "em dash" character is the three bytes
> 0xE28094. I also made sure both documents had the following META tag in its
> <head>:
>
> <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>
>
> I then saved the documents as UTF-8 (without a BOM). Finally, I brought the
> document into a hex editor to check that the em dash was properly encoded as
> three bytes (which it was). This indicated to me that the document was
> indeed encoded as UTF-8.
>
>
> Here's What I Saw (TC v5.0.12)
> ------------------------------
> Under TC v5.0.12, everything looked great using all browsers -- the "em
> dash" was rendered correctly. I put a sniffer on the HTTP stream. The
> v5.0.12 Coyote Connector was sending this HTTP response header:
> Content-Type: text/html
>
>
> Here's What I Saw (TC v5.0.14)
> ------------------------------
> Under TC v5.0.14 the "em dash" character was rendered as *THREE SEPARATE
> CHARACTERs* (one for each byte). Moreover, putting a sniffer on the HTTP
> stream indicated the following response header was being sent by the v5.0.14
> Coyote Connector:
> Content-Type: text/html;charset=ISO-8859-1
>
>
> Aside
> -----
> For the heck of it I re-saved the v5.0.14 UTF-8 document with a BOM
> (0xEFBBBF). Doing this made IE correctly render it but Netscape and Firebird
> still had problems. I'm pretty sure that Unicode says the BOM is optional
> anyway.
>
>
> Conclusion (?)
> --------------
> It seems that v5.0.14 of the Coyote Connector is incorrectly sending the
> wrong response header. I'm not sure what the HTTP spec says *should* be sent
> for the header if the document's <head> contains:

The spec says nothing about META tags. Tomcat (correctly) treats them as
just so much output text.

>
> <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>
>
> My guess is that either the response header in v5.0.14 needs to be changed
> to:
> Content-Type: text/html;charset=UTF-8
>
> or possibly:
>
> Content-Type: text/html
>
> as it was with TC v5.0.12.
>
> Can anyone comment? Is this a TC v5.0.14 bug? It would seem to be.

It looks like a 5.0.12 bug that was subsequently fixed :). The Servlet 2.4
spec clearly states:

<spec-quote version="Servlet-2.4-pfd3" section="14.2.22">
If no character encoding has been specified, ISO-8859-1 is returned.
</spec-quote>

>
> Thanks,
>
> Tony
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
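
For what it's worth, the "three separate characters" symptom is what you would expect once the response is labelled ISO-8859-1. The following is only a minimal sketch to illustrate that, not anything taken from Tomcat itself: the class name EmDashDecoding and the helper decode() are made up for this example, and it simply decodes the three bytes from the original post under both charsets using the standard java.nio.charset classes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class (not part of Tomcat): shows how the same three
// bytes are interpreted under the two charsets discussed in this thread.
public class EmDashDecoding {

    // UTF-8 encoding of the em dash (U+2014), as quoted in the original post.
    static final byte[] EM_DASH_UTF8 = {(byte) 0xE2, (byte) 0x80, (byte) 0x94};

    // Decode raw bytes the way a client would after settling on a charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String asUtf8 = decode(EM_DASH_UTF8, StandardCharsets.UTF_8);
        String asLatin1 = decode(EM_DASH_UTF8, StandardCharsets.ISO_8859_1);

        // UTF-8 combines the three bytes into one character; ISO-8859-1 maps
        // each byte to its own character.
        System.out.println("UTF-8:      " + asUtf8.length() + " char(s)");
        System.out.println("ISO-8859-1: " + asLatin1.length() + " char(s)");
    }
}
```

For dynamically generated pages, a servlet can declare the encoding itself by calling response.setContentType("text/html; charset=UTF-8") before writing any output. How best to do the equivalent for static files depends on your configuration, so I won't guess at that here.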