RE: RFC-2047 Header Character Set Encoding JK + Tomcat 5

2005-07-14 Thread Guernsey, Byron (GE Consumer Industrial)

For others who might be interested (and the Tomcat developers should
correct me if I'm wrong, since this goes into the archive): Tomcat 5.5.9
does not appear to support RFC-2047 for processing MIME headers
that use character encodings other than ISO-8859-1.

Searching through thousands of lines of Tomcat code, as best I could tell,
the code always assumes headers are ISO-8859-1, from the
MimeHeaders class down to the ByteChunk class.  While both appear to
have the ability to specify an encoding, they correctly assume the default
to be ISO-8859-1 and, from what I could tell, the code that parses headers
from the request does nothing to change this.

I could find no provisions for processing RFC-2047 compliant headers in
any of the connectors.  RFC-2047 is listed here:
http://www.faqs.org/rfcs/rfc2047.html and is referenced from the HTTP 1.1
RFC listed here: http://www.faqs.org/rfcs/rfc2616.html (see section 2.2
on basic rules for TEXT, and the definition of headers in section 4.2)
and from the JSR-154 Servlet 2.4 spec.  Is Tomcat still considered
a reference implementation?
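
Since the connectors do not appear to decode RFC-2047 encoded-words, an
application could decode them itself after reading the raw header value
(for example from HttpServletRequest.getHeader). The following is only a
minimal sketch under stated assumptions: the class name Rfc2047Decoder and
its methods are hypothetical, it relies on java.util.Base64 (Java 8 or
later), and it handles only the basic "=?charset?B|Q?text?=" form, not
RFC 2231 language suffixes. It is not code from Tomcat or mod_jk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes RFC 2047 encoded-words found in a header value. */
public final class Rfc2047Decoder {

    // encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]*)\\?=");

    private Rfc2047Decoder() {}

    public static String decode(String headerValue) {
        Matcher m = ENCODED_WORD.matcher(headerValue);
        StringBuilder out = new StringBuilder();
        int last = 0;
        boolean previousWasEncoded = false;
        while (m.find()) {
            String between = headerValue.substring(last, m.start());
            String decoded = decodeWord(m.group(1), m.group(2), m.group(3));
            // Whitespace that only separates two encoded-words is dropped.
            boolean adjacent = previousWasEncoded && decoded != null
                    && between.trim().isEmpty();
            if (!adjacent) {
                out.append(between);
            }
            // Words that cannot be decoded are passed through unchanged.
            out.append(decoded != null ? decoded : m.group());
            previousWasEncoded = decoded != null;
            last = m.end();
        }
        out.append(headerValue.substring(last));
        return out.toString();
    }

    /** Returns the decoded text, or null if the charset or payload is invalid. */
    private static String decodeWord(String charsetName, String encoding, String text) {
        try {
            Charset charset = Charset.forName(charsetName);
            byte[] bytes = encoding.equalsIgnoreCase("B")
                    ? Base64.getDecoder().decode(text)
                    : decodeQ(text);
            return new String(bytes, charset);
        } catch (IllegalArgumentException e) {
            // Covers unknown charsets, bad Base64, and bad hex escapes.
            return null;
        }
    }

    /** "Q" encoding: '_' means space, "=XX" is a hex-escaped byte. */
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                buf.write(0x20);
            } else if (c == '=') {
                if (i + 2 >= text.length()) {
                    throw new IllegalArgumentException("truncated hex escape");
                }
                int hi = Character.digit(text.charAt(i + 1), 16);
                int lo = Character.digit(text.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad hex escape");
                }
                buf.write((hi << 4) | lo);
                i += 2;
            } else {
                buf.write(c);
            }
        }
        return buf.toByteArray();
    }
}
```

With this sketch, a value such as "=?ISO-8859-1?Q?Andr=E9_Martin?=" would
be turned into the text "André Martin", while a value containing no
encoded-words is returned as-is.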

I hope this helps all who run into similar issues and can find no
information on them.  Now on to the Apache 2 source code to see if it
specifies the format required in the Header module API.

Byron

Keywords: International Headers UTF-8 ISO-8859-1 RFC-2047 

-----Original Message-----
From: Guernsey, Byron (GE Consumer  Industrial) 
Sent: Tuesday, July 12, 2005 4:16 PM
To: Tomcat Users List
Subject: RFC-2047 Header Character Set Encoding JK + Tomcat 5


Is there a FAQ on how Tomcat 5 and JK1 implement HTTP header character
sets (i.e., do they support RFC-2047)?

We use some single sign-on plugins at the web server (Apache 2) that
set specific headers which may contain international characters.  The
headers are being returned by Tomcat to JSPs/servlets in such a way that
the strings display properly only if the browser is forced to view them
as UTF-8.

This implies that the values are actually UTF-8 encoded, but improperly
assumed to be ISO-8859-1 at some point.
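
To illustrate the symptom: if UTF-8 bytes are decoded as ISO-8859-1, each
multi-byte character turns into two or more Latin-1 characters, and because
ISO-8859-1 maps every byte value to exactly one character, the original
bytes can be recovered and re-read as UTF-8. The sketch below is only an
illustration under that assumption; the class and method names are
hypothetical and are not part of Tomcat or the servlet API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating UTF-8 bytes misread as ISO-8859-1. */
public final class HeaderCharsetRepair {

    private HeaderCharsetRepair() {}

    /** Simulates the fault: UTF-8 bytes decoded as if they were ISO-8859-1. */
    public static String misdecodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Recovers the raw bytes of a Latin-1 decoded value and reads them as UTF-8. */
    public static String reinterpretAsUtf8(String headerValue) {
        byte[] rawBytes = headerValue.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }
}
```

This only works if the value really was UTF-8 on the wire and every
character in the string is in the Latin-1 range; a value that was genuine
ISO-8859-1 to begin with would be damaged by the same call.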

I have not yet tracked down which component in the chain is at fault. It
may very well be that the SSO plugin is calling the Apache API to set
headers with UTF-8 values when the API accepts only ISO-8859-1 values, or
values encoded per RFC-2047.

I'd like to find out what mod_jk expects the header values to be when it
retrieves them from Apache, and whether Tomcat supports RFC-2047
decoding of header values.

If anyone has any experience with this, or can refer me to a discussion
or thread about this very item, I'd greatly appreciate the tip.  I'm not
looking forward to the amount of inspection I'm going to have to do to
find the culprit.

thanks,
Byron


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]







Re: RFC-2047 Header Character Set Encoding JK + Tomcat 5

2005-07-13 Thread Tim Funk

You may need to add this attribute to your Connector declaration:
URIEncoding="UTF-8"

-Tim




RE: RFC-2047 Header Character Set Encoding JK + Tomcat 5

2005-07-13 Thread Guernsey, Byron (GE Consumer Industrial)

Does URIEncoding affect all HTTP headers or only the URIs?
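
As far as the Tomcat connector documentation describes it, URIEncoding
sets the character encoding used to decode the bytes of the request URI
after %xx decoding; it is not documented as applying to header values.
The sketch below only illustrates what that charset choice changes for a
percent-encoded URI component. It uses java.net.URLDecoder as a stand-in
(an assumption: this is not Tomcat's own code path, and URLDecoder also
turns '+' into a space), and the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical demo: the same %xx bytes decoded under two charsets. */
public final class UriDecodingDemo {

    private UriDecodingDemo() {}

    /** Percent-decodes a URI component and interprets the bytes in the given charset. */
    public static String decodeComponent(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }
}
```

Decoding "caf%C3%A9" with "UTF-8" gives "café", while decoding it with
"ISO-8859-1" gives "cafÃ©", which is the same kind of corruption
described for the header values.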

Thanks,
Byron


-Original Message-
From: Tim Funk [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, July 13, 2005 6:31 AM
To: Tomcat Users List
Subject: Re: RFC-2047 Header Character Set Encoding JK + Tomcat 5

You may need to add this attribute to your Connector declaration:
URIEncoding="UTF-8"

-Tim
