RE: CoyoteRequest assumption that breaks UTF-8 support
Yes, in MY application I will do that. Namely, I will decide on one encoding to use and pull the arguments out in that encoding using getBytes(encType). That is the correct way of doing it.

What I am talking about is Tomcat's built-in container-managed security: you specify a security constraint with form-based login, have a form that submits to j_security_check, and let Tomcat handle the details using one of its many Realms. In that case the application code never sees the parameters, so it has no chance to re-decode them.

This IS a bug, or at least a severe limitation in functionality. As far as I can tell, there is no way to handle UTF-8 usernames and passwords with container-managed security, unless I'm missing some way to tell the browser to send the charset along in the content-type to j_security_check. I have thoroughly investigated that, and I think that I'm indeed not missing anything.

Jeff Tulley ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com

>>> [EMAIL PROTECTED] 9/25/03 12:09:02 PM >>>
This is not a bug. Read http://asg.web.cmu.edu/rfc/rfc2070.html#sec-5. Use this code:

    String result = req.getParameter(parm);
    if (result == null)
        return null;
    try {
        return new String(result.getBytes("ISO-8859-1"), "UTF-8");
    } catch (UnsupportedEncodingException e) {
        return result;
    }

Dave Oxley
[EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: CoyoteRequest assumption that breaks UTF-8 support
This is not a bug. Read http://asg.web.cmu.edu/rfc/rfc2070.html#sec-5. Use this code:

    // The container decoded the parameter as ISO-8859-1;
    // recover the raw bytes and decode them as UTF-8 instead.
    String result = req.getParameter(parm);
    if (result == null)
        return null;
    try {
        return new String(result.getBytes("ISO-8859-1"), "UTF-8");
    } catch (UnsupportedEncodingException e) {
        return result;
    }

Dave Oxley
[EMAIL PROTECTED]

> -----Original Message-----
> From: Jeff Tulley [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 25, 2003 6:40 PM
> To: [EMAIL PROTECTED]
> Subject: CoyoteRequest assumption that breaks UTF-8 support
>
> I am convinced that Tomcat 4.x has a very real bug when dealing with
> character encodings on login form submissions (and maybe elsewhere as
> well).
>
> To see what I am observing, watch the flow of code when doing a login
> from Tomcat's admin application. It sets the charset to UTF-8 in the
> tag
>     <%@ page language="java" contentType="text/html;charset=UTF-8" import="java.util.*" %>
> at the top of the file. This correctly gets the browser into Unicode
> mode, and with all of the browsers I have tested (IE and NS 7 on
> Windows, and Konqueror on SUSE Linux), they correctly encode your
> username and password in UTF-8.
>
> But in CoyoteRequest.parseRequestParameters(), Tomcat tries to
> determine and set the encoding to be used and ends up setting it to
> null. Why? It looks like Request.getCharacterEncoding and its
> subordinate methods check the "content-type" header value. On all of
> the browsers that I have checked, if this value is set at all, it is
> simply set to "application/x-www-form-urlencoded", without mention of
> UTF-8 or any other charset. The browser assumes that since you
> requested a certain charset in the first place, you know how to deal
> with a parameter that is sent back encoded in that charset.
>
> Tomcat sees no mention of a charset, so it simply defaults to
> "ISO-8859-1" (a hardcoded constant).
>
> Is there a better way to tie the default value to whatever the JSP
> login form originally requested? Or, failing that, can we look up the
> system property "file.encoding" and require it to be passed in on
> startup? This is less than ideal, though. Preferably it would be
> something settable context-wide, not Tomcat-wide.
>
> Where does this leave applications that wish to support extended
> characters in passwords and/or usernames, characters that may not be in
> the ISO-8859-1 range?
>
> Or am I missing the correct place to set this?
>
> Jeff Tulley ([EMAIL PROTECTED])
> (801)861-5322
> Novell, Inc., The Leading Provider of Net Business Solutions
> http://www.novell.com
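
For readers following along, here is a minimal, self-contained sketch of the re-decoding trick suggested in this message. The class name ParamRecoder and the method latin1ToUtf8 are hypothetical names chosen for illustration; they are not part of Tomcat or the servlet API. The sketch assumes the container decoded UTF-8 bytes as ISO-8859-1, and it simulates that step in main() rather than calling a real request object.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Tomcat or the servlet API.
final class ParamRecoder {
    private ParamRecoder() {}

    /**
     * Undoes a wrong ISO-8859-1 decode: recovers the original bytes,
     * then decodes them as UTF-8. Returns null for a null input.
     */
    static String latin1ToUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        // ISO-8859-1 maps every byte to exactly one char, so this is lossless.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate the container: UTF-8 bytes from the browser decoded as ISO-8859-1.
        byte[] sent = "J\u00fcrgen".getBytes(StandardCharsets.UTF_8);
        String asSeenByServlet = new String(sent, StandardCharsets.ISO_8859_1);

        String repaired = latin1ToUtf8(asSeenByServlet);
        System.out.println(repaired.equals("J\u00fcrgen"));
    }
}
```

Note that this only helps where application code gets to touch the parameter. It does nothing for values that the container consumes itself, which is the j_security_check case raised elsewhere in this thread.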
CoyoteRequest assumption that breaks UTF-8 support
I am convinced that Tomcat 4.x has a very real bug when dealing with character encodings on login form submissions (and maybe elsewhere as well).

To see what I am observing, watch the flow of code when doing a login from Tomcat's admin application. It sets the charset to UTF-8 in the tag

    <%@ page language="java" contentType="text/html;charset=UTF-8" import="java.util.*" %>

at the top of the file. This correctly gets the browser into Unicode mode, and with all of the browsers I have tested (IE and NS 7 on Windows, and Konqueror on SUSE Linux), they correctly encode your username and password in UTF-8.

But in CoyoteRequest.parseRequestParameters(), Tomcat tries to determine and set the encoding to be used and ends up setting it to null. Why? It looks like Request.getCharacterEncoding and its subordinate methods check the "content-type" header value. On all of the browsers that I have checked, if this value is set at all, it is simply set to "application/x-www-form-urlencoded", without mention of UTF-8 or any other charset. The browser assumes that since you requested a certain charset in the first place, you know how to deal with a parameter that is sent back encoded in that charset.

Tomcat sees no mention of a charset, so it simply defaults to "ISO-8859-1" (a hardcoded constant).

Is there a better way to tie the default value to whatever the JSP login form originally requested? Or, failing that, can we look up the system property "file.encoding" and require it to be passed in on startup? This is less than ideal, though. Preferably it would be something settable context-wide, not Tomcat-wide.

Where does this leave applications that wish to support extended characters in passwords and/or usernames, characters that may not be in the ISO-8859-1 range?

Or am I missing the correct place to set this?

Jeff Tulley ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com
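
The behavior described in this message can be reproduced outside Tomcat with a small sketch. Everything here is an assumption made for illustration: FormCharsetDemo, charsetOf, and decodeParam are hypothetical names, not Tomcat's code, and the logic only mimics the described fallback (no charset in the Content-Type header means ISO-8859-1 is used to decode the form value).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Locale;

// Hypothetical illustration of the fallback described above; not Tomcat's code.
final class FormCharsetDemo {
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    private FormCharsetDemo() {}

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** URL-decodes a form value using the header's charset, else the default. */
    static String decodeParam(String rawValue, String contentType) {
        String enc = charsetOf(contentType);
        try {
            return URLDecoder.decode(rawValue, enc != null ? enc : DEFAULT_ENCODING);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + enc, e);
        }
    }

    public static void main(String[] args) {
        String raw = "J%C3%BCrgen"; // "Jürgen" as a browser sends it from a UTF-8 page

        // No charset in the header: decoded as ISO-8859-1, so the name is garbled.
        System.out.println(decodeParam(raw, "application/x-www-form-urlencoded"));

        // Charset present: decoded correctly.
        System.out.println(decodeParam(raw, "application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

The first call shows why a Realm would be handed a mangled username: the two UTF-8 bytes of the umlaut are turned into two separate ISO-8859-1 characters before any authentication code runs.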