Nope, it is not a bug. The browser hasn't set the encoding, so the method returns null to indicate this.
By using a <meta> element, do you mean this:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
That doesn't seem to work when the form is multipart/form-data, because the Content-Type header still just has multipart/form-data. The problem seems to be that when I do a request.getCharacterEncoding(), I get null. Is that normal? I would think I should at least get the default character encoding for the webapp. I am using Resin 2.1.10. This might be an issue for me to report to them.
This definitely is what is causing my problem, because if I look at the code in org.apache.struts.upload.CommonsMultipartRequestHandler.addTextParameter(), this is the first thing it does:
    try {
        value = item.getString(request.getCharacterEncoding());
    } catch (Exception e) {
        value = item.getString();
    }
Since request.getCharacterEncoding() is null, I assume an Exception is being thrown and caught (a log.warn() might be nice there), and then I am getting the string without decoding it from UTF-8.
If I manually set the characterEncoding to UTF-8 before this code executes (in processMultipart() in the requestProcessor for example), then everything works fine.
So I guess my question is: should I be expecting request.getCharacterEncoding() to return null, or is there a bug in my app server?
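The fallback described above can be reproduced with the JDK alone. This is only a sketch: the two getString() helpers are hypothetical stand-ins for the FileItem methods, assuming the one-argument form ends up in new String(bytes, encoding) and that the no-argument form decodes as ISO-8859-1. Neither assumption is checked against the FileUpload source here.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class NullEncodingFallback {

    // Hypothetical stand-in for FileItem.getString(String encoding).
    // new String(byte[], String) throws NullPointerException for a null charset name.
    static String getString(byte[] raw, String encoding) throws UnsupportedEncodingException {
        return new String(raw, encoding);
    }

    // Hypothetical stand-in for FileItem.getString(); ISO-8859-1 is an assumption.
    static String getString(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Same shape as the try/catch quoted above: a null encoding is silently swallowed.
    static String decodeLikeQuotedCode(byte[] raw, String requestEncoding) {
        try {
            return getString(raw, requestEncoding);
        } catch (Exception e) {
            return getString(raw);
        }
    }

    // Null-safe variant: choose an explicit default instead of relying on the catch block.
    static String decodeWithDefault(byte[] raw, String requestEncoding, String defaultEncoding)
            throws UnsupportedEncodingException {
        String encoding = (requestEncoding != null) ? requestEncoding : defaultEncoding;
        return getString(raw, encoding);
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = "\u00e4".getBytes(StandardCharsets.UTF_8); // the two UTF-8 bytes of "ä"
        System.out.println(decodeLikeQuotedCode(raw, null).length());               // 2
        System.out.println(decodeWithDefault(raw, null, "UTF-8").equals("\u00e4")); // true
    }
}
```

With a null request encoding the quoted pattern returns two characters for one, which matches the symptom in this thread. Supplying a default up front gives the same result as calling request.setCharacterEncoding("UTF-8") before the multipart handler runs.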
Here is the relevant section from Java™ Servlet Specification Version 2.3 (servlet-2_3-fcs-spec.pdf)
"SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the Content-Type header, leaving open the determination of the character encoding for reading HTTP requests. The default encoding of a request the container uses to create the request reader and parse POST data must be “ISO-8859-1”, if none has been specified by the client request. However, in order to indicate to the developer in this case the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method.
If the client hasn’t set character encoding and the request data is encoded with a different encoding than the default as described above, breakage can occur. To remedy this situation, a new method setCharacterEncoding(String enc) has been added to the ServletRequest interface. Developers can override the character encoding supplied by the container by calling this method. It must be called prior to parsing any post data or reading any input from the request. Calling this method once data has been read will not affect the encoding."
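To make the SRV.4.9 behaviour concrete, here is a rough sketch of the rule as I read it. It is not how Resin or any other container implements it; the class and method names are made up for illustration, and splitting on ";" is a simplification that ignores quoted parameter values containing semicolons.

```java
import java.util.Locale;

class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type value, or null when the
    // client did not send one (the case getCharacterEncoding() reports as null).
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the media type itself
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip optional quotes
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Encoding the container would use to read the body when nothing was declared.
    static String readerEncoding(String contentType) {
        String declared = charsetOf(contentType);
        return (declared != null) ? declared : "ISO-8859-1";
    }

    public static void main(String[] args) {
        String multipart = "multipart/form-data; boundary=---------------------------7d319628600e4";
        System.out.println(charsetOf(multipart));                  // null
        System.out.println(readerEncoding(multipart));             // ISO-8859-1
        System.out.println(charsetOf("text/html; charset=utf-8")); // utf-8
    }
}
```

The multipart header from the request in this thread carries only a boundary parameter, so under this reading null is the expected answer and ISO-8859-1 is what the data gets read with unless setCharacterEncoding() is called first.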
Martin Cooper wrote:
In Struts 1.1, the default file upload mechanism *is* Commons FileUpload. ;-)
It seems that you may have omitted to tell the browser explicitly that your pages are in UTF-8. For some reason that I've never fully understood, that causes the browser to use UTF-8 when it submits subsequent requests from that page. Make sure that you use a <meta> element in your <head> to specify UTF-8.
-- Martin Cooper
"Paul Barry" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
I think you are correct. When I was looking at the packets and seeing two characters, it is actually the characters that are equal to the 2 bytes that make up the single UTF-8 character. I thought the browser was somehow not correctly encoding my data, because it was turning 1 character into 2 characters, but actually it is UTF-8 encoding my character correctly. So I think if I use something to read the data and convert it from UTF-8 to Unicode, I will get the correct data on the server.
So from reading the documentation about FileUpload, that seems to be the way to go, but now my question is how to integrate FileUpload with Struts? My thought would be to call a method to populate an ActionForm in the beginning of my action, and then use that ActionForm instead of the one I get from the requestProcessor. So like this:
    public ActionForward execute(
            ActionMapping mapping,
            ActionForm pform,
            HttpServletRequest request,
            HttpServletResponse response) throws Exception {
        TestActionForm form = getFormUsingFileUpload(request);
        log.info("The value is: " + form.getTest());
        return null;
    }
Is this how others have used Jakarta Commons FileUpload with Struts, or is there a better way?
Jason Lea wrote:
From what I can see there, Resin is expecting UTF-8 for any parameters passed to it, and decoding it correctly. However, multipart/form-data is treated differently, as the data is not passed as normal parameters, so request.getParameter() cannot be used here (and servlet filters that set the request encoding won't help either).
You normally have to use something like the FileUpload component to extract form fields and files from the request. This component is not going to know about the character encoding you have given to Resin, so it will use the default, which is probably US-ASCII. With UTF-8 a single character can be rendered as 1, 2 or 3 bytes. When decoding a UTF-8 string the decoder will combine the 1, 2 or 3 byte combinations into 1 Unicode character. When UTF-8 is not used to decode the string you will see the individual bytes.
Looking here (the jakarta apache FileUpload component):
http://jakarta.apache.org/commons/fileupload/apidocs/org/apache/commons/fileupload/FileUploadBase.html
They have a setHeaderEncoding() method which I assume will deal with this problem (I haven't tested this so I don't know). Are you using a file upload component?
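The "individual bytes" effect described above is easy to see with plain JDK calls. A small sketch, assuming the wrong decoder is ISO-8859-1 (the thread only guesses at which default is actually in play); the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

class Utf8Mojibake {

    // Encode as UTF-8 (what the browser sends), then decode with the wrong charset.
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Undo the damage: recover the original bytes, then decode them as UTF-8.
    // This only works because ISO-8859-1 maps every byte value to exactly one char.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e4"; // "ä"
        String garbled = misdecode(original);
        System.out.println(original.getBytes(StandardCharsets.UTF_8).length); // 2
        System.out.println(garbled.equals("\u00c3\u00a4"));                   // true ("Ã¤")
        System.out.println(repair(garbled).equals(original));                 // true
    }
}
```

Re-encoding after the fact is a workaround at best. Decoding with UTF-8 in the first place is the cleaner fix, since a lossy default such as US-ASCII would not allow the bytes to be recovered.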
Paul Barry wrote:
I am using Struts 1.1 in an application that needs to support the UTF-8 character set. I am using Resin 2.1.10 with character-encoding="UTF-8", and on most of my forms this seems to work just fine. I am having problems with forms that have to use the multipart/form-data enctype for handling uploading files. If I print out the value of a text element in an html:form where the enctype is not set at all (which ends up using application/x-www-form-urlencoded), using UTF-8 characters works fine. This is what I get:
INFO - test.TestAction - The value is: ä
Here is what the actual HTTP request that gets sent to the server looks like:
--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 11
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

test=%C3%AD
--- End HTTP Request ------------------------------------------------------
But if I modify my html:form to use enctype="multipart/form-data", I get this:
INFO - test.TestAction - The value is: Ã¤
And the HTTP request looks like this:
--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: multipart/form-data; boundary=---------------------------7d319628600e4
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 141
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

-----------------------------7d319628600e4
Content-Disposition: form-data; name="test"

à
-----------------------------7d319628600e4-
--- End HTTP Request ------------------------------------------------------
It looks as if the character is already messed up before it even gets to the servlet container. There are messages in the mailing list archive that discuss this problem, but I didn't see a solution. What is the best way to handle UTF-8 characters in a multipart/form-data encoded form?
Here is the code that I am testing with:
/test/test.jsp:
<%@ taglib uri="WEB-INF/taglib/struts-html.tld" prefix="html" %>
<%@ taglib uri="WEB-INF/taglib/struts-bean.tld" prefix="bean" %>

<html:html>
<body>
  <html:form action="testForm.do" enctype="multipart/form-data">
    <html:text property="test" />
    <html:submit />
  </html:form>
</body>
</html:html>
Relevant parts of struts-config.xml:
<struts-config>

  <form-beans>
    <form-bean name="testForm" type="test.TestActionForm" />
  </form-beans>

  <action-mappings>
    <action path="/test" type="org.apache.struts.actions.ForwardAction" parameter="/test/test.jsp" />
    <action path="/testForm" type="test.TestAction" name="testForm" input="/test.do" scope="request" />
  </action-mappings>

  <controller contentType="text/html;charset=UTF-8" />

</struts-config>
test.TestAction:
package test;

import javax.servlet.http.*;
import org.apache.commons.logging.*;
import org.apache.struts.action.*;

public class TestAction extends Action {

    private static final Log log = LogFactory.getLog(TestAction.class);

    public ActionForward execute(
            ActionMapping mapping,
            ActionForm pform,
            HttpServletRequest request,
            HttpServletResponse response) throws Exception {
        TestActionForm form = (TestActionForm) pform;
        log.info("The value is: " + form.getTest());
        return null;
    }
}
test.TestActionForm:
package test;

import org.apache.struts.action.ActionForm;

public class TestActionForm extends ActionForm {

    private String test;

    public String getTest() {
        return test;
    }

    public void setTest(String string) {
        test = string;
    }
}
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
-- Jason Lea