Paul Barry wrote:

By using a <meta> element, do you mean this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

That doesn't seem to work when the form is multipart/form-data, because the Content-Type header still just has multipart/form-data. The problem seems to be that when I do a request.getCharacterEncoding(), I get null. Is that normal? I would think I should at least get the default character encoding for the webapp. I am using Resin 2.1.10. This might be an issue for me to report to them.

This definitely is what is causing my problem, because if I look at the code in org.apache.struts.upload.CommonsMultipartRequestHandler.addTextParameter(), this is the first thing it does:

        try {
            value = item.getString(request.getCharacterEncoding());
        } catch (Exception e) {
            value = item.getString();
        }

Since request.getCharacterEncoding() is null, I assume an Exception is being thrown and caught (a log.warn() might be nice there) and then I am getting the string without decoding it from UTF-8.
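
The fallback path described above can be reproduced with plain JDK calls. This is only a minimal sketch, assuming that item.getString(enc) ultimately behaves like new String(bytes, enc); the class and method names are hypothetical, and ISO-8859-1 stands in for whatever default the no-argument getString() really uses.

```java
import java.nio.charset.StandardCharsets;

public class FallbackDecodeSketch {

    // Mirrors the try/catch quoted above: try the request's encoding
    // first, and fall back to a default when that attempt throws.
    public static String decode(byte[] raw, String requestEncoding) {
        try {
            // A null charset name makes this constructor throw
            // NullPointerException, which lands in the catch block.
            return new String(raw, requestEncoding);
        } catch (Exception e) {
            // Assumption: the fallback decodes the bytes as ISO-8859-1.
            return new String(raw, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        // U+00E4 (a with diaeresis) is the two bytes C3 A4 in UTF-8.
        byte[] raw = "\u00e4".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw, "UTF-8").length()); // prints 1
        System.out.println(decode(raw, null).length());    // prints 2
    }
}
```

With a null encoding, the two UTF-8 bytes come back as two separate characters, which matches the garbled value in the log.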

If I manually set the characterEncoding to UTF-8 before this code executes (in processMultipart() in the requestProcessor for example), then everything works fine.
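
A rough sketch of that workaround, written without the servlet or Struts jars so that it runs on its own. The EncodingAware interface and SimpleRequest class are hypothetical stand-ins for the two ServletRequest methods under discussion, not the real javax.servlet types (the real setCharacterEncoding also declares UnsupportedEncodingException).

```java
public class DefaultEncodingSketch {

    // Hypothetical stand-in for the two ServletRequest methods discussed
    // here; it is not the real javax.servlet interface.
    public interface EncodingAware {
        String getCharacterEncoding();
        void setCharacterEncoding(String enc);
    }

    // Applies a default only when the client declared no encoding, so an
    // explicit charset sent by the browser is never overridden.
    public static void applyDefault(EncodingAware request, String fallback) {
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding(fallback);
        }
    }

    // Trivial in-memory implementation so the sketch can run by itself.
    public static class SimpleRequest implements EncodingAware {
        private String encoding;
        public SimpleRequest(String encoding) { this.encoding = encoding; }
        public String getCharacterEncoding() { return encoding; }
        public void setCharacterEncoding(String enc) { this.encoding = enc; }
    }

    public static void main(String[] args) {
        SimpleRequest undeclared = new SimpleRequest(null);
        applyDefault(undeclared, "UTF-8");
        System.out.println(undeclared.getCharacterEncoding()); // prints UTF-8
    }
}
```

In a real application the same null check would have to run before anything reads parameters or the request body, for example at the start of processMultipart().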

So I guess my question is should I be expecting request.getCharacterEncoding() to return null or is there a bug in my app server?


Nope, it is not a bug. The browser hasn't set the encoding, so the method returns null to indicate this.

Here is the relevant section from Java™ Servlet Specification Version 2.3 (servlet-2_3-fcs-spec.pdf)

"SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the Content-Type header, leaving open the determination of the character encoding for reading HTTP requests. The default encoding of a request the container uses to create the request reader and parse POST data must be “ISO-8859-1”, if none has been specified by the client request. However, in order to indicate to the developer in this case the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method.


If the client hasn’t set character encoding and the request data is encoded with a different encoding than the default as described above, breakage can occur. To remedy this situation, a new method setCharacterEncoding(String enc) has been added to the ServletRequest interface. Developers can override the character encoding supplied by the container by calling this method. It must be called prior to parsing any post data or reading any input from the request. Calling this method once data has been read will not affect the encoding."






Martin Cooper wrote:



In Struts 1.1, the default file upload mechanism *is* Commons FileUpload.
;-)

It seems that you may have omitted to tell the browser explicitly that your
pages are in UTF-8. For some reason that I've never fully understood, that
causes the browser to use UTF-8 when it submits subsequent requests from
that page. Make sure that you use a <meta> element in your <head> to specify
UTF-8.

--
Martin Cooper


"Paul Barry" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]



I think you are correct. When I was looking at the packets and seeing two characters, it is actually the characters that are equal to the 2 bytes that make up the single UTF-8 character. I thought the browser was somehow not correctly encoding my data, because it was turning 1 character into 2 characters, but actually it is UTF-8 encoding my character correctly. So I think if I use something to read the data and convert it from UTF-8 to Unicode, I will get the correct data on the server.
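
To see the two bytes behind a single character, a small hex dump helps. This is an illustrative sketch only; the class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesSketch {

    // Returns the UTF-8 bytes of the text as space-separated upper-case hex.
    public static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask to 0..255 so negative byte values format as two digits.
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // U+00E4 is one character but two bytes on the wire.
        System.out.println(utf8Hex("\u00e4")); // prints C3 A4
    }
}
```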

So from reading the documentation about FileUpload, that seems to be the way to go, but now my question is how to integrate FileUpload with struts? My thought would be to call a method to populate an ActionForm in the beginning of my action, and then use that ActionForm instead of the one I get from the requestProcessor. So like this:



    public ActionForward execute(
            ActionMapping mapping,
            ActionForm pform,
            HttpServletRequest request,
            HttpServletResponse response)
            throws Exception {
        TestActionForm form = getFormUsingFileUpload(request);
        log.info("The value is: "+form.getTest());
        return null;
    }

Is this how others have used Jakarta Commons FileUpload with Struts, or is there a better way?



Jason Lea wrote:




From what I can see there, Resin is expecting UTF-8 for any parameters
passed to it, and decoding it correctly.  However multipart/form-data is
treated differently as the data is not passed as normal parameters so
the request.getParameter() cannot be used here (and servlet filters that
set the request encoding won't help either).

You normally have to use something like the FileUpload component to
extract form fields and files from the request.  This component is not
going to know about the character encoding you have given to resin, so
it will use the default, which is probably US-ASCII.  With UTF-8 a single
character can be encoded as 1 to 4 bytes.  When decoding a UTF-8
string, the decoder will combine each 1- to 4-byte sequence into 1
Unicode character.  When UTF-8 is not used to decode the string, you will
see the individual bytes.
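
The variable byte length is easy to check with the JDK alone. A minimal sketch with hypothetical names; the sample characters are chosen for illustration and do not come from the thread. Code points outside the Basic Multilingual Plane take four bytes.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthSketch {

    // Number of bytes the text occupies once encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // prints 1 (ASCII)
        System.out.println(utf8Length("\u00e4"));       // prints 2 (U+00E4)
        System.out.println(utf8Length("\u20ac"));       // prints 3 (U+20AC, euro sign)
        System.out.println(utf8Length("\uD83D\uDE00")); // prints 4 (U+1F600)
    }
}
```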

Looking here (the jakarta apache FileUpload component):



http://jakarta.apache.org/commons/fileupload/apidocs/org/apache/commons/fileupload/FileUploadBase.html



They have a setHeaderEncoding() method which I assume will deal with
this problem (I haven't tested this so I don't know).  Are you using a
file upload component?


Paul Barry wrote:





I am using Struts 1.1 in an application that needs to support the
UTF-8 character set.  I am using Resin 2.1.10 with
character-encoding="UTF-8", and on most of my forms this seems to work
just fine.  I am having problems with forms that have to use the
multipart/form-data enctype for handling uploading files.  If I print
out the value of a text element in an html:form where the enctype is
not set at all (which ends up using
application/x-www-form-urlencoded), using UTF-8 characters works
fine.  This is what I get:

INFO - test.TestAction - The value is: ä

Here is what the actual HTTP request that gets sent to the server
looks like:

--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 11
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

test=%C3%AD
--- End HTTP Request ------------------------------------------------------
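
The urlencoded body in this request carries the character as percent-escaped UTF-8 bytes, so the result depends entirely on which charset the server uses to decode it. A small sketch using java.net.URLDecoder illustrates this; the class and method names are hypothetical, and this is not a claim about how Resin decodes parameters internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodeSketch {

    // Decodes one urlencoded value with the given charset name.
    public static String decodeValue(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Rethrown unchecked to keep the sketch's call sites short.
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Decoded as UTF-8, the two escaped bytes form one character.
        System.out.println(decodeValue("%C3%AD", "UTF-8").length());      // prints 1
        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(decodeValue("%C3%AD", "ISO-8859-1").length()); // prints 2
    }
}
```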

But if I modify my html:form to use enctype="multipart/form-data", I
get this:

INFO - test.TestAction - The value is: A¤

And the HTTP request looks like this:

--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: multipart/form-data; boundary=---------------------------7d319628600e4
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 141
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

-----------------------------7d319628600e4
Content-Disposition: form-data; name="test"

í
-----------------------------7d319628600e4-
--- End HTTP Request ------------------------------------------------------

It looks as if the character is already messed up before it even gets
to the servlet container.  There are messages in the mailing list
archive that discuss this problem, but I didn't see a solution.  What
is the best way to handle UTF-8 characters in a multipart/form-data
encoded form?

Here is the code that I am testing with:

/test/test.jsp:
<%@ taglib uri="WEB-INF/taglib/struts-html.tld" prefix="html" %>
<%@ taglib uri="WEB-INF/taglib/struts-bean.tld" prefix="bean" %>

<html:html>
<body>
  <html:form action="testForm.do" enctype="multipart/form-data">
    <html:text property="test" />
    <html:submit />
  </html:form>
</body>
</html:html>

Relevant parts of struts-config.xml:
<struts-config>

<form-beans>
  <form-bean name="testForm" type="test.TestActionForm" />
</form-beans>

<action-mappings>
  <action path="/test"
type="org.apache.struts.actions.ForwardAction"
parameter="/test/test.jsp" />
  <action path="/testForm" type="test.TestAction" name="testForm"
input="/test.do" scope="request" />
</action-mappings>

<controller contentType="text/html;charset=UTF-8" />

</struts-config>

test.TestAction:
package test;

import javax.servlet.http.*;
import org.apache.commons.logging.*;
import org.apache.struts.action.*;

public class TestAction extends Action {
  private static final Log log = LogFactory.getLog(TestAction.class);

  public ActionForward execute(
          ActionMapping mapping,
          ActionForm pform,
          HttpServletRequest request,
          HttpServletResponse response)
          throws Exception {
      TestActionForm form = (TestActionForm)pform;
      log.info("The value is: "+form.getTest());
      return null;
  }
}

test.TestActionForm:
package test;

import org.apache.struts.action.ActionForm;

public class TestActionForm extends ActionForm {
  private String test;
  public String getTest() { return test;     }
  public void setTest(String string) { test = string; }
}


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
















--
Jason Lea



