The connector URIEncoding defaults to ISO-8859-1, which
has its roots in the Multinational Character Set produced for Digital
Equipment Corporation in '85.
The original architecture didn't handle the 'other character sets' contingency,
mainly as DBCS didn't figure in until the Shift-JIS, GB2312 and Big5 code pages
became more heavily used.
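
To make the ISO-8859-1 fallback described in the quoted messages below concrete, here is a minimal sketch in plain Java. It is not Tomcat's actual code: the class name FormCharsetDemo and the helpers charsetOf and decode are hypothetical, and the only assumption taken from the thread is that a request with no charset in its Content-Type header is decoded as ISO-8859-1. It decodes the same %D1%84 value from the POST body both ways.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Locale;

// Hypothetical demo class; it only illustrates the fallback, it is not container code.
final class FormCharsetDemo {

    // Assumed fallback when the request names no charset, as described in the thread.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    /** Returns the charset parameter of a Content-Type value, or the default if absent. */
    public static String charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    return p.substring("charset=".length()).replace("\"", "").trim();
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    /** Percent-decodes one form value, interpreting the bytes in the given charset. */
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%D1%84"; // UTF-8 bytes of the Cyrillic letter U+0444, from the POST body
        String header = "application/x-www-form-urlencoded"; // as sent by the browser, no charset

        String fallback = decode(raw, charsetOf(header)); // decoded as ISO-8859-1
        String forced = decode(raw, "UTF-8");             // decoded as UTF-8

        System.out.println(charsetOf(header));                         // ISO-8859-1
        System.out.println(fallback.length());                         // 2 (two garbled chars)
        System.out.println(Integer.toHexString(forced.codePointAt(0))); // 444
    }
}
```

In a servlet, the "forced" branch corresponds to calling request.setCharacterEncoding("UTF-8") before the first getParameter call, as suggested further down the thread.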

Interesting analysis..
M-
----- Original Message -----
From: "david delbecq" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <users@tomcat.apache.org>
Sent: Thursday, January 10, 2008 4:59 PM
Subject: Re: Tomcat 6 UTF-8 GET and POST - please Help..


> Aleksandar Matijaca wrote:
> > I will try the filter, and let you know -- I don't think that Struts is
> > too worried about the encoding; while debugging, I saw that things go
> > wrong much earlier, "above" Struts, in the regular servlet layer on which
> > the Struts Action sits...
> >
> Well, you probably saw it using "getParameter", which triggers the
> default ISO-8859-1 behaviour if no character encoding is provided. Now, I
> know Struts has similar code somewhere that defaults to ISO-8859-1 when
> nothing else is specified.
> > Thanks for the suggestion..  You know, it is really disappointing that
> > one has to do "custom programming" (filter) depending on what kind of a
> > container engine is used - in the ideal world, this should just work for
> > all different application servers..
> >
> In an ideal world, the servlet spec would have defined how to set the
> default charset for POST decoding :) In an ideal world, browsers would
> state in their request headers what encoding the request was made in :)
> In an ideal world, there wouldn't exist more than one, universal, charset.
> In an ideal world, my pay would be bigger ^^
>
> > God only knows what, if anything, I would have to do for Websphere,
> > Weblogic, Oracle server etc...
> >
> Though the top managers of those companies may come close to that position
> at some point, I don't think God has anything to do with it ;)
> > Thanks for your help.
> >
> > Alex.
> >
> >
> > On Jan 10, 2008 2:12 PM, David Delbecq <[EMAIL PROTECTED]> wrote:
> >
> >
> >> As you might infer, URIEncoding="UTF-8" only works on the URI part, and
> >> thus affects only GET. The POST body is decoded using the charset the
> >> browser declares in the "Content-Type" header. Most browsers don't bother
> >> to set it. Tomcat's behaviour is then to use a default (ISO-8859-1). The
> >> best way to ensure Tomcat will use UTF-8 for POST decoding is to force it
> >> using request.setCharacterEncoding("UTF-8") before any call to
> >> getParameter(s). A filter or a valve might be a good place :)
> >>
> >> BTW, be careful: Struts (I see you use Struts) also tends to have its
> >> own default behaviour when decoding POST :)
> >>
> >>
> >>
> >> Aleksandar Matijaca wrote:
> >>
> >>> Hi there,
> >>>
> >>> I am having an issue submitting foreign-language text in an HTML form
> >>> to Tomcat.
> >>> If I set up the FORM with method="GET", everything works just fine.
> >>> However, when I use POST, things don't work - I get garbled characters
> >>> in my servlet from the FORM.
> >>> In the example below, I am showing the headers for both
> >>> GET and POST - the character being sent is a single lower case letter
> >>> "f" in Cyrillic.
> >>>
> >>>
> >>> ************************************************
> >>>
> >>> The POST method does not work - gives garbled characters - please note
> >>> that the mytext is PROPERLY ENCODED IN BOTH methods - GET and POST,
> >>> so I don't think that it is a browser issue (I think that my HTML is OK
> >>> too).
> >>>
> >>>
> >>> http://localhost:8080/Inter/takeText.do
> >>>
> >>> POST /Inter/takeText.do HTTP/1.1
> >>> Host: localhost:8080
> >>> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
> >>> Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
> >>> Accept-Language: en-us,en;q=0.8,sr;q=0.5,hi;q=0.3
> >>> Accept-Encoding: gzip,deflate
> >>> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
> >>> Keep-Alive: 300
> >>> Connection: keep-alive
> >>> Referer: http://localhost:8080/Inter/takeText.do
> >>> Cookie: JSESSIONID=yhhvgybm2jfa; sensitivity=1000.0; JSESSIONID=AF8D24C63CE9C16D5DF78E5CDDE26146
> >>> Content-Type: application/x-www-form-urlencoded
> >>> Content-Length: 13
> >>> mytext=%D1%84
> >>>
> >>>
> >>> ********************************************************************
> >>>
> >>> This is the request using GET - this works just fine - the servlet
> >>> gets mytext nicely decoded...
> >>>
> >>>
> >>> http://localhost:8080/Inter/takeText.do?mytext=%D1%84
> >>>
> >>> GET /Inter/takeText.do?mytext=%D1%84 HTTP/1.1
> >>> Host: localhost:8080
> >>> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
> >>> Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
> >>> Accept-Language: en-us,en;q=0.8,sr;q=0.5,hi;q=0.3
> >>> Accept-Encoding: gzip,deflate
> >>> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
> >>> Keep-Alive: 300
> >>> Connection: keep-alive
> >>> Referer: http://localhost:8080/Inter/takeText.do?mytext=%D1%84
> >>> Cookie: JSESSIONID=yhhvgybm2jfa; sensitivity=1000.0; JSESSIONID=AF8D24C63CE9C16D5DF78E5CDDE26146
> >>>
> >>>
> >>>
> >>> ***********************************************
> >>> My HTML (rendered from the JSP)
> >>>
> >>>
> >>> <html>
> >>> <head>
> >>>     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
> >>> </head>
> >>> <body>
> >>>             <form name="InputLanguageForm" method="POST"
> >>> action="/Inter/takeText.do" accept-charset="UTF-8" >
> >>>                 <input type="text" name="mytext" value="">
> >>>                 <input type="submit" value="Do it">
> >>>             </form>
> >>> </body>
> >>> </html>
> >>>
> >>> ************************************************
> >>>
> >>> This is the configuration of the Tomcat Connector - if I REMOVE
> >>> URIEncoding, both GET and POST don't work any more.
> >>>
> >>> <Connector port="8080" protocol="HTTP/1.1"
> >>>                connectionTimeout="20000" URIEncoding="UTF-8"
> >>>                redirectPort="8443" />
> >>>
> >>> ************************************************
> >>>
> >>>
> >>> Thanks for any insight you can give me here...  I really don't want to
> >>> change ALL my forms to GET - there should be
> >>> an easier (and more natural) fix for this.
> >>>
> >>> Thanks, Alex.
> >>>
> >>>
> >>>
> >> ---------------------------------------------------------------------
> >> To start a new topic, e-mail: users@tomcat.apache.org
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >>
> >>
> >>
> >
> >
>
>

