Re: URIEncoding
Thanks a lot for your help. Yesterday I mentioned that the problem was solved, but strangely my message does not seem to have reached the forum.

The problem was that I was using

<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>

to change the encoding of a JSP page, which didn't work. I simply forgot that I was changing the encoding of a JSP page and not of an HTML page. I then used the right directive for a JSP page, which is:

<%@ page contentType="text/html;charset=UTF-8" %>

It works fine and forces the page to use UTF-8 encoding. Thanks once again for your kind response.

-- View this message in context: http://old.nabble.com/URIEncoding-tp32989250p33005252.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
Re: URIEncoding
André,

On 12/17/11 9:37 AM, André Warnier wrote:
> I do not see anything in the above that submits anything with an umlaut.

The form could have additional inputs that were not included in the OP.

> This is a GET request, so anything submitted would have to be in the URL, as a query-string.

Most web browsers are smart enough to merge the query string from the form action with the form inputs, though I dealt with an issue recently where that didn't work properly. I can't remember what it was... probably Safari.

> I only see name here. The quotes appear wrong too.

+1

> There is also a double // after 8080, where it should not be.

It shouldn't be there, but it also shouldn't be a problem. It's just sloppy.

> Are you sure it is not simply the action of your form which is wrong?

If the above bug (failure to merge query-string and form input parameters) is the problem, then the OP might end up with 'null' being read from the request -- IIRC that was my observation. The solution IMO is to always use POST whenever you are expecting non-US-ASCII in your form inputs.

-chris
Re: URIEncoding
German Starz,

On 12/18/11 7:26 AM, starz10de wrote:
> I have a login page (it has UTF-8); I checked this manually as well. After a successful login, a JSP page is called to enter the query. This JSP page uses ISO encoding although UTF-8 is defined inside it.

Remember that the encoding of the response and the encoding of the page are separate issues. If the page has been saved to the disk as ISO-8859-1, the response can still be sent using UTF-8.

> I can't understand why the browser automatically uses ISO encoding although I force it to use UTF-8. Here is how I do it:
> <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>

You should also be using:

<%@page pageEncoding="UTF-8" ... %>

or

<%@page contentType="text/html; charset=UTF-8" ... %>

I usually rig the JSP to grab the content type and charset directly from the response, instead of hard-coding them. That way, they will always be in sync:

<meta http-equiv="Content-Type" content="$response.contentType" />

(You /are/ using XHTML, right?)

Note that your response must have a definite character encoding set; otherwise the default will be ISO-8859-1, and the charset might not be displayed in your Content-Type META tag.

-chris
Re: URIEncoding
One of my attempts to solve the problem was to use UTF-8 in my HTML page as well as in my backend. It didn't work, because the browser automatically changes to ISO encoding. Today I checked the browser encoding before submitting the query and saw that it uses ISO, although my HTML page declares UTF-8. I changed it to UTF-8 manually, submitted the query, and it works fine.

I have a login page (it has UTF-8); I checked this manually as well. After a successful login, a JSP page is called to enter the query. This JSP page uses ISO encoding although UTF-8 is defined inside it. I can't understand why the browser automatically uses ISO encoding although I force it to use UTF-8. Here is how I do it:

<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>

-- View this message in context: http://old.nabble.com/URIEncoding-tp32989250p32997524.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
Re: URIEncoding
starz10de wrote:
> I tried your suggestion: http://localhost:8080//search/main?query=böse
> (I made sure that the word occurs in some documents), but the search returns null.

Remember: I have no idea what your application is, and what it is supposed to do or not. Only you know that. With the limited information you have provided so far, I can say this:

> Here is what I saw in the browser link bar after clicking on submit:
> http://localhost:8080//search/main?query=b%F6se

The %F6 in this URL shows that the browser is sending the query-string as ISO-8859-1. (The single code %F6 is the URL-encoded version of the byte \xF6, which is ö in the ISO-8859-1 encoding.)

If the browser were sending the query-string as UTF-8, then you would see

http://localhost:8080//search/main?query=b%C3%B6se

because the character ö, in UTF-8-encoded Unicode, is represented by the two bytes \xC3\xB6.

Remember: you must *first* look at what the browser thinks about the page which contains the search link. When that page is displayed in your browser, right-click on the page (or use the File menu), ask for the page information, and look at the character set or encoding (Kodierung?).

If this page is seen as ISO-8859-1, then when you click on the search link in the page, the browser will send the request to Tomcat as ISO-8859-1. If this page is seen as UTF-8, then the browser will send the request to the server as UTF-8. If the server receives the request in one encoding but is expecting another, then it will not understand the request properly, and (in your application) will probably not find anything.

In other words:
- if you tell the server URIEncoding=UTF-8 but the requests come in as ISO-8859-1, it will not work.
- if you tell the server URIEncoding=ISO-8859-1 but the requests come in as UTF-8, it will not work.
- if both match, it should work.
- if you tell the server nothing, it will default to ISO-8859-1.

The above all assumes that the requests are really sent as GET, thus with all parameters in the URI query-string. (That is what your original form tag seemed to indicate; but as you typed it or pasted it, your form tag looked incorrect.)

> I also changed the browser encoding to ISO, but I couldn't retrieve anything. Any ideas?

To contribute any more ideas, I need more information.
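The %F6 versus %C3%B6 difference described above can be reproduced outside the browser. The following is only a minimal sketch using the standard `java.net.URLEncoder` and `java.net.URLDecoder` classes (the `Charset` overloads need Java 10 or later); the class and method names are hypothetical, and it imitates the browser and the connector rather than showing how Tomcat itself decodes a URI.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: class and method names are hypothetical.
final class QueryStringEncodingDemo {

    // Imitates a browser percent-encoding a form value in the page's charset.
    static String encode(String value, Charset pageCharset) {
        return URLEncoder.encode(value, pageCharset);
    }

    // Imitates a server decoding the query string in its configured charset.
    static String decode(String encoded, Charset serverCharset) {
        return URLDecoder.decode(encoded, serverCharset);
    }

    public static void main(String[] args) {
        // The search word from the thread, with the o-umlaut written as an
        // escape so that the source file encoding does not matter.
        String word = "b\u00f6se";

        String asLatin1 = encode(word, StandardCharsets.ISO_8859_1);
        String asUtf8 = encode(word, StandardCharsets.UTF_8);
        System.out.println(asLatin1); // b%F6se
        System.out.println(asUtf8);   // b%C3%B6se

        // Matching charsets on both sides round-trip cleanly.
        System.out.println(word.equals(decode(asLatin1, StandardCharsets.ISO_8859_1))); // true
        System.out.println(word.equals(decode(asUtf8, StandardCharsets.UTF_8)));        // true

        // Mismatched charsets do not give back the original word.
        System.out.println(word.equals(decode(asLatin1, StandardCharsets.UTF_8)));      // false
        System.out.println(word.equals(decode(asUtf8, StandardCharsets.ISO_8859_1)));   // false
    }
}
```

The two `false` lines correspond to the "it will not work" cases in the list above: the servlet receives a string that no longer matches the indexed word, so the search finds nothing.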
Re: URIEncoding
starz10de wrote:
> One of my attempts to solve the problem was to use UTF-8 in my HTML page as well as in my backend. It didn't work, because the browser automatically changes to ISO encoding. Today I checked the browser encoding before submitting the query and saw that it uses ISO, although my HTML page declares UTF-8. I changed it to UTF-8 manually, submitted the query, and it works fine.

Good. Now you are providing some real information.

> I have a login page (it has UTF-8); I checked this manually as well. After a successful login, a JSP page is called to enter the query. This JSP page uses ISO encoding although UTF-8 is defined inside it. I can't understand why the browser automatically uses ISO encoding although I force it to use UTF-8.

Neither do I, but let's find out.

> Here is how I do it:
> <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>

That looks correct. Which browser is it? Did you get one of these plugins that I recommended?
Re: URIEncoding
Thanks a lot for your answer. I already did what you suggested:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

but unfortunately the problem is the same. As I said, when the default in server.xml is ISO-8859-1, everything is fine. I am dealing with English and German. The problem is the umlaut: when I submit it to the servlet, it is not recognized. Here is where I submit my request:

<form action=http://localhost:8080//Search/main?name method=get TARGET=result>

Any more hints?

-- View this message in context: http://old.nabble.com/URIEncoding-tp32989250p32991998.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
Re: URIEncoding
starz10de wrote:
> Thanks a lot for your answer. I already did what you suggested:
> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

That's good.

> but unfortunately the problem is the same. As I said, when the default in server.xml is ISO-8859-1, everything is fine.

Can you show *exactly* what you are doing in server.xml? (Paste the relevant portion here; remove comments and passwords.)

> I am dealing with English and German. The problem is the umlaut: when I submit it to the servlet, it is not recognized. Here is where I submit my request:
> <form action=http://localhost:8080//Search/main?name method=get TARGET=result>

I do not see anything in the above that submits anything with an umlaut. This is a GET request, so anything submitted would have to be in the URL, as a query-string. I only see name here. The quotes appear wrong too. There is also a double // after 8080, where it should not be. Are you sure it is not simply the action of your form which is wrong?

What are the input fields being submitted in your form, and what value do you put in it/them?

Try the following, directly in your browser's URL bar:

http://localhost:8080/Search/main?name=böse zeichen

(note the ö-umlaut). What does that do?

> Any more hints?
Re: URIEncoding
awarnier wrote:
> Can you show *exactly* what you are doing in server.xml? (Paste the relevant portion here; remove comments and passwords.)

I can't modify anything in server.xml; as I mentioned before, many other instances are running there. If I could modify it, I would just remove the UTF-8 setting; with the default value I have no problem.

> I do not see anything in the above that submits anything with an umlaut. This is a GET request, so anything submitted would have to be in the URL, as a query-string. I only see name here. The quotes appear wrong too. There is also a double // after 8080, where it should not be. Are you sure it is not simply the action of your form which is wrong? What are the input fields being submitted in your form, and what value do you put in it/them?

It works fine and there is nothing wrong with it; at least I can submit the query to the servlet and get the result back. Here is how I send the request:

<form action=http://localhost:8080//search/main?name method=get TARGET=Welcome>
<input maxlength=4000 size=40 name=query>
<input type=submit value= Search style=background-color:#F8F8FF; color: black;>

> Try the following, directly in your browser's URL bar: http://localhost:8080/Search/main?name=böse zeichen (note the ö-umlaut). What does that do?

It doesn't work, because when I submit the query I see just the main menu JSP in the link bar.

Thanks for your response. Is there any problem in what I mentioned?

-- View this message in context: http://old.nabble.com/URIEncoding-tp32989250p32994524.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
Re: URIEncoding
I tried your suggestion:

http://localhost:8080//search/main?query=böse

(I made sure that the word occurs in some documents), but the search returns null. Here is what I saw in the browser link bar after clicking on submit:

http://localhost:8080//search/main?query=b%F6se

I also changed the browser encoding to ISO, but I couldn't retrieve anything. Any ideas?

-- View this message in context: http://old.nabble.com/URIEncoding-tp32989250p32994793.html
Sent from the Tomcat - User mailing list archive at Nabble.com.
Re: URIEncoding
starz10de wrote:
> I have an application which runs on my local machine, and it works perfectly. I installed the application on the server to make it available for all. On the server we have Tomcat running, providing services for many instances. After I deployed my application on the server, I had a problem with queries that contain language-specific characters. After a long time, I found where the problem is: in server.xml, URIEncoding is set to UTF-8. As a test, I removed this line or set it to ISO-8859-1, and all was perfect.
>
> My question: is it possible to set the URIEncoding for each instance, or is it possible to set it somewhere else?
>
> I send the query from a JSP page to the servlet. In my JSP page the charset is ISO-8859-1. I tried to make everything UTF-8, but I couldn't succeed. I tried the filter approach, but it doesn't help either:
>
> <filter>
>   <filter-name>Set Character Encoding</filter-name>
>   <filter-class>servlet.CharsetFilter</filter-class>
>   <init-param>
>     <param-name>encoding</param-name>
>     <param-value>ISO-8859-1</param-value>
>   </init-param>
> </filter>
> <!-- Define filter mappings for the defined filters -->
> <filter-mapping>
>   <filter-name>Set Character Encoding</filter-name>
>   <servlet-name>action</servlet-name>
> </filter-mapping>
>
> Any hint will be appreciated.

Hi.

1) By default, under HTTP (and HTML), the character set is ISO-8859-1. So, if you do not specify anything anywhere to say something else, everything should be understood and processed as ISO-8859-1. (**)

2) When a browser submits the contents of a form to a server, it will /generally/ use the same character set as the one which /it thinks/ is the character set of the *current* page (the one which is currently shown on the screen == the one which contains the link or button which will send data to the server). So what you need to do is to look in the browser, in the Page info or similar, which character set the browser believes is in effect for the current page. (*)

3) Normally also, this character set will be the one which, in the page source, is indicated by the following tag:

<meta http-equiv="content-type" content="text/html; charset=X" />

(it is the X above). So make sure that all the pages that you send to the browser contain such a tag, with the correct character set.

4) Thus, if your pages are UTF-8, then any link in the page which calls the server is going to send all values to the server in the UTF-8 character set. That includes the query-string part of URLs, and also the POST parameters which may be sent. If that is the case, you need to tell the server that it is so, because that is /not/ the default for HTTP. So that is when you should use the URIEncoding parameter: if your forms are sending requests to the server containing a query-string.

5) If your forms are sending values by means of POST requests, then the situation gets more complicated if you use a character set other than ISO-8859-1. But let's leave that for the next time.

A question maybe, for later: what is/are the (human) language(s) that are used on your pages?

(*) I also /strongly/ advise, for issues of that nature, that you get a browser plug-in such as HttpFox or similar (for Firefox) or Fiddler2 (for Internet Explorer), to be able to check exactly what is being sent from the browser to the server and vice-versa.

(**) Unfortunately, in Java the internal representation for characters and strings is Unicode, which can lead to mixups if you are not careful. Or, let me turn this around: it is much better to use Unicode as a character set than any other alphabet. But unfortunately, in the WWW, for historical reasons, the default is still ISO-8859-1, which creates many problems when one tries to deal with non-English languages.
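The footnote about Java's internal Unicode representation is easier to see with bytes in hand. The sketch below is an illustration under assumptions, not something taken from the thread: the class and method names are hypothetical, and the "repair" step is a commonly used workaround for text that was decoded as ISO-8859-1 although the bytes on the wire were UTF-8. It only works because ISO-8859-1 maps every byte value to exactly one character.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: class and method names are hypothetical.
final class CharsetMixupDemo {

    // Re-interprets text that was wrongly decoded as ISO-8859-1 although
    // the original bytes were UTF-8.
    static String repairLatin1AsUtf8(String wronglyDecoded) {
        byte[] originalBytes = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "b\u00f6se"; // four characters inside the JVM

        // The same Java string becomes a different byte sequence per charset.
        System.out.println(word.getBytes(StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(word.getBytes(StandardCharsets.UTF_8).length);      // 5

        // Decoding the UTF-8 bytes as ISO-8859-1 yields five characters, not four.
        String garbled = new String(word.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());                          // 5
        System.out.println(word.equals(repairLatin1AsUtf8(garbled)));  // true
    }
}
```

A repair like this treats the symptom only; the cleaner fix remains making the page charset and the server-side charset agree, as points 2) to 4) describe.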
Re: URIEncoding
André,

On 12/16/11 3:37 PM, André Warnier wrote:
> 4) Thus, if your pages are UTF-8, then any link in the page which calls the server, is going to send all values to the server in the UTF-8 character set.

I'm not so sure about that. Firefox has a setting for sending URLs in UTF-8, and I suspect that that will override any in-page setting. Curiously, there appears to be a separate setting for encoding of the *query string* in UTF-8, and the default is false, which I suspect results in the behavior you have outlined above.

Basically, you should always test everything :)

-chris
Re: URIEncoding problem ver 6.0.29 64 bit on windows
Sorry, it was my mistake.

On the 64-bit version the webapp name was MyWebApp.
On the 32-bit version the webapp name was mywebapp.

On Fri, Dec 10, 2010 at 13:54, imrezol imre...@gmail.com wrote:
> Hi!
>
> I set URIEncoding=UTF-8 on the HTTP connector in server.xml:
>
> <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" redirectPort="8443" URIEncoding="UTF-8"/>
>
> I restarted the Tomcat service and after that typed a URL with special characters, like this:
>
> http://localhost:8080/mywebapp/docs/Felhasználói_kézikönyv.pdf
> <http://localhost:8080/mywebapp/docs/Felhaszn%C3%A1l%C3%B3i_k%C3%A9zik%C3%B6nyv.pdf>
>
> and I get a 404 error message:
>
> description: The requested resource (/mywebapp/docs/Felhaszn%C3%A1l%C3%B3i_k%C3%A9zik%C3%B6nyv.pdf) is not available.
>
> If I use the same settings with the 6.0.29 32-bit version, it works fine.
>
> Thanks,
> Zoltan Imre
Re: URIEncoding problem ver 6.0.29 64 bit on windows
imrezol wrote:
> Hi!
>
> I set URIEncoding=UTF-8 on the HTTP connector in server.xml:
>
> <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="2" redirectPort="8443" URIEncoding="UTF-8"/>
>
> I restarted the Tomcat service and after that typed a URL with special characters, like this:
>
> http://localhost:8080/mywebapp/docs/Felhasználói_kézikönyv.pdf
> <http://localhost:8080/mywebapp/docs/Felhaszn%C3%A1l%C3%B3i_k%C3%A9zik%C3%B6nyv.pdf>
>
> and I get a 404 error message:
>
> description: The requested resource (/mywebapp/docs/Felhaszn%C3%A1l%C3%B3i_k%C3%A9zik%C3%B6nyv.pdf) is not available.
>
> If I use the same settings with the 6.0.29 32-bit version, it works fine.

Hi.

That's an interesting question. Before we tackle this, can you answer this question, precisely: when you mention trying this with a 32-bit and a 64-bit version "of Tomcat", do you mean that you tried this on the exact same machine, with the same version of Tomcat, and with the exact same .pdf file under /mywebapp?

(I am quoting "of Tomcat" above because Tomcat itself would be the same, being Java code. It is the JVM which is different, which may already give a hint.)

I consider the question interesting because, personally, I have always found this URIEncoding attribute rather questionable, in the HTTP and URI RFC sense. As I understand the specs, this attribute should not exist. An HTTP server should take the URI as it comes, decode it as per the URI encoding/decoding scheme, and then just use the *byte* result 'as is' to locate the resource on disk. It should not second-guess the client's intent.
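The "decode it and use the *byte* result as is" idea can be sketched in a few lines. This is a hypothetical helper written for illustration, not Tomcat's implementation: percent-decoding by itself only yields bytes, and a charset comes into play at the moment those bytes have to be turned into a Java string or a file name.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: class and method names are hypothetical.
final class PercentDecodeDemo {

    // Turns %XX escapes into raw bytes without assuming any charset.
    // Unescaped characters are expected to be plain ASCII.
    static byte[] percentDecodeToBytes(String uriPath) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < uriPath.length()) {
            char c = uriPath.charAt(i);
            if (c == '%') {
                if (i + 2 >= uriPath.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(uriPath.charAt(i + 1), 16);
                int lo = Character.digit(uriPath.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else if (c > 127) {
                throw new IllegalArgumentException("Unescaped non-ASCII character at index " + i);
            } else {
                out.write(c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String requested = "/mywebapp/docs/Felhaszn%C3%A1l%C3%B3i_k%C3%A9zik%C3%B6nyv.pdf";
        byte[] raw = percentDecodeToBytes(requested);

        // Only at this point does a charset matter: the same bytes give
        // different strings depending on how they are interpreted.
        String asUtf8 = new String(raw, StandardCharsets.UTF_8);
        String asLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        String expected = "/mywebapp/docs/Felhaszn\u00e1l\u00f3i_k\u00e9zik\u00f6nyv.pdf";
        System.out.println(expected.equals(asUtf8));   // true
        System.out.println(expected.equals(asLatin1)); // false
    }
}
```

Because Java works with strings rather than raw bytes when it resolves a path, a server written in Java has to pick some charset for that last step, which is presumably why an attribute such as URIEncoding exists at all.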
Re: URIEncoding UTF-16 problem
André,

André Warnier wrote:
> The OP is talking about UTF-16, not UTF-8.

I understand. I was trying to contrast UTF-8 and UTF-16, apparently unsuccessfully.

> What you are saying above about ASCII/UTF-8 is true, if one restricts oneself to strictly the 7-bit US-ASCII.

ASCII is not US-specific, and everything above 127 has pretty much always been non-standard. So, yes, I was talking about 7-bit ASCII.

> That's OK for English, but not OK for mostly any other language on this planet.

It handles English, German, and Latin languages. It does not handle many others.

Again, I was trying to point out that if you set your server to UTF-8 and the client is expecting ASCII, then there is no problem (and I believe this is a common case). Most clients these days use UTF-8 by default, so you're safe that way, too. Nobody really uses UTF-16 on clients by default, so setting your URI encoding to UTF-16 basically means that nobody will ever be able to successfully contact your server unless they know beforehand that UTF-16 should be used.

> The default charset on the Web is iso-8859-1 (latin-1), not US-ASCII.

The first 128 characters (code points 0 to 127) of (US-)ASCII, UTF-8, and ISO-8859-1 are identical. Again, you're covered.

> Now about the first request bit: not on the first request, nor on any subsequent request, unless the server finds a way to tell the application that it only accepts requests with URIs encoded as UTF-16, and the browser not only understands the instruction, but obeys it.

Most clients will use the content encoding of the previous response for the URI of the next request. At least, that has been my experience.

> So, back to the original question: why set the connector to UTF-16 URI encoding? That will almost guarantee that Tomcat will not properly understand any URL requested by a standard browser.

Exactly my point: why use UTF-16 when you can use UTF-8, get all the benefits of oodles of characters, /and/ conform to the expectations of nearly every client out on the Internet?

-chris

- To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: URIEncoding UTF-16 problem
Nayyer Kamran wrote:
> Hello, I am facing a problem in accessing deployed modules once I configured the connector's URIEncoding to UTF-16 in server.xml.

Hi. Could you tell us *why* exactly you set this attribute? It is rather unusual, as it supposes that you expect all clients to encode their requested URIs in UTF-16 prior to sending the request to Tomcat on that connector. To my knowledge, no standard client (browser) will ever do so.

Also, do you really know what it means? UTF-16 is a Unicode encoding where each character occupies 2 bytes (16 bits), or 4 bytes for characters outside the Basic Multilingual Plane. For most of the Western and Eastern European alphabetic characters, this results in a byte 0 followed by a non-zero byte. That probably explains why Tomcat is not recognising any of the URLs that you try to access, and giving 404 errors all the time. It's just that the URI, as Tomcat sees it, never matches any of your webapps.

Setting the URI encoding differently from the default normally supposes that the two sides (client and server) agree on an alternative encoding for the URIs. You cannot just do it on one side and not on the other.

André
Re: URIEncoding UTF-16 problem
André,

André Warnier wrote:
> Could you tell us *why* exactly you [are trying to use UTF-16]? It is rather unusual, as it supposes that you expect all clients to encode their requested URI's in UTF-16 prior to sending the request to Tomcat on that connector. To my knowledge, no standard client (browser) will ever do so.

...at least not on the first request. The beauty of using an encoding like UTF-8 is that ASCII is a strict subset: any plain-old ASCII request can be interpreted as a UTF-8 request, which means that if you want to use UTF-8 on your site, but your visitors come in using ASCII, there's no problem (unless they have weird characters in their first request, which is rare).

-chris
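The "strict subset" point can be checked with a few lines of Java. This is only a sketch; the class name AsciiSubsetDemo and the sample strings are assumptions made for the example. It compares the bytes that the same text produces in US-ASCII, ISO-8859-1 and UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
final class AsciiSubsetDemo {

    /** True when the text yields identical bytes in US-ASCII, ISO-8859-1 and UTF-8. */
    static boolean sameBytesInAllThree(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, latin1) && Arrays.equals(latin1, utf8);
    }

    public static void main(String[] args) {
        // A pure 7-bit ASCII path: all three encodings agree byte for byte.
        System.out.println(sameBytesInAllThree("/mywebapp/docs/index.html"));
        // A path with accented letters: the encodings diverge.
        System.out.println(sameBytesInAllThree("/mywebapp/docs/k\u00e9zik\u00f6nyv.pdf"));
    }
}
```

As long as a request stays within 7-bit ASCII, the server's choice among these three charsets makes no difference; the disagreement only starts with characters above 127.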
Re: URIEncoding UTF-16 problem
Christopher Schultz wrote:
> André Warnier wrote:
>> Could you tell us *why* exactly you [are trying to use UTF-16]? It is rather unusual, as it supposes that you expect all clients to encode their requested URI's in UTF-16 prior to sending the request to Tomcat on that connector. To my knowledge, no standard client (browser) will ever do so.
>
> ...at least not on the first request. The beauty of using an encoding like UTF-8 is that ASCII is a strict subset: any plain-old ASCII request can be interpreted as a UTF-8 request, which means that if you want to use UTF-8 on your site, but your visitors come in using ASCII, there's no problem (unless they have weird characters in their first request, which is rare).

The OP is talking about UTF-16, not UTF-8.

What you are saying above about ASCII/UTF-8 is true, if one restricts oneself to strictly the 7-bit US-ASCII. That's OK for English, but not OK for almost any other language on this planet. The default charset on the Web is iso-8859-1 (latin-1), not US-ASCII. Any character of iso-8859-1 whose codepoint is above 127 decimal does not encode as a single byte in UTF-8. My own name, expressed in the Unicode alphabet and encoded in UTF-8, occupies 6 bytes, not 7. Encoded as UTF-16, it occupies 12 bytes, half of which have a hex value of 00.

Now about the "first request" bit: not on the first request, nor on any subsequent request, unless the server finds a way to tell the application that it only accepts requests with URIs encoded as UTF-16, and the browser not only understands the instruction, but obeys it. If there is an accepted and supported way to do that, I'd be glad to hear it, as it would solve a lot of practical web internationali(z/s)ation problems.

So, back to the original question: why set the connector to UTF-16 URI encoding? That will almost guarantee that Tomcat will not properly understand any URL requested by a standard browser.
André
Re: URIEncoding UTF-16 problem
Replying to myself:

André Warnier wrote:
> My own name, expressed in the Unicode alphabet and encoded in UTF-8, occupies 6 bytes, not 7.

I meant 6 bytes, not 5, of course. It rather weakens my argument when I mix up my own byte counts...
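For anyone who wants to check such byte counts rather than trust arithmetic done in an e-mail, a small Java sketch follows. It assumes the name in question is the five-character first name "André"; the class name ByteCountDemo is made up. Note that Java's plain "UTF-16" charset writes a 2-byte byte-order mark when encoding, which is why it reports two bytes more than UTF-16BE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class ByteCountDemo {

    /** Number of bytes the text occupies in the given charset. */
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String name = "Andr\u00e9"; // assumption: the first name only, 5 characters
        System.out.println("ISO-8859-1: " + byteLength(name, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + byteLength(name, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE:   " + byteLength(name, StandardCharsets.UTF_16BE));
        System.out.println("UTF-16+BOM: " + byteLength(name, StandardCharsets.UTF_16));
    }
}
```

Under that assumption the counts are 5, 6, 10 and 12 respectively; in the UTF-16BE form every other byte is 00, since all five characters are below U+0100.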
Re: URIEncoding
Thanks a lot, Pulkit, it works just fine. But my aim is to make portability easier: what happens if the URIEncoding of the Connector is not UTF-8 or ISO-8859-1, but a different character encoding? Your pseudo-code won't work anymore :( Is there a way to get the value of the Connector's URIEncoding parameter, so that your code will work whatever the character encoding of the Connector is?

Pulkit Singhal wrote:
> How about:
>
>   String queryString = request.getParameter("query");
>   queryString = new String(queryString.getBytes("iso-8859-1"), "UTF-8");
>
> It's not very graceful, so you can even make a one-line method for doing this and have:
>
>   String decodeURIParams(String a, String b, String c) { return new String(request.getParameter(a).getBytes(b), c); }
>   String queryString = decodeURIParams("query", URI_ENCODING_CONST, URI_DECODING_CONST);
>
> This is all pseudo-code but I hope you see what I mean.

-- 
Frederic Bastian, PhD student
Department of Ecology and Evolution
Biophore, University of Lausanne, 1015 Lausanne, Switzerland.
tel: +41 21 692 4221
http://www.unil.ch/dee/page22707.html
Swiss Institute of Bioinformatics http://www.isb-sib.ch/
Re: URIEncoding
Frederic,

Frederic Bastian wrote:
> The point is that I need to use the java.net.URLEncoder.encode() method, e.g. java.net.URLEncoder.encode(myParam, "UTF-8").

You ought to be using the response's character encoding, not whatever Tomcat (or the browser) is using for URIEncoding. You want to do this:

  java.net.URLEncoder.encode(myParam, request.getCharacterEncoding());

Or, you could do what everybody else in the world does and use a tag library or some other tool to emit URLs including parameters, etc.

-chris
Re: URIEncoding
Christopher Schultz wrote:
> You want to do this: java.net.URLEncoder.encode(myParam, request.getCharacterEncoding());

This does not work :) request.getCharacterEncoding() is different from the Connector URIEncoding. The request character encoding determines in which character encoding the parameter values will be returned to you. But it doesn't determine in which character encoding the URI has to be read.

> Or, you could do what everybody else in the world does and use a tag library or some other tool to emit URLs including parameters, etc.

What's the problem with URLEncoder? I don't get you :)

> Aah, I get it. I don't believe this is possible. I'd love to hear from a Tomcat developer, though, just to be safe

That would be fine :)
Re: URIEncoding
Frederic,

Frederic Bastian wrote:
> Christopher Schultz wrote:
>> You want to do this: java.net.URLEncoder.encode(myParam, request.getCharacterEncoding());
>
> This does not work :) request.getCharacterEncoding() is different from the Connector URIEncoding. The request character encoding determines in which character encoding the parameter values will be returned to you.

My mistake. I meant response.getCharacterEncoding().

> But it doesn't determine in which character encoding the URI has to be read.

But you aren't reading a URI. You're writing one. I'm assuming that you want to encode a URI for output into a web page. The web page ought to be written using the response's encoding, not the URIEncoding.

> What's the problem with URLEncoder? I don't get you :)

Nothing. All the things I mentioned use it at the heart (or should). They just take out the guesswork of which encoding you should be using, and when to apply it.

-chris
Re: URIEncoding
Caldarale, Charles R wrote:
> Once again, it's available via the MBean that Tomcat creates for each Connector element.

I'm sorry, I must have missed your reply. Could you tell me a bit more about how an MBean can solve my problem? I have never used one.
RE: URIEncoding
> From: Frederic Bastian
> Subject: Re: URIEncoding
>
> how to know the Connector URIEncoding value, inside your application ? :)

Once again, it's available via the MBean that Tomcat creates for each Connector element.

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
Re: URIEncoding
Christopher Schultz wrote:
> I'd agree that reading a URI is different, but not writing one. Where are you writing your URI? Into the response, I'm guessing. In fact, I'm guessing you're writing it into the response /body/, which ought to be encoded using the response's declared Content-Type (in the HTTP header). The encoding used for reading the URI from the request is irrelevant, here.

I disagree. Imagine that you want to write into the response /body/ a link to a Google search, where the search parameter is the special character &.

Example: http://www.google.com/search?q=%26
(so the correct way to write it in your response /body/ is: <a href="http://www.google.com/search?q=%26">your search</a>)

If you just write the link without URL encoding or HTML entity encoding, the link will be wrong:
  <a href="http://www.google.com/search?q=&">your search</a>

If you write the link with HTML entity encoding, the link will be wrong:
  <a href="http://www.google.com/search?q=&amp;">your search</a>

So you have to URL-encode your parameter to write it into the response /body/:
  <a href="http://www.google.com/search?q=%26">your search</a>

So, to generate and write into the response body links that include user input in the parameters, you have to URL-encode the parameters; it is an absolute necessity! And that's the point: you can URL-encode them in different character encodings. And if they are links to your own Tomcat server, you need to URL-encode the parameters in the same character encoding as the Connector URIEncoding.

> What makes you think that the Connector has the right answer in the first place?

Because it is the Connector that will read your URI ;) And so, when URL-encoding links to your server, the character encoding has to be the same.

>> You will see that the server does not interpret correctly the parameters, because the Connector URIEncoding is still set to ISO-8859-1.
>
> If you are setting the URIEncoding of the Connector to UTF-8 and it's not interpreting it as UTF-8, then Tomcat has a bug

I wrote: "Connector URIEncoding is still set to ISO-8859-1" ;) ISO-8859-1 is the default value of the Connector URIEncoding.

Anyway, if we disagree, let's just get back to the point: how to know the Connector URIEncoding value, inside your application? :)
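A short Java sketch of the point about URL-encoding in different character encodings. The wrapper class UrlEncodeDemo and its encode method are hypothetical, added only so the checked exception does not clutter the example; java.net.URLEncoder.encode(String, String) is the standard call the thread is talking about.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical wrapper, for illustration only.
final class UrlEncodeDemo {

    /** URL-encodes a parameter value using the named charset. */
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // ASCII specials encode the same way whatever the charset.
        System.out.println(encode("&", "UTF-8"));
        System.out.println(encode("&", "ISO-8859-1"));
        // A non-ASCII character does not: the escapes depend on the charset.
        System.out.println(encode("\u00e9", "UTF-8"));
        System.out.println(encode("\u00e9", "ISO-8859-1"));
    }
}
```

The ampersand becomes %26 in both cases, but "é" becomes %C3%A9 in UTF-8 and %E9 in ISO-8859-1, so whoever decodes the link has to use the same charset as whoever encoded it.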
Re: URIEncoding
Frederic,

Frederic Bastian wrote:
> I'm sorry but I think you don't get it :) Reading and writing URI is totally different from writing the response output.

I'd agree that reading a URI is different, but not writing one. Where are you writing your URI? Into the response, I'm guessing. In fact, I'm guessing you're writing it into the response /body/, which ought to be encoded using the response's declared Content-Type (in the HTTP header). The encoding used for reading the URI from the request is irrelevant, here.

> For instance, you can set the response character encoding to UTF-8 in order to display your html in UTF-8, and set the Connector URIEncoding to ISO-8859-1 to read URI in ISO-8859-1 (and so, you have to encode your URI in ISO-8859-1).

Yes, except that most browsers will use the encoding of the previous response to encode the URI (unless you have "use UTF-8 URLs" turned on in the options; most browsers have this feature, and I think it's turned on by default these days).

> For instance, if you want to make a redirection, you just send a redirection header, there is no response output writing, so no matter which character encoding your web pages are displayed in.

Now we're getting somewhere. You didn't mention that you were talking about a redirection URI, which will go into a header. The interesting part now is that HTTP headers do not have a declared character encoding. Most browsers use UTF-8 for URI encoding, but the headers use ASCII from what I can tell from the spec.

So... how do you decide which character encoding to use for the URI? You have to guess. It's stupid, but true. The browser will not tell you the encoding it uses. Forcing your Connector to use ISO-8859-1 or UTF-8 is just a guess, too. Using your own code to override the default for the Connector is just adding confusion to a process already fraught with problems. What makes you think that the Connector has the right answer in the first place?

> The point is that the character encoding of the Connector URIEncoding, and the character encoding of the URLEncoder method, have to be consistent.

I believe this to be true only under the following conditions:

1. You are writing a URI to be used in an HTTP header.
2. The URIEncoding used by your Connector was correct in the first place.

The only way to tell if the encoding was right in the first place is to encode parameters whose values you /know/ and then check them on the other end to see if the browser really was using UTF-8 or ISO-8859-1 (or whatever).

> Make the try: set the response character encoding to UTF-8, set the URLEncoder character encoding to UTF-8, generate a web page including links with encoded parameters with special chars, and follow these links. You will see that the server does not interpret correctly the parameters, because the Connector URIEncoding is still set to ISO-8859-1.

If you are setting the URIEncoding of the Connector to UTF-8 and it's not interpreting it as UTF-8, then Tomcat has a bug. Since you are the only one experiencing this phenomenon, I'm guessing it's not a bug. If you have everything set to UTF-8 (as I do in my production apps), you should not have this problem.

> So, for portability purpose, I'd like to make the character encoding of the Connector and of the URLEncoder consistent, without modifying the server.xml file. But it looks pretty impossible :p

I disagree that the Connector knows any better than you do about how to encode outgoing URLs. The browser is going to do whatever the heck it wants, and it's not going to tell you what it did. You just have to guess.

-chris
Re: URIEncoding
Christopher Schultz wrote:
> Frederic Bastian wrote:
>> Christopher Schultz wrote:
>>> You want to do this: java.net.URLEncoder.encode(myParam, request.getCharacterEncoding());
>>
>> This does not work :) request.getCharacterEncoding() is different from the Connector URIEncoding. The request character encoding determines in which character encoding the parameter values will be returned to you.
>
> My mistake. I meant response.getCharacterEncoding().
>
>> But it doesn't determine in which character encoding the URI has to be read.
>
> But you aren't reading a URI. You're writing one. I'm assuming that you want to encode a URI for output into a web page. The web page ought to be written using the response's encoding, not the URIEncoding.

I'm sorry but I think you don't get it :) Reading and writing a URI is totally different from writing the response output.

For instance, you can set the response character encoding to UTF-8 in order to display your HTML in UTF-8, and set the Connector URIEncoding to ISO-8859-1 to read URIs in ISO-8859-1 (and so, you have to encode your URIs in ISO-8859-1).

For instance, if you want to make a redirection, you just send a redirection header; there is no response output writing, so it does not matter which character encoding your web pages are displayed in.

The point is that the character encoding of the Connector URIEncoding, and the character encoding of the URLEncoder method, have to be consistent.

Try it: set the response character encoding to UTF-8, set the URLEncoder character encoding to UTF-8, generate a web page including links with encoded parameters containing special characters, and follow these links. You will see that the server does not interpret the parameters correctly, because the Connector URIEncoding is still set to ISO-8859-1.

So, for portability purposes, I'd like to make the character encoding of the Connector and of the URLEncoder consistent, without modifying the server.xml file. But it looks pretty impossible :p

>> What's the problem with URLEncoder? I don't get you :)
>
> Nothing. All the things I mentioned use it at the heart (or should). They just take out the guesswork of which encoding you should be using, and when to apply it.
>
> -chris
Re: URIEncoding
Chuck,

Caldarale, Charles R wrote:
>> From: Frederic Bastian
>> Subject: Re: URIEncoding
>>
>> Is there a way to get the value of the param URIEncoding of the Connector, so your code will work, whatever the char encoding of the Connector is ?
>
> I'm confused. If the Connector already has the proper URIEncoding value, why do you think the application needs to reprocess the URI with the same encoding?

He's trying to beat the Connector into using an application-defined URIEncoding without having to modify server.xml to set up the connector properly.

Frederic, why are you trying to do this? Are you deploying an application on a Tomcat over which you have no control?

-chris
Re: URIEncoding
Frederic,

Frederic Bastian wrote:
> The matter is that Tomcat won't get the correct values of the parameters in the URL. For instance, if my URI looks like http://host/?query=%C3%A9%C3%A8, the URI encoding is UTF-8. By default, Tomcat will read this URL in ISO-8859-1.

Yes, but you said that you changed the Connector to use UTF-8. Is it not working? Or, are you looking for an alternative so that you don't /have to/ set the Connector's URIEncoding?

> If I add into server.xml the attribute URIEncoding="UTF-8" to the Connector, Tomcat will correctly read the query parameter. I would like Tomcat to read correctly URL in UTF-8, but without modifying server.xml.

Aah, I get it. I don't believe this is possible. I'd love to hear from a Tomcat developer, though, just to be safe.

-chris
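The mismatch described here can be reproduced outside Tomcat with java.net.URLDecoder. This is only a sketch of the effect, not of Tomcat's internal parameter parsing; the class name UrlDecodeDemo and the wrapper method are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical wrapper, for illustration only.
final class UrlDecodeDemo {

    /** Decodes a percent-encoded value, interpreting the bytes in the named charset. */
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String query = "%C3%A9%C3%A8"; // the value from http://host/?query=%C3%A9%C3%A8
        String asUtf8 = decode(query, "UTF-8");
        String asLatin1 = decode(query, "ISO-8859-1");
        System.out.println(asUtf8 + " (" + asUtf8.length() + " chars)");
        System.out.println(asLatin1 + " (" + asLatin1.length() + " chars)");
    }
}
```

Read as UTF-8, the four bytes give the two characters the client meant; read as ISO-8859-1, each byte becomes its own character and the parameter arrives as four characters of mojibake.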
RE: URIEncoding
> From: Frederic Bastian
> Subject: Re: URIEncoding
>
> The point is that I need to use the java.net.URLEncoder.encode() method, e.g. java.net.URLEncoder.encode(myParam, "UTF-8").

O.k., so now it appears you need to know the encoding in order to do it properly on the output side, whereas all the examples being tossed around earlier in this thread were concerned with the request URI, not a generated one. That explains quite a bit.

> I would like to use URLEncoder.encode() method with the character encoding UTF-8 (W3C recommendations). So, I MUST modify the Connector URIEncoding parameter, but I don't want to, to improve portability.

It's available via the MBeans created for each Connector, so you could get at it with JMX.

- Chuck
Re: URIEncoding
Caldarale, Charles R wrote:
>> From: Frederic Bastian
>> Subject: Re: URIEncoding
>>
>> Is there a way to get the value of the param URIEncoding of the Connector, so your code will work, whatever the char encoding of the Connector is ?
>
> I'm confused. If the Connector already has the proper URIEncoding value, why do you think the application needs to reprocess the URI with the same encoding?

The point is that I need to use the java.net.URLEncoder.encode() method, e.g. java.net.URLEncoder.encode(myParam, "UTF-8"). Using a different character encoding than the Connector URIEncoding leads to problems; for instance, if the Connector URIEncoding is set to ISO-8859-1 (the default value) and the URLEncoder.encode() method is given UTF-8, the parameters are misread (one obvious solution is to modify the URIEncoding).

I would like to use the URLEncoder.encode() method with the character encoding UTF-8 (W3C recommendations). So, I MUST modify the Connector URIEncoding parameter, but I don't want to, to improve portability.

So I would like to manage this problem in the application itself rather than in server.xml, for portability purposes. The only solution I see is to find a way to get the value of the URIEncoding parameter.
RE: URIEncoding
> From: Frederic Bastian
> Subject: Re: URIEncoding
>
> Is there a way to get the value of the param URIEncoding of the Connector, so your code will work, whatever the char encoding of the Connector is ?

I'm confused. If the Connector already has the proper URIEncoding value, why do you think the application needs to reprocess the URI with the same encoding?

- Chuck
Re: URIEncoding
How about:

  String queryString = request.getParameter("query");
  queryString = new String(queryString.getBytes("iso-8859-1"), "UTF-8");

It's not very graceful, so you can even make a one-line method for doing this and have:

  String decodeURIParams(String a, String b, String c) { return new String(request.getParameter(a).getBytes(b), c); }
  String queryString = decodeURIParams("query", URI_ENCODING_CONST, URI_DECODING_CONST);

This is all pseudo-code but I hope you see what I mean.

On 7/26/07, Frederic Bastian wrote:
> Hi Pulkit, thanks for your answer. The matter is that Tomcat won't get the correct values of the parameters in the URL. For instance, if my URI looks like http://host/?query=%C3%A9%C3%A8, the URI encoding is UTF-8. By default, Tomcat will read this URL in ISO-8859-1. So HttpServletRequest.getParameter("query") will return an incorrect value. The solution you proposed won't help Tomcat to return a correct value with the getParameter method.
>
> If I add into server.xml the attribute URIEncoding="UTF-8" to the Connector, Tomcat will correctly read the query parameter. I would like Tomcat to read correctly URL in UTF-8, but without modifying server.xml. Any suggestion?
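The re-decoding workaround from this message can be written as a small standalone method. This is a sketch under the assumption that the container decoded the parameter as ISO-8859-1 while the client actually sent UTF-8; the class and method names (ParamRedecoder, redecode) are hypothetical, and the servlet call is left out so the example runs on its own.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class ParamRedecoder {

    /**
     * Undoes a wrong decoding step: turns the string back into the bytes it came
     * from (using the charset the container assumed) and decodes them again with
     * the charset the client really used.
     */
    static String redecode(String misdecoded, Charset assumed, Charset actual) {
        return new String(misdecoded.getBytes(assumed), actual);
    }

    public static void main(String[] args) {
        // What getParameter("query") would hand back for ?query=%C3%A9%C3%A8
        // if the URI were read as ISO-8859-1.
        String fromContainer = "\u00c3\u00a9\u00c3\u00a8";
        System.out.println(redecode(fromContainer,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

The round trip is lossless here because ISO-8859-1 maps every byte value to exactly one character. If the container had assumed a charset that rejects or replaces some byte sequences, the original bytes could not be recovered this way, which is part of why the trick depends on knowing the Connector's actual setting.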
Re: URIEncoding
Hi Pulkit, thanks for your answer. The matter is that Tomcat won't get the correct values of the parameters in the URL.

For instance, if my URI looks like http://host/?query=%C3%A9%C3%A8, the URI encoding is UTF-8. By default, Tomcat will read this URL in ISO-8859-1. So HttpServletRequest.getParameter("query") will return an incorrect value. The solution you proposed won't help Tomcat to return a correct value with the getParameter method.

If I add into server.xml the attribute URIEncoding="UTF-8" to the Connector, Tomcat will correctly read the query parameter. I would like Tomcat to read URLs correctly in UTF-8, but without modifying server.xml. Any suggestion?

Pulkit Singhal wrote:
> Hi Frederic, I don't know of an HttpSession method for setting the URIEncoding. But you could always do something along the lines of:
>
>   String uri_utf8 = new String(uri.getBytes("iso-8859-1"), "UTF-8");
>
> inside the application.

-- 
Frederic Bastian, PhD student
Department of Ecology and Evolution
Biophore, University of Lausanne, 1015 Lausanne, Switzerland.
tel: +41 21 692 4221
http://www.unil.ch/dee/page22707.html
Swiss Institute of Bioinformatics http://www.isb-sib.ch/
Re: URIEncoding
Hi Frederic,

I don't know about HttpSession.method for setting the URIEncoding. But you could always do something along the lines of:
String uri_utf8 = new String(uri.getBytes("ISO-8859-1"), "UTF-8");
inside the application.

On 7/26/07, Frederic Bastian [EMAIL PROTECTED] wrote:
> Hi folks :)
>
> I need my URI to be in UTF-8. In server.xml, I added the attribute URIEncoding="UTF-8" to the Connector. This works well. But my question is: is there a way to define the URIEncoding in the application itself? For instance, you can modify the session timeout in the application itself (HttpSession.setMaxInactiveInterval()). I would like to modify the URIEncoding in the same way. Would anyone know how to achieve that?
>
> Thanks.
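The one-liner above can be tried in isolation. The following is a small sketch, not a recommendation: the class and method names are made up for illustration, and StandardCharsets (Java 7 and later) is swapped in for the charset-name strings in the original suggestion. It relies on ISO-8859-1 mapping every byte to exactly one character, so the original bytes can be recovered and decoded again.

```java
import java.nio.charset.StandardCharsets;

final class ReDecodeDemo {

    // Recovers the original bytes from a string that was wrongly decoded
    // as ISO-8859-1, then decodes those bytes as UTF-8.
    static String reinterpretAsUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // What a container would hand back for %C3%A9%C3%A8 if it decoded
        // the URL as ISO-8859-1: four characters, one per byte.
        String misdecoded = "\u00c3\u00a9\u00c3\u00a8";

        String repaired = reinterpretAsUtf8(misdecoded);
        System.out.println("repaired length: " + repaired.length());

        // Caveat: applying the same repair to a value that was already
        // decoded correctly damages it instead of fixing it.
        String alreadyCorrect = "\u00e9\u00e8";
        String damaged = reinterpretAsUtf8(alreadyCorrect);
        System.out.println("still intact: " + damaged.equals(alreadyCorrect));
    }
}
```

The caveat matters in practice: the workaround is only safe if the application knows the container really decoded the URL as ISO-8859-1. If someone later sets URIEncoding="UTF-8" on the Connector, the same code would corrupt the parameters.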
Re: URIEncoding
Thanks for your help, that answers my question pretty well :)

Caldarale, Charles R wrote:
> > From: Frederic Bastian [mailto:[EMAIL PROTECTED]
> > Subject: Re: URIEncoding
> >
> > Could you tell me a bit more about how MBean can solve my problem? I never used it.
>
> Tomcat creates MBeans for most of its internal objects, including connectors:
> http://tomcat.apache.org/tomcat-6.0-doc/mbeans-descriptor-howto.html
> (The same is true for 5.5 if that's what you're using.)
>
> The get/set methods of the MBean allow you to inspect and modify the underlying objects. Start Tomcat with -Dcom.sun.management.jmxremote and use JConsole to poke around inside it. Look at the MBeans tab, then down the Catalina -> Connector -> [port#] -> Attributes branch; you should see URIEncoding as the first entry.
>
> Study the javax.management.* APIs for details on how to access MBeans. You may want to look at this tutorial as well:
> http://java.sun.com/j2se/1.5.0/docs/guide/management/overview.html
>
> For examples of code that interrogates various MBeans within Tomcat, wander through the Lambda Probe source:
> http://lambdaprobe.org/d/index.htm
>
> Using any of this makes your application Tomcat-specific, of course.
>
> - Chuck
RE: URIEncoding
> From: Frederic Bastian [mailto:[EMAIL PROTECTED]
> Subject: Re: URIEncoding
>
> Could you tell me a bit more about how MBean can solve my problem? I never used it.

Tomcat creates MBeans for most of its internal objects, including connectors:
http://tomcat.apache.org/tomcat-6.0-doc/mbeans-descriptor-howto.html
(The same is true for 5.5 if that's what you're using.)

The get/set methods of the MBean allow you to inspect and modify the underlying objects. Start Tomcat with -Dcom.sun.management.jmxremote and use JConsole to poke around inside it. Look at the MBeans tab, then down the Catalina -> Connector -> [port#] -> Attributes branch; you should see URIEncoding as the first entry.

Study the javax.management.* APIs for details on how to access MBeans. You may want to look at this tutorial as well:
http://java.sun.com/j2se/1.5.0/docs/guide/management/overview.html

For examples of code that interrogates various MBeans within Tomcat, wander through the Lambda Probe source:
http://lambdaprobe.org/d/index.htm

Using any of this makes your application Tomcat-specific, of course.

 - Chuck
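As a rough sketch of what reading that attribute from code might look like, the snippet below queries an MBean server for connector MBeans and reads URIEncoding from each. Several things here are assumptions rather than facts from the thread: that the code runs inside the Tomcat JVM (for example from a servlet), that Tomcat has registered its MBeans with the platform MBean server, and that the connectors live under the object-name pattern Catalina:type=Connector,* (the domain can differ with the engine name). The class and method names are illustrative. Only standard javax.management calls are used.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class ConnectorEncodingProbe {

    // Assumed naming pattern for Tomcat connector MBeans,
    // e.g. Catalina:type=Connector,port=8080
    static final String CONNECTOR_PATTERN = "Catalina:type=Connector,*";

    // Returns the URIEncoding attribute of every MBean matching the
    // pattern, keyed by the MBean's canonical object name.
    static Map<String, Object> readUriEncodings(MBeanServer server) {
        Map<String, Object> encodings = new LinkedHashMap<>();
        try {
            ObjectName pattern = new ObjectName(CONNECTOR_PATTERN);
            for (ObjectName name : server.queryNames(pattern, null)) {
                encodings.put(name.getCanonicalName(),
                              server.getAttribute(name, "URIEncoding"));
            }
        } catch (JMException e) {
            throw new IllegalStateException("Could not read connector MBeans", e);
        }
        return encodings;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Outside a Tomcat JVM no MBean matches, so the map is empty.
        System.out.println(readUriEncodings(server));
    }
}
```

Run on its own, outside Tomcat, this prints an empty map because no connector MBeans are registered. As Chuck notes, anything built on this is Tomcat-specific.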
Re: URIEncoding and POSTS
Mike Wannamaker wrote:
> I can specify URIEncoding="UTF-8" in Tomcat's connector settings within the server.xml file. Now my Tomcat server reads the URL GET parameters correctly, sending out "Hello, José!" or "Hello, 田中!" as expected.
>
> However, there's still a problem. What if I want to POST some non-ASCII data, presumably to enter into a backend database? All is well since I set that URIEncoding flag, right? Wrong. It turns out that Tomcat doesn't use this URIEncoding flag for POSTed form data. So, what does it use? ISO-8859-1, of course! So now I'm back to where I started, and my imaginary application still greets Mr. ç”°ä¸ instead of Mr. 田中. Not good.
>
> Why is this so? Can I get the POST to behave the same as the GET?

You need to set the request encoding before reading the parameters. You can do this explicitly (see http://marc.theaimsgroup.com/?l=tomcat-user&m=111548442910292&w=2) or globally using a filter.

Mark
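The "before reading the parameters" part is easy to overlook, so here is a toy model of it. LazyFormRequest is a hypothetical stand-in written for this illustration and is not the servlet API; it only imitates the documented behaviour of ServletRequest.setCharacterEncoding(), which has no effect once the request parameters have been read. The ISO-8859-1 default in the model follows what the thread reports for Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a request object; NOT the servlet API.
final class LazyFormRequest {
    private final String rawBody;
    private String characterEncoding = "ISO-8859-1"; // default in this model
    private Map<String, String> parameters;          // null until first read

    LazyFormRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    // Takes effect only while the body has not been parsed yet.
    void setCharacterEncoding(String encoding) {
        if (parameters == null) {
            characterEncoding = encoding;
        }
    }

    // Parses the body on first use, with whatever encoding is set by then.
    String getParameter(String name) {
        if (parameters == null) {
            parameters = parse();
        }
        return parameters.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> parsed = new HashMap<>();
        for (String pair : rawBody.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed pairs in this sketch
            }
            try {
                parsed.put(URLDecoder.decode(pair.substring(0, eq), characterEncoding),
                           URLDecoder.decode(pair.substring(eq + 1), characterEncoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String body = "name=%E7%94%B0%E4%B8%AD"; // UTF-8 bytes of the two-character name

        LazyFormRequest early = new LazyFormRequest(body);
        early.setCharacterEncoding("UTF-8");          // set first, then read
        System.out.println(early.getParameter("name").length());

        LazyFormRequest late = new LazyFormRequest(body);
        late.getParameter("name");                    // read first ...
        late.setCharacterEncoding("UTF-8");           // ... so this is too late
        System.out.println(late.getParameter("name").length());
    }
}
```

In the first case the name comes back as two characters; in the second it stays as six, one per byte, which is the garbled greeting from the question. This is why a filter placed ahead of everything else in the chain is the usual place for the call.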
RE: URIEncoding and POSTS
Use a servlet filter, like:

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // 'encoding' is assumed to be a field of the filter, e.g. "UTF-8"
        logger.debug("set request character encoding to " + encoding);
        request.setCharacterEncoding(encoding);
        // move on to the next filter in the chain
        chain.doFilter(request, response);
    }

-----Original Message-----
From: Mike Wannamaker [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 12, 2006 4:51 PM
To: 'Tomcat Users List'
Subject: URIEncoding and POSTS

I can specify URIEncoding="UTF-8" in Tomcat's connector settings within the server.xml file. Now my Tomcat server reads the URL GET parameters correctly, sending out "Hello, José!" or "Hello, 田中!" as expected.

However, there's still a problem. What if I want to POST some non-ASCII data, presumably to enter into a backend database? All is well since I set that URIEncoding flag, right? Wrong. It turns out that Tomcat doesn't use this URIEncoding flag for POSTed form data. So, what does it use? ISO-8859-1, of course! So now I'm back to where I started, and my imaginary application still greets Mr. ç”°ä¸ instead of Mr. 田中. Not good.

Why is this so? Can I get the POST to behave the same as the GET?

Mike Wannamaker
Senior Software Developer