SSI Servlet Character Encoding Problem
Hi people,

I use Tomcat 5.5.9 with Apache 2.0.54 and jk 1.2.10 to serve my websites. I want to set custom error pages to be served when an error such as 404 or 500 occurs. The website uses the iso-8859-9 character set on every page, and the error pages are encoded in iso-8859-9 too. Only *.jsp pages and servlets are mapped to Tomcat through jk; all other static content is served by Apache.

My error pages are all Server Side Include (shtml) files. These error files include some JSP files with:

    <!--#include virtual="file-name" -->

I have set up SSI (correctly, I hope) on both Apache and Tomcat, and also set up custom error file handling in both httpd.conf and the default web.xml for the site I want to customize. In my server.xml file, my site is configured like this:

    <Host name="www.foo.com" appBase="/usr/www/foo">
      <Context path="" docBase="/" />
      <Context path="/servlet" docBase="servlet/" />
    </Host>

In the default web.xml of my Tomcat configuration, the SSI servlet has these directives added:

    <init-param>
      <param-name>inputEncoding</param-name>
      <param-value>iso-8859-9</param-value>
    </init-param>
    <init-param>
      <param-name>outputEncoding</param-name>
      <param-value>iso-8859-9</param-value>
    </init-param>

When I go to some nonexistent foo.html on my website, Apache handles the customized error page and everything works fine. However, I have two problems.

First, if the isVirtualWebappRelative variable in web.xml is set to 0 and I go to a nonexistent foo.jsp on my server, jk sends the request to Tomcat, and when Tomcat tries to handle the customized error page it cannot find the files included through SSI. An error like this is written to the Tomcat error log:

    SEVERE: ssi: #include--Couldn't include file: /include/footer.jsp
    java.io.IOException: Couldn't get context for path: /include/footer.jsp

The /include folder is in the server root. I think Tomcat should be able to find these pages, since with isVirtualWebappRelative in this state it looks at the server root. Am I wrong? When I set isVirtualWebappRelative to 1, the problem is solved only for the server root. My servlets (say /servlet) still cannot get the customized page includes, because this time the SSI includes become web-app relative, and I have to copy the /include folder into the /servlet/include directory to work around it.

Second, when I go to this nonexistent foo.jsp with isVirtualWebappRelative set to 1 and Tomcat handles the error page, the encoding is always UTF-8 regardless of the inputEncoding and outputEncoding variables. So my error page is full of garbled characters, because the encoding should be iso-8859-9.

Are there any suggestions about these problems?

Best regards,
Kerem
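The garbling described in the second problem can be reproduced outside Tomcat. The following is only a minimal sketch, assuming the page bytes really are iso-8859-9 and that something downstream interprets them as UTF-8; the class name and sample word are hypothetical and it says nothing about where in Tomcat's SSI handling the mismatch arises.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what iso-8859-9 bytes look like when read as UTF-8.
final class Latin5AsUtf8Demo {
    static final Charset LATIN5 = Charset.forName("ISO-8859-9");

    // Decode raw bytes with the given charset.
    static String decodeAs(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "\u015Fifre" is a Turkish word starting with s-cedilla, written with an
        // escape so the example does not depend on the source file encoding.
        byte[] pageBytes = "\u015Fifre".getBytes(LATIN5);

        // Correct interpretation: the original text comes back.
        System.out.println(decodeAs(pageBytes, LATIN5));

        // Wrong interpretation: byte 0xFE is not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD.
        System.out.println(decodeAs(pageBytes, StandardCharsets.UTF_8));
    }
}
```

If the error page shows replacement characters or question marks only where Turkish-specific letters should be, that is consistent with this kind of mismatch between the bytes and the declared charset.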
Tomcat 5.5 character encoding problem
Hello!

I've tried my luck with Tomcat 5.5 and found that it behaves differently from 5.0.27 as regards character encoding. Somehow the conversion from ISO-8859-1 to UTF-8 doesn't work as it should. This may well be due to a misconfiguration. See below for the JSP document and the two different output pages generated. It looks as if Tomcat 5.5 applies the ISO-8859-1 to UTF-8 conversion twice.

I'm editing the JSP document with Eclipse 3.1M1, and the text file encoding is set to ISO-8859-1. I'm still pretty new to servlets, JSP, and Tomcat, so I might be overlooking something rather obvious.

Michael

test.jspx:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:jsp="http://java.sun.com/JSP/Page"
          xmlns:c="http://java.sun.com/jsp/jstl/core"
          xmlns:fmt="http://java.sun.com/jsp/jstl/fmt"
          lang='de'>
    <jsp:output doctype-root-element="html"
                doctype-public="-//W3C//DTD XHTML Strict 1.0//EN"
                doctype-system="http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd" />
    <jsp:directive.page contentType="text/html" />
    <jsp:directive.page session="false" />
    <head><title>Test</title></head>
    <body>
    <br/> <br/> <br/> <br/> <br/> <br/> &#160;<br/>
    </body>
    </html>

With Tomcat 5.5 (wrong), http://localhost:8080/myapp/test.jspx:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Strict 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="de"><head><title>Test</title></head><body> <br/> <br/> <br/> <br/> <br/> <br/> <br/></body></html>

With Tomcat 5.0.27 (correct), http://localhost:8080/myapp/test.jspx:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Strict 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="de"><head><title>Test</title></head><body> <br/> <br/> <br/> <br/> <br/> <br/> <br/></body></html>

--
Michael Schuerig
mailto:[EMAIL PROTECTED]
http://www.schuerig.de/michael/

I was blessed with a birth and a death and I guess I just want some say in between. --Ani DiFranco, Talk To Me Now

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
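The "conversion applied twice" suspicion has a recognizable signature that can be checked against the raw response bytes. This is a sketch of the general effect only, not of Tomcat 5.5 internals: it assumes UTF-8 output was at some point reinterpreted as ISO-8859-1 and then encoded to UTF-8 again. The class and method names are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: what one extra ISO-8859-1 -> UTF-8 pass does to a character.
final class DoubleEncodingDemo {
    // Simulates the faulty step: UTF-8 bytes are treated as if they were ISO-8859-1 text.
    static String misreadUtf8AsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String umlaut = "\u00E4"; // a-umlaut, written as an escape

        // Encoded once: two bytes, 0xC3 0xA4.
        System.out.println(utf8Length(umlaut));

        // Encoded twice: the two bytes become two characters (U+00C3, U+00A4),
        // which then take four bytes, 0xC3 0x83 0xC2 0xA4.
        System.out.println(utf8Length(misreadUtf8AsLatin1(umlaut)));
    }
}
```

Seeing the four-byte sequence for a single umlaut in the Tomcat 5.5 response, and the two-byte sequence in the 5.0.27 response, would support the double-conversion reading.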
RE: Character Encoding problem (umlauts, etc).
Thanks for the information, Anton. But just getting rid of umlauts or other international characters is not an option when you have clients in other countries who use your software and whose data contains those characters. We cannot rename user files or change that data. That would be very, very bad :)

-----Original Message-----
From: Anton Tagunov [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 06, 2003 5:46 AM
To: Tomcat Users List
Subject: Re: Character Encoding problem (umlauts, etc).

[...]
Re: Character Encoding problem (umlauts, etc).
Robert Priest wrote:
> I have a servlet that catches a request for a file.

How is the request sent? If it is sent via an HTML form, you need to include the accept-charset="UTF-8" attribute in your form tag.

Thomas
RE: Character Encoding problem (umlauts, etc).
This problem can usually be fixed by changing the file.encoding system property. Set CATALINA_OPTS to -Dfile.encoding=utf-8 (or iso-8859-1, or whatever character set you like) and restart Tomcat.

Hope this helps,
Andy

-----Original Message-----
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: 08 September 2003 14:18
To: 'Tomcat Users List'
Subject: RE: Character Encoding problem (umlauts, etc).

[...]
Re[2]: Character Encoding problem (umlauts, etc).
Hello Robert!

RP> Thanks for the information Anton. But just getting rid of umlauts or other
RP> international characters is not an option when you have clients that use
RP> your software in other countries, that have those special characters. We
RP> cannot rename user files or changed that data. That would be very, very, bad
RP> :)

Well, getting rid of them was only my second suggestion. The first one was that we should make sure they all get %xy-encoded in the URL.

So, first of all, have you determined whether it is your browser or Tomcat that is to blame? I advise you to spy on the traffic and see for yourself what your browser is sending:

- Is the browser sending non-ASCII chars as '?'-s?
- Is the browser sending them as raw 8-bit text?
- Is the browser %xy-encoding them? (This last one is the effect we want. The others don't meet our needs.)

As I've said before, the methods to spy on the traffic (the methods that I know; there should be others :-) are on my page at tagunov.tripod.com. If you can supply this info, that is, what the browser is actually sending, we will be able to move further on with your needs.

Good luck!
Anton
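The %xy-encoding Anton asks for can be produced on the server side when the links are generated. Below is a minimal sketch using java.net.URLEncoder; the helper name encodeSegment is hypothetical, and the choice of UTF-8 versus ISO-8859-1 is exactly the open question in the thread, so both are shown. URLEncoder targets form encoding and turns spaces into "+", which is why the sketch rewrites them to %20 for use in a path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: percent-encodes one path segment (not a whole URL).
final class PathSegmentEncoder {
    static String encodeSegment(String segment, String charsetName) {
        try {
            // A literal '+' in the input is already escaped to %2B by URLEncoder,
            // so every '+' left in the result stands for a space.
            return URLEncoder.encode(segment, charsetName).replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String fileName = "\u00E4\u00E4\u00E4.txt"; // three a-umlauts, as in the thread

        System.out.println(encodeSegment(fileName, "UTF-8"));      // %C3%A4%C3%A4%C3%A4.txt
        System.out.println(encodeSegment(fileName, "ISO-8859-1")); // %E4%E4%E4.txt
    }
}
```

Only the individual segment is encoded here; encoding a full path in one call would also escape the slashes.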
Re: Character Encoding problem (umlauts, etc).
Hello Robert!

Robert Priest [EMAIL PROTECTED] wrote:

RP> I am requesting file :
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
RP> but what is coming across in the request is:
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

Probably your browser is sending it that way? I guess it is a bad idea anyway to type anything nasty into the browser's URL input line. You may try to spy on the interaction between browser and server. I have described how to do it in one of the sections of my ancient http://tagunov.tripod.com; try to find it there, and then you'll know for sure what bytes are sent by the browser.

I guess that it is generally a bad idea to have anything nasty in the URL at all. The closest you could get would be to encode it all as %AD etc. But then you should be sure what encoding this is (UTF-8 or anything else). So, if these are links from your HTML page, why don't you encode everything in the URL directly on the server side and have

    <A href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">

But then, why don't you get rid of these nasty umlauts altogether? Why not use only normal Latin letters or, in case you heavily use numeric ids already, only numeric ids?

Anton
Character Encoding problem (umlauts, etc).
I have a servlet that catches a request for a file. But if that file has characters such as an umlaut in its name (for example: ä), the path info is all wrong. For example, I am requesting the file

    /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt

but what is coming across in the request is

    /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

I have tried each of the following:

    String requestPathInfo5 = new String(request.getPathInfo().getBytes("ISO-8859-1"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("Unicode"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("UTF8"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("UnicodeLittle"));

But none of them returns the correct path. Does anyone know which Unicode encoding I should be using? Any other suggestions? I know this problem has been solved before, so if you could point me in the direction of the solution on the web, that is fine.

Thanks in advance.
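A note on the four attempts above: each one names a charset only for getBytes and lets new String fall back to the platform default for decoding. The usual re-decoding idiom names both sides. The sketch below assumes, purely for illustration, that the browser sent UTF-8 bytes and the container decoded them as ISO-8859-1; the class name is hypothetical and the container step is simulated rather than taken from Tomcat.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: undoing an ISO-8859-1 decode of UTF-8 bytes.
final class PathInfoRedecode {
    // Get the original bytes back (ISO-8859-1 maps bytes to chars one to one),
    // then decode them with the charset the client is assumed to have used.
    static String redecode(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Simulates a container that decodes UTF-8 request bytes as ISO-8859-1.
    static String simulateLatin1Decode(String sentByClient) {
        return new String(sentByClient.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String requested = "\u00E4\u00E4\u00E4.txt";
        String seenByServlet = simulateLatin1Decode(requested);

        System.out.println(redecode(seenByServlet).equals(requested)); // true

        // If the servlet already sees literal question marks, the information
        // is gone and no conversion can bring the umlauts back.
        System.out.println(redecode("???.txt")); // ???.txt
    }
}
```

Because the path in the report already contains '?' characters, the loss probably happens before getPathInfo() returns, which is why checking what the browser actually sends matters.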
RE: Character Encoding problem (umlauts, etc).
This is in a JSP page (which of course becomes a servlet). Do I perhaps have to set the encoding in Tomcat?

-----Original Message-----
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 04, 2003 5:16 PM
To: '[EMAIL PROTECTED]'
Subject: Character Encoding problem (umlauts, etc).

[...]
RE: Character Encoding problem (umlauts, etc).
The FAQ ( http://jakarta.apache.org/tomcat/faq ) has a link to a thread on "How to UTF-8 your site", which I think might be similar. The thread itself is at

    http://marc.theaimsgroup.com/?l=tomcat-user&m=105524426515137&w=2

Try some of the things there and see if they work for you (specifically, starting Tomcat with a -Dfile.encoding=UTF-8 switch).

Jeff Tulley ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com

>>> [EMAIL PROTECTED] 9/4/03 3:24:58 PM >>>
This is in a JSP page (which of course becomes a servlet). Do I have to set the encoding in Tomcat perhaps?

[...]
Re: Character Encoding problem
On Wednesday 28 August 2002 13:17, you wrote:
> Hi
> I am using Tomcat 4.0.4 and JDK 1.3.1.
> I am trying to read a parameter in Hebrew from the URL but get '???'.
> Writing Hebrew to the browser works fine.
> I cannot use req.setCharacterEncoding(java.lang.String env)
> (I cannot compile the code when I am using it).
> Is there a way to get around it?
> I am very flexible in choosing the JDK and Tomcat version to work with,
> but they need to be release versions and not beta or something like this.

You can use the following form, with the right encoding for you. The W3C has provided the accept-charset attribute on the form element for this. Your request object will then have the right encoding too:

    <form action="action.jsp" method="POST" accept-charset="ISO-8859-5">
    ...
    </form>

ilis
Re: Character Encoding problem
Nehemia Litterat [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
> Hi
> I am using Tomcat 4.0.4 and JDK 1.3.1.
> I am trying to read a parameter in Hebrew from the URL but get '???'.
> Writing Hebrew to the browser works fine.
> I cannot use req.setCharacterEncoding(java.lang.String env)
> (I cannot compile the code when I am using it).

This is almost certainly due to having an older version of servlet.jar in your classpath. One particular gotcha is to have an older version of j2ee.jar and/or servlet.jar in $JAVA_HOME/jre/lib/ext. If this is the case, kill it. Otherwise, try compiling with:

    javac -classpath $CATALINA_HOME/common/lib/servlet.jar:$CLASSPATH MyServlet.java

> Is there a way to get around it?
> I am very flexible in choosing the JDK and Tomcat version to work with,
> but they need to be release versions and not beta or something like this.

<blatant-plug>
Tomcat 3.3.1 has excellent charset support, especially if you are willing to use its non-portable features.
</blatant-plug>

However, once you solve your compilation problem, 4.x should do everything that you need it to do (and portably as well :).

> Thanks in advance
> Nehemia Litterat
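Why the declared request encoding matters for a Hebrew parameter can be seen without a servlet container. This is a sketch with java.net.URLDecoder standing in for the container's parameter parsing, which is an assumption made for the illustration; the sample value (the Hebrew word "shalom") and the class name are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: the same percent-encoded bytes decoded with two charsets.
final class HebrewParamDecode {
    static String decode(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 percent-encoding of U+05E9 U+05DC U+05D5 U+05DD.
        String onTheWire = "%D7%A9%D7%9C%D7%95%D7%9D";

        // Decoded with the charset the browser used: four Hebrew letters.
        System.out.println(decode(onTheWire, "UTF-8").length());      // 4

        // Decoded as ISO-8859-1: eight unrelated characters.
        System.out.println(decode(onTheWire, "ISO-8859-1").length()); // 8
    }
}
```

In a servlet, request.setCharacterEncoding plays the role of the second argument for the request body, which is why the compilation problem discussed above is worth solving first.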
Character Encoding problem
Hi,

I am using Tomcat 4.0.4 and JDK 1.3.1. I am trying to read a parameter in Hebrew from the URL but get '???'. Writing Hebrew to the browser works fine.

I cannot use req.setCharacterEncoding(java.lang.String env) (I cannot compile the code when I am using it). Is there a way to get around it?

I am very flexible in choosing the JDK and Tomcat version to work with, but they need to be release versions and not beta or something like this.

Thanks in advance,
Nehemia Litterat
Re: Character Encoding problem
Perhaps something like this in tomcat/bin/setclasspath.sh:

    JAVA_OPTS=-Dfile.encoding=ISO-8859-1

(with your ISO configuration, of course).

Fabio.

Nehemia Litterat wrote:
> I am using Tomcat 4.0.4 and JDK 1.3.1.
> I am trying to read a parameter in Hebrew from the URL but get '???'.

[...]

--
Fabio Mengue - Centro de Computacao - Unicamp
[EMAIL PROTECTED] [EMAIL PROTECTED]
"Those who work themselves to death really do deserve to die." - Millor
character encoding problem
Hi,

I have been fighting a character encoding problem for two weeks without success. I send a JSP form to a Tomcat 4.0.3 servlet and log the form input with log4j into a file and into a database. Running on a German server with SuSE Linux, everything works fine. Now I have installed Tomcat and the servlet on a Slackware server in the US, and all special characters are logged as question marks.

I tried everything I could get my hands on in the mailing list archive, without success:

- <%@ page contentType="text/html;charset=iso-8859-1" %> in the JSP
- <meta http-equiv="Content-Type" value="text/html; charset=iso-8859-1"> in the JSP
- starting Tomcat with CATALINA_OPTS=-Dfile.encoding=iso-8859-1
- setting the environment variables LC_ALL=de;export LC_ALL and LANG=de;export LANG (locale -a on Slackware yields de, deutsch or german)
- request.setCharacterEncoding("iso-8859-1"); as the first statement in servlet.doPost()
- converting form parameters like this:

      byte[] bytes = param.getBytes("iso-8859-1");
      String convertedParam = new String(bytes, "iso-8859-1");

I tried everything with iso-8859-1 and with UTF-8. With a Perl script I have no problems handling the form input, so it must be a Java/Tomcat problem. What else could I do to change the platform's default encoding?

Please help,
Wolfgang
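Two of the attempts listed above can be examined in isolation. The sketch below, with hypothetical names, shows that encoding and decoding with the same charset returns the input unchanged, so that conversion cannot repair anything, and it prints the default charset the JVM actually ended up with, which is what an appender or writer without an explicit encoding would typically use. Whether the log4j setup in this report relies on that default is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: same-charset round trip and default charset probe.
final class RoundTripProbe {
    // Encode and decode with one and the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String greeting = "Gr\u00FC\u00DFe"; // German sample with u-umlaut and sharp s

        // For text the charset can represent, the round trip is a no-op.
        System.out.println(roundTrip(greeting, StandardCharsets.ISO_8859_1).equals(greeting)); // true

        // A string that already contains '?' stays that way.
        System.out.println(roundTrip("Gr??e", StandardCharsets.ISO_8859_1)); // Gr??e

        // What the running JVM uses when no charset is given explicitly.
        System.out.println(Charset.defaultCharset());
        System.out.println(System.getProperty("file.encoding"));
    }
}
```

Running the last two lines inside the servlet on both servers would show whether the -Dfile.encoding setting really reached the Tomcat JVM on the Slackware machine.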
Character encoding problem: strange rash
All,

Tomcat 4.0.1
Apache 1.3
WARP
Solaris 8
JDK 1.2/1.3

Does anyone know why a servlet would suddenly start displaying non-breaking spaces (&#160;) as question marks (?) when the JDK/SDK is upgraded from 1.2 to 1.3? Very odd behaviour! Like a rash: question marks all over the place :)

The servlet was written for a version of the Servlet API earlier than 2.3, so it doesn't set the character encoding. BUT I don't see how this would have any bearing on the problem, since the JDK has nothing to do with the Servlet API.

Thanks,
John
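One mechanism that turns a non-breaking space into a question mark is encoding it with a charset that cannot represent U+00A0. The sketch below shows only that mechanism; that the JDK upgrade changed the charset used for the servlet's output is an assumption, not something the report establishes. The class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: U+00A0 survives ISO-8859-1 but not US-ASCII.
final class NbspProbe {
    // Encode to bytes and decode again with the same charset, as an output
    // stream followed by a correctly configured reader would.
    static String throughCharset(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "a\u00A0b"; // 'a', non-breaking space, 'b'

        // US-ASCII has no mapping for U+00A0; getBytes substitutes '?'.
        System.out.println(throughCharset(text, StandardCharsets.US_ASCII)); // a?b

        // ISO-8859-1 maps U+00A0 to byte 0xA0, so it survives.
        System.out.println(throughCharset(text, StandardCharsets.ISO_8859_1).equals(text)); // true
    }
}
```

Comparing the default encoding reported by the two JDKs on that Solaris machine would confirm or rule out this explanation.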
How to solve the character encoding problem with JSP
Hello all Tomcat users!

I'm new to Tomcat. I installed it yesterday and it seems to be running very well. Only one question has arisen, which has probably been answered already, but I did not find an answer in the archive.

Currently in production we have JavaWebServer2.0 / Win NT, JDK 1.3, and the following works very well for solving the encoding problem with JSP:

    // How to solve the character encoding problem with JSP
    //
    // 1. Comment out this line in the generated file's _jspService method:
    //        JspWriter out = null;
    // 2. Insert this line where you commented out number 1:
    //        PrintWriter out = response.getWriter();
    // 3. Comment out this line in the generated file's _jspService method:
    //        out = pageContext.getOut();
    // 4. At the end of the _jspService method you find:
    //        } catch (Throwable t) {
    //      *     if (out.getBufferSize() != 0)
    //      *         out.clear();
    //    Lines marked with * shall be commented out.
    // 5. After the lines mentioned in point 4 you find:
    //        HandleErrorPageException("...errorpage.jsp", t, out);
    //    Change it to:
    //        HandleErrorPageException("...errorpage.jsp", t, pageContext.getOut());

Actually there are some Lexicon files in Cp1251, Cp1257 and UTF-8 encodings, which need to come through JSP to the browser. With JavaWebServer2.0 this is OK if we do as described in steps 1 to 5. How do I do it with Tomcat? I did the same, but it seems to have no effect at all: only question marks or unreadable characters. What is missing?

Thanks in advance for any kind of info/URL/help,
Andre Tampld
Software Development @ EMK
Character Encoding Problem
Hi,

I know that this is a popular (!?) problem in Tomcat, but despite my efforts I could not find any solution. Here it goes.

We have a JSP page in encoding ISO-8859-9. With the line

    <%@ page contentType = "text/html; charset=ISO-8859-9" %>

we define the encoding of the document.

Result: the strings in the JSP file are encoded correctly, such as
- the text in HTML
- the text written inside JSP code between <% and %>

Problem: the strings that are received from a class, read from a file, or held in a constant are not encoded correctly.

    <%= "Pretend to be a ISO8859-9 string" %>

is encoded correctly, but if inside a class we have, let's say,

    String x = "Pretend to be a ISO8859-9 string";

then

    <%= myClass.x %>

is encoded wrongly, with many ?'s.

The problem is not in the JVM/JDK, I think: when I read a file and write the content into another file, both files are the same, with no loss in encoding.

The encoding problem is also there when I include files with <jsp:include ...>. If I include the file with <%@ include ... %>, the problem is not there. But I really need to use <jsp:include ...>.

Can anyone solve my problem?

Arif Tumer
RE: Character Encoding Problem
Hi,

You should compile the Java classes with ISO-8859-9 encoding. Look at the -encoding flag of the 'javac' compiler. During compilation, the 8-bit characters in strings are converted to Unicode characters. By default the encoding is probably ISO-8859-1.

Regards,
Tõnu

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 02, 2001 10:33 AM
To: [EMAIL PROTECTED]
Subject: Character Encoding Problem

[...]
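The effect of compiling ISO-8859-9 source as ISO-8859-1 can be simulated with plain strings. This is a sketch under two assumptions that the thread does not confirm: that the class was compiled with an ISO-8859-1 default, and that the page is then written out as ISO-8859-9. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a string constant misread at compile time, then sent as ISO-8859-9.
final class SourceEncodingMismatch {
    static final Charset LATIN5 = Charset.forName("ISO-8859-9");

    static String compiledAsLatin1ThenSentAsLatin5(String literalInSource) {
        // Bytes as the editor saved them in the .java file.
        byte[] sourceBytes = literalInSource.getBytes(LATIN5);
        // What the compiler puts in the class file if it assumes ISO-8859-1.
        String constantInClass = new String(sourceBytes, StandardCharsets.ISO_8859_1);
        // What reaches the browser when the response charset is ISO-8859-9
        // (unmappable characters are replaced by '?').
        return new String(constantInClass.getBytes(LATIN5), LATIN5);
    }

    public static void main(String[] args) {
        // g-breve (U+011F) exists only in ISO-8859-9: it turns into '?'.
        System.out.println(compiledAsLatin1ThenSentAsLatin5("\u011F"));

        // u-umlaut (U+00FC) has the same byte in both charsets: it survives.
        System.out.println(compiledAsLatin1ThenSentAsLatin5("\u00FC"));
    }
}
```

If only the six letters that differ between the two charsets come out as '?', this mismatch is a likely cause; if every non-ASCII character is affected, the cause is probably elsewhere.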
Re: RE: Character Encoding Problem
From: Tõnu Põld [EMAIL PROTECTED]
Date: 2001/07/02 Mon AM 10:48:10 GMT+03:00
To: '[EMAIL PROTECTED]' [EMAIL PROTECTED]
Subject: RE: Character Encoding Problem

> Hi, You should compile the java classes with ISO-8859-9 encoding. Look at
> the -encoding flag of the 'javac' compiler. In compilation the 8-bit
> characters in strings are converted to unicode characters. By default the
> encoding is probably ISO-8859-1.
> Regards, Tõnu

I tried that a few minutes ago, but the problem remains. I don't know the usage of the -encoding parameter in detail, but I think it is related to the string constants in the source files. The main problem is the encoding when I read a text file and show its contents in the JSP with something like

    <%= FileReader.readLine() %>

And as I mentioned, when I load a file and write it into another file, the encoding is correct. The results of my string manipulation functions are correct. The single, devastating problem occurs when I pass a string with characters in ISO-8859-9 from a class to a JSP.

Ah, <jsp:include> is also still a problem :(

Hmm. It is a stupid thought, but does it have something to do with dynamic strings vs. static strings? As far as I know, when <%@ include ... %> is used, the file is statically included at compile time, but <jsp:include ...> may include the file dynamically after compilation. Also, the string constants in JSP files are compiled, but strings read from files are read and generated at execution time. Hmm, I think I am going paranoid :)

Thanks, anyway. Has anyone encountered this kind of problem before?

Arif
RE: Character Encoding Problem
Hi,

Another problem related to the charset occurs when I use the following code

    strErrorMsg = "a message using ISO-8859-9";
    <INPUT TYPE="HIDDEN" Name="errormsg" VALUE="<%=strErrorMsg%>">

and post the form: the receiving JSP file does not print the strErrorMsg variable correctly. How can this problem be solved?

Thanks,
Oner Necip Hamali

From: [EMAIL PROTECTED]
Date: 2001/07/02 Mon PM 12:16:03 GMT+03:00
To: [EMAIL PROTECTED]
Subject: Re: RE: Character Encoding Problem

[...]
RE: Re: RE: Character Encoding Problem
Just a thought about the <jsp:include ...> problem: have you tried placing the <%@ page contentType=... %> directive in each file you include?

When a file is read with FileReader, the default character encoding is used. I think you must specify your own encoding when reading the file. Sun's Javadoc says about the FileReader class:

"Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream."

Tõnu

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 02, 2001 11:16 AM
To: [EMAIL PROTECTED]
Subject: İlgi: RE: Character Encoding Problem

From: Tõnu Põld [EMAIL PROTECTED]
Date: 2001/07/02 Mon AM 10:48:10 GMT+03:00
To: '[EMAIL PROTECTED]' [EMAIL PROTECTED]
Subject: RE: Character Encoding Problem

Hi, You should compile the Java classes with ISO-8859-9 encoding. Look at the -encoding flag of the 'javac' compiler. During compilation the 8-bit characters in strings are converted to Unicode characters. By default javac uses the platform's default encoding, which is probably ISO-8859-1 here. Regards, Tõnu

I tried that a few minutes ago, but the problem remains. I don't know the usage of the -encoding parameter in detail, but I think it only concerns the string constants in the source files. The main problem is the encoding when I read a text file and show its contents in the JSP with something like <%= FileReader.readLine() %>. And as I mentioned, when I load a file and write it into another file, the encoding is correct. The results of my string manipulation functions are correct. The single, devastating problem occurs when I pass a string with ISO-8859-9 characters from a class to a JSP. Ah, <jsp:include> is also still a problem .. :(

Hmm .. It is a stupid thought, but does it have something to do with dynamic strings vs. static strings? As far as I know, when <%@ include ... %> is used the file is included statically, at compile time, but <jsp:include ...> may include the file dynamically, after compilation. Also, the string constants in JSP files are compiled, but strings read from files are read and generated at execution time .. Hmm .. I think I am getting paranoid :) .. Thanks, anyway ..

Has anyone encountered this kind of problem before?

Arif
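The FileReader advice quoted from the Javadoc above can be sketched as follows. This is only a minimal illustration, assuming the file really is stored as ISO-8859-9; the class name ExplicitEncodingRead and the helper readFirstLine are made up for this example and do not come from the original poster's code.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

class ExplicitEncodingRead {

    // Hypothetical helper: reads the first line of a file, decoding the bytes
    // with the named charset instead of the platform default used by FileReader.
    static String readFirstLine(File file, String charsetName) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charsetName))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample file in ISO-8859-9 so the example is self-contained.
        File file = File.createTempFile("turkish", ".txt");
        file.deleteOnExit();
        try (Writer writer = new OutputStreamWriter(new FileOutputStream(file), "ISO-8859-9")) {
            writer.write("\u011F\u015F\u0131\n"); // the Turkish letters g-breve, s-cedilla, dotless i
        }

        String line = readFirstLine(file, "ISO-8859-9");
        // Compare against the expected Unicode characters rather than printing
        // them, so the check does not depend on the console encoding.
        System.out.println(line.equals("\u011F\u015F\u0131"));
    }
}
```

If the string is decoded correctly here but still displays wrongly in the page, the reading step is probably not the culprit and the output side deserves a closer look.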
RE: Character Encoding Problem
From: Tõnu Põld [EMAIL PROTECTED]
Date: 2001/07/02 Mon AM 11:55:51 GMT+03:00
To: "'[EMAIL PROTECTED]'" [EMAIL PROTECTED]
Subject: RE: İlgi: RE: Character Encoding Problem

Just a thought about the <jsp:include ...> problem: try to place the "<%@ page contentType=... %>" in each file you include?

The line is there, in all files :) ..

When reading bytes from file with FileReader the default character encoding is used. I think you must specify your own encoding when reading the file.

I'll try that. But the same compiled classes and the same JDK version work well with the Resin JSP server and the same files. The problem occurs with Tomcat.

Thanks,
Arif
RE: Character Encoding Problem
When reading bytes from file with FileReader the default character encoding is used. I think you must specify your own encoding when reading the file.

I'll try that. But the same compiled classes and the same JDK version work well with the Resin JSP server and the same files. The problem occurs with Tomcat.

No, still problems. A lot of ?'s in the visible output. Hmpf. The interesting point is that if I have an HTML form with ISO-8859-9 encoded characters, post it to a JSP file, and write the parameters to a text file on the system, the text is correct!! The problem seems to occur when displaying strings obtained from inside a class ..

Any other display problems with other character sets? Any more ideas?

Let's recap the problem. We cannot display ISO-8859-9 encoded string constants of a class, or strings read from a file, in JSP documents in the correct encoding.

Using the <%@ page contentType ... %> directive does not solve it.
Using javac -encoding does not solve it.
Specifying encodings in the file read/write routines of java.io does not solve it.

The problem is with Tomcat; there are no problems with Resin used in the same environment (OS/JDK).

Arif
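One possible explanation for the ?'s, offered only as a guess since the thread does not confirm it: if correctly decoded Turkish characters are written out through an ISO-8859-1 encoder, every character that ISO-8859-1 cannot represent is replaced by a question mark. The sketch below shows that effect in isolation; the class QuestionMarkDemo and its helper encodeAndShow are hypothetical names, and nothing here involves Tomcat itself.

```java
import java.io.UnsupportedEncodingException;

class QuestionMarkDemo {

    // Hypothetical helper: encodes the text with the named charset and returns
    // the resulting bytes as space-separated hex, so they can be inspected.
    static String encodeAndShow(String text, String charsetName)
            throws UnsupportedEncodingException {
        byte[] bytes = text.getBytes(charsetName);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Three Turkish letters that exist in ISO-8859-9 but not in ISO-8859-1.
        String text = "\u011F\u015F\u0131";

        // ISO-8859-9 has a byte for each of them.
        System.out.println(encodeAndShow(text, "ISO-8859-9")); // F0 FE FD

        // ISO-8859-1 has none, so each one becomes 0x3F, the '?' character.
        System.out.println(encodeAndShow(text, "ISO-8859-1")); // 3F 3F 3F
    }
}
```

Once a character has been replaced by ? in this way the original byte is gone, so the fix has to happen before the response is encoded, not afterwards in the browser.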
RE: Character Encoding Problem
Hi, I still believe your initial bytes are converted to Java strings (Unicode) using the wrong encoding. If you have a string created from bytes using the ISO-8859-9 encoding, and the JSP page has the directive <%@ page contentType="text/html; charset=ISO-8859-9" %>, then it should be OK.

For debugging you could try converting your string to another encoding and see what happens. For example:

    <%@ page contentType="text/html; charset=ISO-8859-9" %>
    <% String s = new String(initialString.getBytes("ISO-8859-1"), "ISO-8859-9"); %>
    <%= s %>

If this displays your string correctly, then the ISO-8859-1 encoding was used when the Java string was created from the initial bytes!

By the way, which version of Tomcat are you using? An older release (3.2.1) had some bugs with encoding conversion. Try the latest 3.2.2 release.

The request parameters from an HTTP POST are probably decoded as ISO-8859-1, because most browsers do not specify the encoding when submitting a request, so Tomcat uses the default encoding. To convert them correctly to Java strings, the following could be used (assuming that they really are ISO-8859-9):

    String param = new String(initialParam.getBytes("ISO-8859-1"), "ISO-8859-9");

Regards, Tõnu

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 02, 2001 3:34 PM
To: [EMAIL PROTECTED]
Subject: RE: Character Encoding Problem

When reading bytes from file with FileReader the default character encoding is used. I think you must specify your own encoding when reading the file.

I'll try that. But the same compiled classes and the same JDK version work well with the Resin JSP server and the same files. The problem occurs with Tomcat.

No, still problems. A lot of ?'s in the visible output. Hmpf. The interesting point is that if I have an HTML form with ISO-8859-9 encoded characters, post it to a JSP file, and write the parameters to a text file on the system, the text is correct!! The problem seems to occur when displaying strings obtained from inside a class ..
Any other display problems with other character sets? Any more ideas?

Let's recap the problem. We cannot display ISO-8859-9 encoded string constants of a class, or strings read from a file, in JSP documents in the correct encoding.

Using the <%@ page contentType ... %> directive does not solve it.
Using javac -encoding does not solve it.
Specifying encodings in the file read/write routines of java.io does not solve it.

The problem is with Tomcat; there are no problems with Resin used in the same environment (OS/JDK).

Arif
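The debugging conversion suggested in the reply above can also be tried outside a JSP. The sketch below simulates the suspected mistake (ISO-8859-9 bytes decoded as ISO-8859-1) and then undoes it with the same getBytes / new String round trip. The class ReinterpretDemo and the method reinterpret are names invented for this illustration.

```java
import java.io.UnsupportedEncodingException;

class ReinterpretDemo {

    // Hypothetical helper: recovers the original bytes by encoding with the
    // charset that was wrongly assumed, then decodes them with the real one.
    static String reinterpret(String wronglyDecoded, String assumedCharset, String actualCharset)
            throws UnsupportedEncodingException {
        return new String(wronglyDecoded.getBytes(assumedCharset), actualCharset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Bytes for three Turkish letters as stored in an ISO-8859-9 file or request.
        byte[] raw = {(byte) 0xF0, (byte) 0xFE, (byte) 0xFD};

        // The suspected mistake: the bytes are decoded as ISO-8859-1.
        String wrong = new String(raw, "ISO-8859-1");

        // The round trip from the reply: back to bytes, then decode properly.
        String fixed = reinterpret(wrong, "ISO-8859-1", "ISO-8859-9");

        System.out.println(wrong.equals("\u011F\u015F\u0131")); // false
        System.out.println(fixed.equals("\u011F\u015F\u0131")); // true
    }
}
```

The round trip is lossless only because ISO-8859-1 maps every byte value to a character. It cannot repair text in which characters have already been replaced by ?.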