RE: Character Encoding problem (umlauts, etc).
Thanks for the information, Anton. But simply getting rid of umlauts or other international characters is not an option when clients in other countries use your software and rely on those special characters. We cannot rename user files or change that data. That would be very, very bad :)

-----Original Message-----
From: Anton Tagunov [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 06, 2003 5:46 AM
To: Tomcat Users List
Subject: Re: Character Encoding problem (umlauts, etc).

Hello Robert!

Robert Priest [EMAIL PROTECTED] wrote:

RP> I am requesting file :
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
RP> but what is coming across in the request is:
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

Probably your browser is sending it that way? I guess it is a bad idea anyway to type anything nasty into the browser's URL input line. You may try to spy on the interaction between browser and server; I have described how to do it in one of the sections of my ancient http://tagunov.tripod.com. Try to find it there, then you'll know for sure what bytes are sent by the browser.

I guess that it is generally a bad idea to have anything nasty in the URL at all. The closest you could get would be to encode it all as %AD etc. But then you need to be sure which encoding this is (UTF-8 or anything else).

So, if these are links from your HTML page, why don't you encode everything in the URL directly on the server side and have

<A href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">

But then, why don't you get rid of these nasty umlauts altogether? Why not use only normal Latin letters or, in case you already heavily use numeric ids, only numeric ids?

Anton

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Character Encoding problem (umlauts, etc).
Robert Priest wrote:

> I have a servlet that catches a request for a file.

How is the request sent? If it is sent via an HTML form, you need to include the accept-charset="UTF-8" attribute in your form tag.

Thomas
RE: Character Encoding problem (umlauts, etc).
This problem can usually be fixed by changing the file.encoding system property. Set CATALINA_OPTS to -Dfile.encoding=utf-8 (or iso-8859-1, or whatever character set you like) and restart Tomcat.

Hope this helps

Andy

-----Original Message-----
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: 08 September 2003 14:18
To: 'Tomcat Users List'
Subject: RE: Character Encoding problem (umlauts, etc).

Thanks for the information, Anton. But simply getting rid of umlauts or other international characters is not an option when clients in other countries use your software and rely on those special characters. We cannot rename user files or change that data. That would be very, very bad :)
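Before relying on the file.encoding suggestion, it may help to confirm what the JVM actually picked up. The following is a minimal sketch, not part of the original thread: the class name DefaultEncodingCheck is made up for illustration, and it only reports the JVM default charset. Whether that default influences how Tomcat decodes the request path depends on the Tomcat and Java versions in use, so treat the result as a diagnostic rather than a fix.

```java
import java.nio.charset.Charset;

// Hypothetical helper (name is an assumption): reports the JVM's default encoding.
class DefaultEncodingCheck {

    // Builds a one-line summary of the file.encoding property and the
    // charset the JVM uses when no charset is passed explicitly.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run with and without -Dfile.encoding=UTF-8 and compare the output.
        System.out.println(describe());
    }
}
```

Running the same check from inside a JSP or servlet (for example by logging the result of describe()) would show whether the CATALINA_OPTS setting reached the Tomcat JVM.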
Re: Character Encoding problem (umlauts, etc).
Hello Robert!

Robert Priest [EMAIL PROTECTED] wrote:

RP> I am requesting file :
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
RP> but what is coming across in the request is:
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

Probably your browser is sending it that way? I guess it is a bad idea anyway to type anything nasty into the browser's URL input line. You may try to spy on the interaction between browser and server; I have described how to do it in one of the sections of my ancient http://tagunov.tripod.com. Try to find it there, then you'll know for sure what bytes are sent by the browser.

I guess that it is generally a bad idea to have anything nasty in the URL at all. The closest you could get would be to encode it all as %AD etc. But then you need to be sure which encoding this is (UTF-8 or anything else).

So, if these are links from your HTML page, why don't you encode everything in the URL directly on the server side and have

<A href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">

But then, why don't you get rid of these nasty umlauts altogether? Why not use only normal Latin letters or, in case you already heavily use numeric ids, only numeric ids?

Anton
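The server-side encoding Anton suggests can be sketched in plain Java. This is an illustration under stated assumptions, not code from the thread: the class PathEncoder and its methods are hypothetical names, UTF-8 is assumed as the agreed encoding, and URLEncoder.encode(String, Charset) requires Java 10 or later. URLEncoder is meant for form data, so the sketch rewrites its "+" for spaces into "%20" to make the result usable in a path.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (name is an assumption): percent-encodes a URL path.
class PathEncoder {

    // Percent-encodes a single path segment in the given charset.
    static String encodeSegment(String segment, Charset charset) {
        // URLEncoder writes a space as '+'; inside a path it must be %20.
        // A literal '+' in the input is already emitted as %2B, so this
        // replacement only touches encoded spaces.
        return URLEncoder.encode(segment, charset).replace("+", "%20");
    }

    // Encodes every segment of a path while keeping the '/' separators.
    static String encodePath(String path, Charset charset) {
        String[] segments = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            out.append(encodeSegment(segments[i], charset));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e4 is 'ä'; the escape keeps this source file pure ASCII.
        String path = "/38CF278C0186B466222FC48571080B83/51/dms00051/\u00e4\u00e4\u00e4.txt";
        System.out.println(encodePath(path, StandardCharsets.UTF_8));
        System.out.println(encodePath(path, StandardCharsets.ISO_8859_1));
    }
}
```

With UTF-8 the file name becomes %C3%A4%C3%A4%C3%A4.txt, with ISO-8859-1 it becomes %E4%E4%E4.txt. The two differ, which is why the page that generates the link and the code that decodes the request have to agree on one encoding.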
RE: Character Encoding problem (umlauts, etc).
This is in a JSP page (which of course becomes a servlet). Do I have to set the encoding in Tomcat perhaps?

-----Original Message-----
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 04, 2003 5:16 PM
To: '[EMAIL PROTECTED]'
Subject: Character Encoding problem (umlauts, etc).

I have a servlet that catches a request for a file. But if that file has characters such as an umlaut in its name (for example: ä), the path info is all wrong. For example:

I am requesting file :
/38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt

but what is coming across in the request is:
/38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

I have tried:

    String requestPathInfo5 = new String(request.getPathInfo().getBytes("ISO-8859-1"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("Unicode"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("UTF8"));
    String requestPathInfo5 = new String(request.getPathInfo().getBytes("UnicodeLittle"));

But none of them return the correct path. Does anyone know what the correct Unicode encoding is? Any other suggestions? I know this problem has been solved before, so if you could point me in the direction of the solution on the web, that is fine.

Thanks in advance.
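The attempts quoted above name a charset only for getBytes and then decode with the platform default, so they cannot undo a specific wrong decoding. The sketch below is an illustration, not something posted in the thread: MojibakeDemo and its methods are hypothetical names, and it assumes the common case where UTF-8 bytes were decoded as ISO-8859-1. It also shows that once the characters have already been replaced by "?", no re-decoding brings them back.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo (name is an assumption): simulates and repairs one
// specific mis-decoding, UTF-8 bytes read as ISO-8859-1.
class MojibakeDemo {

    // Simulates a container that decodes UTF-8 request bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses that mistake: recover the raw bytes, then decode them as UTF-8.
    // Both charsets are named explicitly; nothing depends on the platform default.
    static String repair(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e4 is 'ä'; the escape keeps this source file pure ASCII.
        String original = "\u00e4\u00e4\u00e4.txt";
        String garbled = misdecode(original);

        // The round trip works because ISO-8859-1 maps every byte to a character.
        System.out.println(repair(garbled).equals(original));

        // A path that already contains '?' has lost the original bytes.
        System.out.println(repair("???.txt"));
    }
}
```

If getPathInfo() already returns literal question marks, as in the report above, the information was lost before the servlet saw it, and the fix has to happen earlier: in how the link is encoded or in how the container decodes the request.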
RE: Character Encoding problem (umlauts, etc).
The FAQ ( http://jakarta.apache.org/tomcat/faq ) has a link to a thread on "How to UTF-8 your site", which I think might be similar.

http://marc.theaimsgroup.com/?l=tomcat-user&m=105524426515137&w=2 is the link to the thread itself.

Try some of the things there and see if they work for you (specifically, starting Tomcat with a -Dfile.encoding=UTF-8 switch).

Jeff Tulley ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com

[EMAIL PROTECTED] 9/4/03 3:24:58 PM

> This is in a JSP page (which of course becomes a servlet). Do I have to set the encoding in Tomcat perhaps?