RE: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Robert Priest
Thanks for the information, Anton. But just getting rid of umlauts or other
international characters is not an option when you have clients in other
countries whose files use those special characters. We cannot rename user
files or change that data. That would be very, very bad :)

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Thomas Kellerer
Robert Priest wrote:

> I have a servlet that catches a request for a file.

How is the request sent?

If it is sent via an HTML form, you need to include the
accept-charset="UTF-8" attribute in your form tag.

Thomas





RE: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Bodycombe, Andrew
This problem can usually be fixed by changing the file.encoding system
property.
Set CATALINA_OPTS to -Dfile.encoding=utf-8 (or iso-8859-1, or whatever
character set you like) and restart Tomcat.

Hope this helps
Andy
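
To check whether the switch actually took effect, a small diagnostic can print what the JVM reports. This is only a sketch: the class name ShowDefaultEncoding is made up for illustration, and inside Tomcat the same two lookups would have to be run (and logged) from a servlet or JSP rather than from main.

```java
import java.nio.charset.Charset;

public class ShowDefaultEncoding {
    // Builds a one-line summary of the encoding settings the JVM is using.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run with e.g. java -Dfile.encoding=UTF-8 ShowDefaultEncoding
        System.out.println(describe());
    }
}
```

If the two values do not match what was passed in CATALINA_OPTS, the option probably never reached the JVM that runs Tomcat.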




Re: Character Encoding problem (umlauts, etc).

2003-09-06 Thread Anton Tagunov
Hello Robert!

Robert Priest [EMAIL PROTECTED] wrote:
RP> I am requesting file :
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
RP> but what is coming across in the request is:
RP> /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

Probably your browser is sending it that way?
I guess it is a bad idea anyway to type anything nasty
into the browser's URL input line.

You may try to spy on the interaction between browser and
server. I have described how to do it in one of the sections
of my ancient http://tagunov.tripod.com; try to find it there,
and then you'll know for sure what bytes are sent by the browser.
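
One way to see the raw bytes, sketched here as an assumption and not as the method described on that site, is to point the browser at a throwaway listener that prints the first chunk of the request in hex. The class name RequestLineDump and port 8081 are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RequestLineDump {
    // Formats the first 'length' bytes as space-separated lowercase hex pairs.
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Accept a single connection, dump what arrives first, then exit.
        try (ServerSocket server = new ServerSocket(8081);
             Socket client = server.accept()) {
            InputStream in = client.getInputStream();
            byte[] buf = new byte[1024];
            int n = in.read(buf);
            if (n > 0) {
                // ISO-8859-1 maps every byte to one char, so nothing is lost here.
                System.out.println(new String(buf, 0, n, StandardCharsets.ISO_8859_1));
                System.out.println(toHex(buf, n));
            }
        }
    }
}
```

Requesting http://localhost:8081/äää.txt from the browser then shows whether the name arrives as c3 a4 pairs (UTF-8), as e4 bytes (ISO-8859-1), as percent escapes, or already as 3f question marks.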

I guess that it is generally a bad idea to have anything
nasty in the URL at all. The closest you could get would be
to encode it all as %AD and so on. But then you have to be
sure which encoding this is (UTF-8 or anything else).

So, if these are links from your HTML page, why don't you
encode everything in the URL directly on the server side and
have
<A href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">
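
A minimal sketch of such server-side encoding, assuming java.net.URLEncoder is acceptable for a single path segment. URLEncoder is meant for form data and turns a space into "+", so the sketch swaps that for %20. The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeFileName {
    // Percent-encodes one path segment using the given charset name.
    static String encodeSegment(String name, String charset)
            throws UnsupportedEncodingException {
        // A literal '+' in the name is already escaped as %2B, so any '+'
        // left in the output stands for a space.
        return URLEncoder.encode(name, charset).replace("+", "%20");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String name = "\u00e4\u00e4\u00e4.txt"; // äää.txt
        System.out.println(encodeSegment(name, "UTF-8"));      // %C3%A4%C3%A4%C3%A4.txt
        System.out.println(encodeSegment(name, "ISO-8859-1")); // %E4%E4%E4.txt
    }
}
```

The two outputs differ, which is why the side that decodes the URL has to assume the same charset as the side that built the link.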

But then, why don't you get rid of these nasty umlauts altogether?

Why not use only normal latin letters, or, in case you heavily use
numeric ids already, use only numeric ids?

Anton





RE: Character Encoding problem (umlauts, etc).

2003-09-04 Thread Robert Priest
This is in a JSP page (which of course becomes a servlet).

Do I have to set the encoding in Tomcat perhaps?



-Original Message-
From: Robert Priest [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 04, 2003 5:16 PM
To: '[EMAIL PROTECTED]'
Subject: Character Encoding problem (umlauts, etc).


 I have a servlet that catches a request for a file.
 
 But if that file has characters such as an umlaut in it (for example: ä),
 the path info is all wrong.
 
 For example:  I am requesting file : 
 
 /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
 
 but what is coming across in the request is:
 
 /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt
 
 
 I have tried:
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("ISO-8859-1"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("Unicode"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("UTF8"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("UnicodeLittle"));
 
 
 But none of them are returning correctly.
 
 Does anyone know which Unicode encoding I should be using?
 
 Any other suggestions?
 
 I know this problem has been solved before, so if you could point me in the
 direction of the solution on the web, that is fine.
 
 Thanks in advance.
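
One note on the attempts above: each of them calls new String(bytes) without naming a charset, so the bytes are decoded with the platform default, whatever that happens to be. The sketch below assumes, without this thread confirming it, that the container decoded UTF-8 bytes as ISO-8859-1. In that case re-encoding as ISO-8859-1 and decoding as UTF-8 restores the name. The class name is hypothetical, and StandardCharsets needs Java 7 or later; on older JVMs the charset names would be passed as strings.

```java
import java.nio.charset.StandardCharsets;

public class PathInfoRecode {
    // Undoes a UTF-8-read-as-ISO-8859-1 mix-up by reversing the wrong decode.
    static String recode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e4\u00e4\u00e4.txt"; // äää.txt
        // Simulate the assumed failure: UTF-8 bytes decoded as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(recode(misdecoded).equals(original)); // true
        // Real '?' characters carry no information, so they stay as they are.
        System.out.println(recode("???.txt")); // ???.txt
    }
}
```

If getPathInfo() already contains literal question marks, as in the example path, the original bytes were lost before the servlet saw them and no re-decoding in the servlet can bring them back. The fix then has to happen earlier, in the browser or in the container's URI decoding.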




RE: Character Encoding problem (umlauts, etc).

2003-09-04 Thread Jeff Tulley
The FAQ ( http://jakarta.apache.org/tomcat/faq ) has a link to a thread on
"How to UTF-8 your site", which I think might be similar.
http://marc.theaimsgroup.com/?l=tomcat-user&m=105524426515137&w=2
is the link to the thread itself.  Try some of the things there and see if they
work for you (specifically, starting Tomcat with a -Dfile.encoding=UTF-8 switch).

Jeff Tulley  ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com
