SSI Servlet Character Encoding Problem

2005-09-14 Thread KEREM ERKAN
Hi people,
 
I use Tomcat 5.5.9 with Apache 2.0.54 and jk 1.2.10 to serve my websites. I
want to set custom error pages to be served when an error like 404, 500 etc.
occurs. The website uses the iso-8859-9 character set on every page, and the
error pages are encoded with iso-8859-9 too.
 
Only *.jsp pages and servlets are mapped to Tomcat through jk. All other
static content is served by Apache.
 
My error pages are all Server Side Include shtml files. These error files
include some jsp files with:
 
<!--#include virtual="file-name" -->
 
I have set up SSI (correctly I hope) on both Apache and Tomcat, and also set
up custom error file handling on both httpd.conf and the default web.xml for
the site I want to customize.
 
On my server.xml file, my site is configured like this:
 
<Host name="www.foo.com" appBase="/usr/www/foo">
  <Context path="" docBase=""/>
  <Context path="/servlet" docBase="servlet"/>
</Host>
 
On my default web.xml for tomcat configuration, SSI servlet has these
directives added:
 
<init-param>
  <param-name>inputEncoding</param-name>
  <param-value>iso-8859-9</param-value>
</init-param>
<init-param>
  <param-name>outputEncoding</param-name>
  <param-value>iso-8859-9</param-value>
</init-param>
 
When I go to some nonexistent foo.html on my website, Apache handles the
customized error page and everything works fine. On the other hand I have
two problems:
 
One is, if the isVirtualWebappRelative variable in web.xml is set to 0,
and I go to a nonexistent foo.jsp on my server, jk sends the request to
Tomcat and when Tomcat tries to handle the customized error page, it cannot
find the included files in SSI and an error is written in the Tomcat error
log like:
 
SEVERE: ssi: #include--Couldn't include file: /include/footer.jsp
java.io.IOException: Couldn't get context for path: /include/footer.jsp

The /include folder is in the server root. I think it should be able to find
these pages, as it looks at the server root because of the state of
isVirtualWebappRelative variable, am I wrong?
 
When I set isVirtualWebappRelative to 1, this problem is solved only for
server root, my servlets (say /servlet) still cannot get the customized page
includes, because this time the SSI includes become web app relative and I
have to copy the /include folder into /servlet/include directory to work
around this problem.
 
My second problem is, when I go to this nonexistent foo.jsp with
isVirtualWebappRelative set to 1, and when Tomcat tries to handle the
error page, the encoding is always UTF-8 regardless of inputEncoding or
outputEncoding variables. So my error page becomes full of garbled
characters because the encoding should be iso-8859-9.
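
The garbling described here is what a charset mismatch looks like at the byte level. The following is only a minimal sketch using plain JDK classes; the class and method names are hypothetical and have nothing to do with Tomcat's SSI servlet. It shows that a Turkish letter written as ISO-8859-9 is a single byte that is never valid UTF-8, so a reader that assumes UTF-8 gets a replacement character instead.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; it only illustrates the byte-level mismatch.
final class CharsetMismatchDemo {

    // Encode text with one charset and decode the bytes with another,
    // which is what happens when the declared encoding is wrong.
    static String misread(String text, String writtenAs, String readAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        String turkish = "\u015f"; // LATIN SMALL LETTER S WITH CEDILLA
        // ISO-8859-9 stores it as the single byte 0xFE, which UTF-8 never uses.
        System.out.println(misread(turkish, "ISO-8859-9", "UTF-8").equals("\uFFFD"));
        // Decoding with the charset that was used for writing restores the text.
        System.out.println(misread(turkish, "ISO-8859-9", "ISO-8859-9").equals(turkish));
    }
}
```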
 
Are there any suggestions for these problems?
 
Best regards,
 
Kerem


Tomcat 5.5 character encoding problem

2004-09-02 Thread Michael Schuerig

Hello!

I've tried my luck with Tomcat 5.5 and found that it behaves differently
from 5.0.27 with regard to character encoding. Somehow the conversion
from ISO-8859-1 to UTF-8 doesn't work as it should. This may well be
due to a misconfiguration. See below for the JSP document and the two
different output pages generated. It looks as if Tomcat 5.5 applies the
ISO-8859-1 to UTF-8 conversion twice. I'm editing the JSP document with
Eclipse 3.1M1 and the text file encoding is set to ISO-8859-1.
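
For readers unfamiliar with the symptom, a conversion applied twice can be reproduced outside Tomcat. This is a hedged sketch with hypothetical class and method names; it does not show what Tomcat 5.5 does internally, only what "ISO-8859-1 to UTF-8 twice" means for the bytes of one umlaut.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of "double encoding"; not taken from Tomcat itself.
final class DoubleEncodingDemo {

    // Encode as UTF-8, misread the bytes as ISO-8859-1, then encode as UTF-8 again.
    static byte[] encodeTwice(String text) {
        byte[] once = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(once, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as upper-case hex so the difference is easy to see.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // a with diaeresis
        System.out.println(hex(umlaut.getBytes(StandardCharsets.UTF_8))); // one conversion
        System.out.println(hex(encodeTwice(umlaut)));                     // conversion applied twice
    }
}
```

A single conversion yields two bytes per umlaut, the doubled one yields four, which a browser then shows as two odd characters instead of one.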

I'm still pretty new to servlets, JSP, and Tomcat, so I might be 
overlooking something rather obvious.

Michael


test.jspx:

<?xml version="1.0" encoding="ISO-8859-1"?>

<html xmlns="http://www.w3.org/1999/xhtml"
 xmlns:jsp="http://java.sun.com/JSP/Page"
 xmlns:c="http://java.sun.com/jsp/jstl/core"
 xmlns:fmt="http://java.sun.com/jsp/jstl/fmt"
 lang='de'>

  <jsp:output doctype-root-element="html"
   doctype-public="-//W3C//DTD XHTML Strict 1.0//EN"
   doctype-system="http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd" />

  <jsp:directive.page contentType="text/html" />
  <jsp:directive.page session="false" />

  <head>
<title>Test</title>
  </head>

  <body>

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
&#160;<br/>

  </body>
</html>


With Tomcat 5.5 (wrong)
http://localhost:8080/myapp/test.jspx:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Strict 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="de"><head><title>Test</title></head><body>

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/></body></html>


With Tomcat 5.0.27 (correct)
http://localhost:8080/myapp/test.jspx:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Strict 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="de"><head><title>Test</title></head><body>

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/></body></html>


-- 
Michael Schuerig                 I was blessed with a birth and a death and
mailto:[EMAIL PROTECTED]         I guess I just want some say in between
http://www.schuerig.de/michael/  --Ani DiFranco, Talk To Me Now

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Robert Priest
Thanks for the information, Anton. But just getting rid of umlauts or other
international characters is not an option when you have clients in other
countries who use your software and whose languages have those special
characters. We cannot rename user files or change that data. That would be
very, very bad :)




Re: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Thomas Kellerer
Robert Priest wrote:

> I have a servlet that catches a request for a file.

How is the request sent?

If it is sent via an HTML form, you need to include the accept-charset="UTF-8"
attribute in your form tag.

Thomas





RE: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Bodycombe, Andrew
This problem can usually be fixed by changing the file.encoding system
property.
Set CATALINA_OPTS to -Dfile.encoding=utf-8 (or iso-8859-1 or whatever
character set you like) and restart Tomcat.
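
To check whether such a setting actually reached the JVM, one can print what the running process reports. This is a small sketch with a hypothetical class name; note that on recent JDKs (18 and later) the default charset is UTF-8 regardless of the locale, so the result depends on the Java version as well as on the option.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic; reports what the running JVM uses by default.
final class DefaultCharsetReport {

    // Combine the system property and the effective default charset in one line.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running it once with and once without -Dfile.encoding=... on the command line shows whether the option has any effect on the JVM in question.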

Hope this helps
Andy




Re[2]: Character Encoding problem (umlauts, etc).

2003-09-08 Thread Anton Tagunov
Hello Robert!

RP Thanks for the information Anton. But just getting rid of umlauts or other
RP international characters is not an option when you have clients that use
RP your software in other countries, that have those special characters. We
RP cannot rename user files or change that data. That would be very, very bad
RP :)

Well, getting rid of them was only my second suggestion.
The first one was that we should make sure they all
get %xy encoded in the url.

So, first of all, have you determined whether
your browser or Tomcat is to blame?

I advise you to spy on the traffic and see for yourself what
your browser is sending.

Is the browser sending non-ascii chars as '?'-s?
Is the browser sending them as raw 8-bit text?
Is the browser %xy-encoding them?

(The last is the effect we want.
The others don't meet our needs.)
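
As an illustration of what %xy-encoding produces for the file name in question, here is a minimal sketch. The class name is hypothetical, and java.net.URLEncoder follows HTML form rules (a space becomes "+"), so it is only an approximation for path segments; the Charset overload used here needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical helper showing %xy-encoding of non-ASCII characters.
final class PercentEncodingDemo {

    // Percent-encode the text using the bytes of the named charset.
    static String percentEncode(String text, String charsetName) {
        return URLEncoder.encode(text, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String name = "\u00e4\u00e4\u00e4.txt"; // three a-umlauts plus extension
        System.out.println(percentEncode(name, "UTF-8"));
        System.out.println(percentEncode(name, "ISO-8859-1"));
    }
}
```

The two results differ (two escaped bytes per umlaut for UTF-8, one for ISO-8859-1), which is why both sides have to agree on the charset behind the escapes.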

As I've said before, the methods to spy on these (the methods that I
know of; there should be others :-) are on my page
at tagunov.tripod.com. If you can supply this info -- what
the browser is actually sending -- we will be able to move further
on with your needs.

Good luck!

Anton





Re: Character Encoding problem (umlauts, etc).

2003-09-06 Thread Anton Tagunov
Hello Robert!

Robert Priest [EMAIL PROTECTED] wrote:
RP I am requesting file :
RP /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
RP but what is coming across in the request is:
RP /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt

Perhaps your browser is sending it that way?
I guess it is a bad idea anyway to type anything nasty
into the browser URL input line.

You may try to spy on the interaction between browser and
server. I have described how to do it in one of the sections
of my ancient http://tagunov.tripod.com; try to find it there,
then you'll know for sure what bytes are sent by the browser.

I guess that it is generally a bad idea to have anything
nasty in the URL at all. The closest you could get would be
to encode it all as %AD etc. But then you should be
sure what encoding this is (UTF-8 or anything else).

So, if these are links from your HTML page, why don't you
encode everything in the URL directly on the server side and
have
<A href="context/38CF278C0186B466222FC48571080B83/51/dms00051/%88%AA.txt">

But then why don't you get rid of these nasty umlauts altogether?

Why not use only normal Latin letters or, in case you already
heavily use numeric ids, only numeric ids?

Anton





Character Encoding problem (umlauts, etc).

2003-09-04 Thread Robert Priest
 I have a servlet that catches a request for a file.
 
 But if the file name has characters such as an umlaut in it (for example: ä),
 the path info is all wrong.
 
 For example:  I am requesting file : 
 
 /38CF278C0186B466222FC48571080B83/51/dms00051/äää.txt
 
 but what is coming across in the request is:
 
 /38CF278C0186B466222FC48571080B83/51/dms00051/???.txt
 
 
 I have tried:
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("ISO-8859-1"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("Unicode"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("UTF8"));
 String requestPathInfo5 = new String(request.getPathInfo().getBytes("UnicodeLittle"));
 
 
 But none of them are returning correctly.
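
 A note on the attempts above: new String(bytes) without a charset argument decodes with the platform default, so the result depends on the machine. The usual re-decoding idiom names a charset on both sides. The sketch below uses a hypothetical helper and assumes the container decoded UTF-8 bytes as ISO-8859-1; if the characters have already been replaced by literal "?" before the servlet sees them, the information is gone and no conversion can bring it back.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; assumes the container decoded UTF-8 bytes as ISO-8859-1.
final class PathInfoRedecoder {

    // Recover the original bytes via ISO-8859-1 (a lossless byte-to-char mapping)
    // and decode them again with the charset the client really used.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three a-umlauts sent as UTF-8 but decoded as ISO-8859-1 arrive as six characters.
        String misdecoded = "\u00c3\u00a4\u00c3\u00a4\u00c3\u00a4.txt";
        System.out.println(redecode(misdecoded).equals("\u00e4\u00e4\u00e4.txt"));
    }
}
```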
 
 Does anyone know what the correct Unicode
 encoding is that I should use?
 
 Any other suggestions?
 
 I know this problem has been solved before, so if you could point me in the
 direction of the solution on the web, that is fine.
 
 Thanks in advance.




RE: Character Encoding problem (umlauts, etc).

2003-09-04 Thread Robert Priest
This is in a JSP page (which of course becomes a servlet).

Do I have to set the encoding in Tomcat perhaps?






RE: Character Encoding problem (umlauts, etc).

2003-09-04 Thread Jeff Tulley
The FAQ ( http://jakarta.apache.org/tomcat/faq ) has a link to a thread on "How to
UTF-8 your site", which I think might be similar.
http://marc.theaimsgroup.com/?l=tomcat-user&m=105524426515137&w=2
is the link to the thread itself. Try some of the things there and see if they work
for you (specifically, starting Tomcat with a -Dfile.encoding=UTF-8 switch).

Jeff Tulley  ([EMAIL PROTECTED])
(801)861-5322
Novell, Inc., The Leading Provider of Net Business Solutions
http://www.novell.com




Re: Character Encoding problem

2002-08-30 Thread Irina Lishchenko

On Wednesday 28 August 2002 13:17, you wrote:
 Hi

 I am using Tomcat 4.0.4 and JDK 1.3.1.

 I am trying to read a parameter in Hebrew from the URL but get '???'. Writing
 Hebrew to the browser works fine.

 I cannot use req.setCharacterEncoding(java.lang.String env) (the code does not
 compile when I use it).

 Is there a way around this? I am very flexible in choosing the JDK and
 Tomcat version to work with, but they need to be release versions and not
 beta or something like that.


You can use the following form, with the right encoding for you. The W3C has
provided the accept-charset attribute for the form element. Then your request
object will have the right encoding too.


<form action="action.jsp" method="POST" accept-charset="ISO-8859-5">
                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

</form>

ilis





Re: Character Encoding problem

2002-08-29 Thread Bill Barker


Nehemia Litterat [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]...

 Hi

 I am using Tomcat 4.0.4 and JDK 1.3.1.

 I am trying to read a parameter in Hebrew from the URL but get '???'. Writing
 Hebrew to the browser works fine.

 I cannot use req.setCharacterEncoding(java.lang.String env) (the code does not
 compile when I use it).


This is almost certainly due to having an older version of servlet.jar in
your classpath. One particular gotcha is having an older version of j2ee.jar
and/or servlet.jar in $JAVA_HOME/jre/lib/ext. If this is the case, kill it.
Otherwise, try compiling with:
javac -classpath $CATALINA_HOME/common/lib/servlet.jar:$CLASSPATH MyServlet.java

 Is there a way around this? I am very flexible in choosing the JDK
 and Tomcat version to work with, but they need to be release versions and not
 beta or something like that.

<blatant-plug>
Tomcat 3.3.1 has excellent charset support, especially if you are willing to
use its non-portable features.
</blatant-plug>
However, once you solve your compilation problem, 4.x should do everything
that you need it to do (and portably as well :).

 Thanks in advance

 Nehemia Litterat





 -
 Do You Yahoo!?
 Yahoo! Finance - Get real-time stock quotes









Character Encoding problem

2002-08-28 Thread Nehemia Litterat


Hi 

I am using Tomcat 4.0.4 and JDK 1.3.1.

I am trying to read a parameter in Hebrew from the URL but get '???'. Writing Hebrew to
the browser works fine.

I cannot use req.setCharacterEncoding(java.lang.String env) (the code does not compile
when I use it).
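
For reference, the effect of the decoding charset on a %xy-encoded parameter can be tried outside a servlet container. This is a hedged sketch with a hypothetical class name, using java.net.URLDecoder from a current JDK (the Charset overload needs Java 10 or later, so it is not available on JDK 1.3); the encoded value is a four-letter Hebrew word in UTF-8 chosen only as an example.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;

// Hypothetical helper; shows that the decoding charset must match the sender's.
final class QueryValueDecoder {

    // Decode a percent-encoded value, interpreting the bytes in the named charset.
    static String decode(String encoded, String charsetName) {
        return URLDecoder.decode(encoded, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String encoded = "%D7%A9%D7%9C%D7%95%D7%9D"; // four Hebrew letters, UTF-8 encoded
        System.out.println(decode(encoded, "UTF-8").length());      // right charset: 4 characters
        System.out.println(decode(encoded, "ISO-8859-1").length()); // wrong charset: 8 characters
    }
}
```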

Is there a way around this? I am very flexible in choosing the JDK and Tomcat
version to work with, but they need to be release versions and not beta or something
like that.

Thanks in advance

Nehemia Litterat

 





Re: Character Encoding problem

2002-08-28 Thread Fabio Mengue

Perhaps something like this in tomcat/bin/setclasspath.sh:

JAVA_OPTS=-Dfile.encoding=ISO-8859-1

(with your ISO configuration, of course).

Fabio.

  


-- 
Fabio Mengue - Centro de Computacao - Unicamp
[EMAIL PROTECTED]   [EMAIL PROTECTED]
Quem se mata de trabalhar merece mesmo morrer. - Millor







character encoding problem

2002-07-18 Thread Winter, Wolfgang

hi,

For two weeks I have been fighting a character encoding problem without
success: I send a JSP form to a Tomcat 4.03 servlet and log the form input
with log4j into a file and into a database. Running on a German server with
Suse Linux, everything works fine. Now I have installed Tomcat and the servlet on a
Slackware server in the US, and all special characters are logged as
question marks. I tried everything I could get my hands on in the mailing list
archive, without success:

- <%@ page contentType="text/html;charset=iso-8859-1" %> in the JSP

- <meta http-equiv="Content-Type" value="text/html; charset=iso-8859-1"> in
the JSP

- start tomcat with CATALINA_OPTS=-Dfile.encoding=iso-8859-1

- set environment variables LC_ALL=de;export LC_ALL and LANG=de;export LANG
(locale -a on slackware yields de, deutsch or german)

- request.setCharacterEncoding("iso-8859-1"); as first statement in
servlet.doPost()

- convert form parameters like:
 byte[] bytes = param.getBytes("iso-8859-1");
 String convertedParam = new String( bytes, "iso-8859-1" );
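
A remark on the last item: encoding and decoding with the same charset cannot repair anything, because it hands back the string it was given. The sketch below (hypothetical class name, plain JDK only) demonstrates this for an intact umlaut and for a value that has already turned into a question mark.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo; shows that a same-charset conversion changes nothing.
final class RoundTripDemo {

    // Encode and decode with the same charset, as in the conversion above.
    static String sameCharsetRoundTrip(String param) {
        byte[] bytes = param.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String intact = "\u00e4";  // a-umlaut that arrived correctly
        String alreadyLost = "?";  // character that was replaced earlier
        System.out.println(sameCharsetRoundTrip(intact).equals(intact));
        System.out.println(sameCharsetRoundTrip(alreadyLost).equals(alreadyLost));
    }
}
```

A conversion only helps when the two charsets differ and the first one matches the charset that was wrongly used for decoding.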

I tried everything with iso-8859-1 and with UTF-8.
With a Perl script I have no problems handling the form input, so it must be
a Java/Tomcat problem. What else could I do to change the platform's default
encoding?

please help
Wolfgang








Character encoding problem: strange rash

2002-04-15 Thread John Wadkin

All,

Tomcat 4.0.1
Apache 1.3
WARP
Solaris 8
JDK 1.2/1.3

Does anyone know why a servlet would suddenly start displaying non-breaking
spaces (&#160;) as question marks (?) when the JDK/SDK is upgraded from 1.2
to 1.3? Very odd behaviour! Like a rash - question marks all over the place
:)
The servlet was written for a version of the Servlet API earlier than 2.3, so it
doesn't set the character encoding. BUT I don't see how this would have any bearing
on the problem, since the JDK has nothing to do with the Servlet API?
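
One possibility worth checking (an assumption, not a diagnosis of this setup) is that the output is written with an encoding that has no non-breaking space, such as US-ASCII, in which case Java substitutes a question mark. The sketch below uses a hypothetical class name and only JDK classes to show the substitution.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical demo; shows what a Writer emits for U+00A0 in different charsets.
final class NbspEncodingDemo {

    // Write text through a Writer with the named charset and return the bytes as hex.
    static String writeAsHex(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : out.toByteArray()) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String nbsp = "\u00a0"; // non-breaking space
        System.out.println(writeAsHex(nbsp, "ISO-8859-1")); // representable
        System.out.println(writeAsHex(nbsp, "US-ASCII"));   // replaced by '?'
    }
}
```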

Thanks,

John





How to solve the character encoding problem with JSP

2001-07-10 Thread Andre Tampold


Hello all Tomcat users!

I'm new to Tomcat. I installed it yesterday and it seems to be running very
well.

Only one question has arisen, which has probably been answered already, but I did
not find an answer to it in the archive.

Currently in production we have JavaWebServer2.0 / Win NT, JDK1.3 and this
works very well for solving the encoding problem with JSP:


///
//
//  How to solve the character encoding problem with JSP
//  
//
// 1. Comment out this line in generated file's _jspService method:
// JspWriter out = null;

// 2. Insert this line where you commented out number 1:
// PrintWriter out = response.getWriter();

// 3. Comment out this line in generated file's _jspService method:
// out = pageContext.getOut();

// 4. At the end of the _jspService method you find:
// } catch (Throwable t) {
//  * if (out.getBufferSize() != 0)
//  * out.clear();
//  Lines marked with * shall be commented out.
//
// 5. After lines mentioned in point 4 you find:
//  HandleErrorPageException("...errorpage.jsp", t, out);
//Change it to:
//  HandleErrorPageException("...errorpage.jsp", t, pageContext.getOut());

///

Actually there are some Lexicon files in Cp1251, Cp1257 and UTF-8 encodings,
which need to come through JSP to the browser. With JavaWebServer2.0 this is OK
if we do as described in steps 1-5.

How do I do it with Tomcat? I did the same, but it seems to have no effect at
all: only question marks or unreadable characters. What is missing?

Thanks in advance for any kind of info/URL/help,

Andre Tampold
Software Development @ EMK




Character Encoding Problem

2001-07-02 Thread atumer

Hi,

I know that this is a popular (!?) problem in Tomcat. But despite my efforts I could
not find any solution. Here it goes:

We have a JSP page in the ISO-8859-9 encoding. With the line
<%@ page contentType = "text/html; charset=ISO-8859-9" %>
we define the encoding type of the document.

Result:
The strings in the JSP file are encoded correctly, like
- the text in HTML
- the text written inside JSP code between <% and %>

Problem:
The strings that are received from a class, read from a file, or held in a constant are
not encoded correctly.

<%= "Pretend to be a ISO8859-9 string" %> is encoded correctly but,
inside a class, let's say we have
String x = "Pretend to be a ISO8859-9 string";
then
<%= myClass.x %> is encoded wrong with many ?'s

The problem is not in the JVM/JDK, I think, as when I read a file and write the content into
another file, both files are the same with no loss in encoding.

Also the encoding problem is there when I include files with
<jsp:include ..>
If I include the file with <%@ include .. %> the problem is
not there .. But I really need to use jsp:include ..

Can anyone solve my problem ?? ..

Arif Tumer ..




RE: Character Encoding Problem

2001-07-02 Thread Tõnu Põld

Hi,

You should compile the Java classes with ISO-8859-9 encoding.
Look at the -encoding flag of the 'javac' compiler.

During compilation the 8-bit characters in strings are converted to Unicode
characters.
By default the encoding is probably ISO-8859-1.
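
Why the source encoding matters can be seen from a single byte. The sketch below is only an illustration with a hypothetical class name, not a description of javac internals: the same byte value becomes a different Unicode character depending on the charset used to read it, which is the kind of difference the -encoding flag controls for string literals in source files.

```java
import java.nio.charset.Charset;

// Hypothetical demo of how one source byte becomes different characters.
final class SourceEncodingDemo {

    // Decode a single byte value with the named charset.
    static char decodeByte(int value, String charsetName) {
        byte[] single = { (byte) value };
        return new String(single, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // 0xFE is "s with cedilla" in ISO-8859-9 but "thorn" in ISO-8859-1.
        System.out.println(decodeByte(0xFE, "ISO-8859-9") == '\u015f');
        System.out.println(decodeByte(0xFE, "ISO-8859-1") == '\u00fe');
    }
}
```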

Regards,
Tõnu





Re: RE: Character Encoding Problem

2001-07-02 Thread atumer

 From: Tõnu Põld [EMAIL PROTECTED]
 Date: 2001/07/02 Mon AM 10:48:10 GMT+03:00
 To: "'[EMAIL PROTECTED]'" [EMAIL PROTECTED]
 Subject: RE: Character Encoding Problem
 
 Hi,
 
 You should compile the Java classes with ISO-8859-9 encoding.
 Look at the -encoding flag of the 'javac' compiler.
 
 During compilation the 8-bit characters in strings are converted to Unicode
 characters.
 By default the encoding is probably ISO-8859-1.
 
 Regards,
 Tõnu
 

I tried it a few minutes ago, but the problem remains. I don't
know the usage of the -encoding parameter in detail, but
I think it relates to the string constants in the
source files. The main problem is the encoding when I read
a text file and show its contents in the JSP with something like

<%= FileReader.readLine() %>

And as I mentioned, when I load a file and write it into another file, the encoding is
correct. The results of my string manipulation functions are correct. The single,
devastating problem occurs when I pass a string with characters in ISO-8859-9 from a
class to a JSP.

Ah, jsp:include is also still a problem .. :(

Hmm .. It may be a stupid thought, but does it have something to do with dynamic strings vs.
static strings? As far as I know, when <%@ include .. %> is used the file is statically
included at compile time, but <jsp:include ..> may include the file dynamically after
compilation. Also, the string constants in JSP files are compiled, but strings read
from files are read and generated at execution time .. Hmm .. I think I am going paranoid :) ..

Thanks, anyway .. Has anyone encountered this kind of problem before ??

Arif ..




RE: Character Encoding Problem

2001-07-02 Thread ohamali

Hi,

  Another problem related to the charset type occurs when I use the following code
 
  strErrorMsg = "a message using ISO-8859-9";
  <INPUT TYPE="HIDDEN" Name="errormsg" VALUE="<%=strErrorMsg%>">

and post the form: the receiving JSP file does not print the
strErrorMsg variable correctly.

How can this problem be solved?

Thanks,

Oner Necip Hamali.
 




RE: İlgi:RE: Character Encoding Problem

2001-07-02 Thread Tõnu Põld

Just a thought about the <jsp:include ...> problem:
have you tried placing the <%@ page contentType=... %> directive in each file you include?


When a file is read with FileReader, the platform's default character encoding
is used.
I think you must specify your own encoding when reading the file.

Sun's Javadoc says about the FileReader class:
"Convenience class for reading character files. The constructors of this
class assume that the default character encoding and the default byte-buffer
size are appropriate. To specify these values yourself, construct an
InputStreamReader on a FileInputStream."
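
A minimal sketch of what the Javadoc suggests, written in current Java syntax (try-with-resources did not exist in 2001). The class name Latin5FileRead and the helper readFirstLine are made up for illustration, and the sketch assumes the JVM ships the ISO-8859-9 charset, which the Java specification does not guarantee:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

class Latin5FileRead {
    // Assumption: this JVM provides ISO-8859-9 (it is not a mandatory charset).
    static final Charset LATIN5 = Charset.forName("ISO-8859-9");

    // Reads the first line of the file, decoding its bytes as ISO-8859-9
    // instead of relying on the platform default used by FileReader.
    static String readFirstLine(File file) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), LATIN5))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Write three Turkish letters as raw ISO-8859-9 bytes
        // (s-cedilla, g-breve, dotless i), then read them back.
        File tmp = File.createTempFile("latin5", ".txt");
        tmp.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[] {(byte) 0xFE, (byte) 0xF0, (byte) 0xFD, (byte) '\n'});
        }
        String line = readFirstLine(tmp);
        // Print code points rather than glyphs so the console encoding does not matter.
        for (char c : line.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If the decoding is right, the code points printed are U+015F, U+011F and U+0131, whatever the platform default encoding is.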

Tõnu




RE: Character Encoding Problem

2001-07-02 Thread atumer

 From: Tõnu Põld [EMAIL PROTECTED]
 Date: 2001/07/02 Mon AM 11:55:51 GMT+03:00
 To: "'[EMAIL PROTECTED]'" [EMAIL PROTECTED]
 Subject: RE: İlgi:RE: Character Encoding Problem
 
 Just a thought about the <jsp:include ...> problem:
 try to place the "<%@ page contentType=... %>" in each file you include?

The line is there, in all files :) ..

 
 
 When reading bytes from file with FileReader the default character encoding
 is used.
 I think you must specify your own encoding when reading the file.


I'll try that. But the same compiled classes and the same JDK version work well with
the Resin JSP Server and the same files. The problem occurs with Tomcat.

Thanks,

Arif ..




RE: Character Encoding Problem

2001-07-02 Thread atumer

  When reading bytes from file with FileReader the default character encoding
  is used.
  I think you must specify your own encoding when reading the file.
 
 
 I'll try that. But the same compiled classes and the same JDK version work well
with the Resin JSP Server and the same files. The problem occurs with Tomcat.
 

No, still problems. A lot of ?'s in the visual output. Hmpf. The interesting point is that
if I have an HTML form with ISO-8859-9 encoded characters, post it to a JSP file, and
write the parameters to a text file on the system, the text is correct!!

The problem seems to occur when displaying strings obtained from inside a class ..
Has anyone seen display problems with other character sets ?? .. Any more ideas ?? ..

To recap the problem: we cannot display ISO-8859-9 encoded string constants of a
class, or strings read from a file, in the correct encoding in JSP documents.
Using <%@ page contentType=... %> does not solve it.
Using javac -encoding does not solve it.
Using explicit encodings in the java.io file read/write routines does not solve it.

The problem is with Tomcat; there are no problems with Resin in the same environment
(OS/JDK).

Arif ..




RE: Character Encoding Problem

2001-07-02 Thread Tõnu Põld

Hi,

I still believe your initial bytes are being converted to Java strings (Unicode)
using the wrong encoding.

If you have a string created from bytes using the ISO-8859-9 encoding, and
if the JSP page has the directive
<%@ page contentType="text/html; charset=ISO-8859-9" %>, then
it should be OK.
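
One possible explanation for the ?'s, sketched outside Tomcat: if correctly decoded Turkish characters are written out as ISO-8859-1 rather than ISO-8859-9, the letters that ISO-8859-1 lacks are replaced by '?'. The class name OutputEncodingCheck and the helper hex are hypothetical, the sample text is my own, and whether this is what happens in your setup is only a guess:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class OutputEncodingCheck {
    // Assumption: this JVM provides ISO-8859-9 (it is not a mandatory charset).
    static final Charset LATIN5 = Charset.forName("ISO-8859-9");

    // Formats bytes as space-separated upper-case hex, e.g. "FE F0 FD".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // s-cedilla, g-breve, dotless i, written as escapes so that the
        // encoding of this source file does not matter.
        String text = "\u015F\u011F\u0131";

        // Encoded with the matching charset, each letter becomes one byte.
        System.out.println(hex(text.getBytes(LATIN5)));                      // FE F0 FD

        // ISO-8859-1 has none of these letters, so each becomes '?' (0x3F).
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1))); // 3F 3F 3F
    }
}
```

A '?' therefore points to a loss while encoding the output, whereas characters such as þ, ð or ý in place of the Turkish letters would point to a wrong decoding of the input.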

For debugging, you could try converting your string to another encoding and see
what happens.
For example:

<%@ page contentType="text/html; charset=ISO-8859-9" %>
<% String s = new String(initialString.getBytes("ISO-8859-1"), "ISO-8859-9"); %>
<%= s %>

If this displays your string correctly, then the ISO-8859-1 encoding was used
when the Java string was created from the initial bytes!
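
The same check can be tried outside a JSP. This sketch simulates the mistake (ISO-8859-9 bytes decoded as ISO-8859-1) and then undoes it. RedecodeCheck and redecode are hypothetical names, the sample bytes are my own, and it assumes the JVM provides the ISO-8859-9 charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RedecodeCheck {
    // Assumption: this JVM provides ISO-8859-9 (it is not a mandatory charset).
    static final Charset LATIN5 = Charset.forName("ISO-8859-9");

    // Recovers the original bytes of a string that was wrongly decoded as
    // ISO-8859-1 and decodes them again as ISO-8859-9.
    static String redecode(String wronglyDecoded) {
        byte[] raw = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, LATIN5);
    }

    public static void main(String[] args) {
        // Raw ISO-8859-9 bytes for s-cedilla, g-breve, dotless i.
        byte[] original = {(byte) 0xFE, (byte) 0xF0, (byte) 0xFD};

        // The suspected mistake: decoding them as ISO-8859-1 gives thorn, eth, y-acute.
        String wrong = new String(original, StandardCharsets.ISO_8859_1);

        String fixed = redecode(wrong);
        for (char c : fixed.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The round trip loses nothing only because ISO-8859-1 maps every byte value from 0x00 to 0xFF to a character. It cannot repair a string in which characters have already been replaced by '?'.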

By the way, which version of Tomcat are you using? An older release (3.2.1)
had some bugs with encoding conversion. Try the latest release, 3.2.2.

The request parameters from an HTTP POST are probably decoded as ISO-8859-1,
because most browsers do not specify the encoding when submitting a request,
so Tomcat uses its default encoding. To convert them to correct Java
strings, the following could be used (assuming that the bytes really are
ISO-8859-9):
String param = new String(initialParam.getBytes("ISO-8859-1"),
"ISO-8859-9");

Regards,
Tõnu


