Re: Encoding in Tomcat 6

2007-05-22 Thread hkml
Georg Sauer-Limbach wrote:
 I do not think it is very obvious, that the response class is writing
 the characters using the platform's default encoding in this case
 
 Yes. And this is true for many, many places in the
 Java library. Always watch out if you see some
 String being processed using a Stream.
Yes, in general I take care of that, but in this case:
The response (which knows what encoding I want) gives me a special stream
where I find a method println( String s ). Why on earth should they
guess a character encoding for character output then?
Nevertheless: they documented this behaviour in the API docs, so it must
be okay. Strangely enough, it worked correctly in older Tomcat versions.

 The ServletOutputStream shouldn't have all these
 print methods, at least not the one for String.
The word deprecated comes to my mind :-)

Cheers and thanks again,
Heinz

-
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Encoding in Tomcat 6

2007-05-22 Thread hkml
uzi wrote:
 i liked this article regarding encoding:
 http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/index.html
Thanks for the hint. Looks nice.

Cheers,
Heinz




Encoding in Tomcat 6

2007-05-21 Thread hkml
Hi all,

I noticed some encoding problems inside servlets when switching from
Tomcat 5.5.20 to Tomcat 6.0.10. I searched the mailing lists
but didn't find anything relevant.


Scenario:
A custom servlet (that is: a class derived from HttpServlet) creates
very simple HTML output, containing (besides the necessary HTML tags
<html>, <body> etc.) just some German special characters (ä ö ü).

The Java source code is UTF-8, and the response instance is configured via
  response.setContentType( "text/html;charset=UTF-8" );
Just for safety I also added
  response.setCharacterEncoding( "UTF-8" );

The created HTML text contains a meta tag
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Nevertheless: when calling the corresponding URL, none of the special
characters is displayed correctly in the browser (Firefox) when
using Tomcat 6. If I switch the encoding of the displayed page to
ISO-8859-1 in Firefox, the characters are displayed correctly. That is:
it seems to me that everything is okay with the servlet, except that the
encoding used for the response is ISO-8859-1 instead of UTF-8.

When using Tomcat 5.5 everything is displayed correctly as UTF-8. Java
Server Pages do _not_ show similar behaviour.

Has anyone experienced similar problems?




Re: Encoding in Tomcat 6

2007-05-21 Thread hkml

[EMAIL PROTECTED] wrote:
 html,body etc.) just some German special characters (ä ö ü).
Sorry for that encoding problem, it should read ä ö ü. I first sent the
message from a different mail address. Then I got a response from the
list server saying that I'm not allowed to send messages to this list,
and afterwards I simply copied and pasted the text from the returned
email. The returned email already contained the wrong characters, because
the mailer daemon answers without setting the encoding in the mail
header properly.




Re: Encoding in Tomcat 6

2007-05-21 Thread hkml

Markus Schönhaber wrote:
 Works fine for me.
Well, that is really a surprise for me. I tried this on 3 different
operating systems and it was consistently wrong.


 You do call response#setContentType before response#getWriter, don't you?
 There's no filter changing things?
Well, the code is more or less trivial: the class extends HttpServlet
and overrides method doGet like this:


@Override
protected void doGet( HttpServletRequest request,
                      HttpServletResponse response )
    throws ServletException, IOException
{
    response.setContentType( "text/html;charset=UTF-8" );
    response.setCharacterEncoding( "utf-8" );

    ServletOutputStream out = response.getOutputStream();
    out.println( "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 " +
                 "Strict//EN\" \"http://www.w3.org/TR/xhtml1" +
                 "/DTD/xhtml1-strict.dtd\">" );
    out.println( "<html><head>" );
    out.println( "<meta http-equiv=\"Content-Type\" " +
                 "content=\"text/html; charset=utf-8\" />" );
    out.println( "</head>" );
    out.println( "<body>" );
    out.println( "<p>Just an encoding test: ä ö ü Ä Ö Ü ß</p>" );
    out.println( "</body>" );
    out.println( "</html>" );
}

That's all.

 BTW: I consider LiveHTTPHeaders an incredibly useful Firefox extension
 when it comes to finding out which headers the server really sends.
Thanks for the hint, I just installed it. Nevertheless, in this case the
HTTP header must be innocent: Firefox is using encoding UTF-8 to show
the page, which is absolutely correct. The problem is simply that the
characters are encoded as ISO-8859-1 (probably by the response
output stream).
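
The mismatch described here can be reproduced outside Tomcat. The
following is only a sketch in plain Java (assuming Java 7 or later for
StandardCharsets; the class and method names are made up for
illustration). It encodes a string as ISO-8859-1 and then decodes the
same bytes as UTF-8, which is roughly what happens when the bytes go out
as ISO-8859-1 while the browser reads the page as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    /** Encodes text as ISO-8859-1, then decodes those bytes as UTF-8, as the browser would. */
    static String latin1BytesReadAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // the character ä, escaped so the source file encoding does not matter

        // ä is one byte (0xE4) in ISO-8859-1 but two bytes (0xC3 0xA4) in UTF-8.
        System.out.println(umlaut.getBytes(StandardCharsets.ISO_8859_1).length); // prints 1
        System.out.println(umlaut.getBytes(StandardCharsets.UTF_8).length);      // prints 2

        // A lone 0xE4 is not valid UTF-8, so the decoder substitutes the
        // replacement character U+FFFD instead of ä.
        String seenByBrowser = latin1BytesReadAsUtf8(umlaut);
        System.out.println(seenByBrowser.equals(umlaut)); // prints false
    }
}
```

Pure ASCII text survives this round trip unchanged, which is why only
the special characters look wrong in the browser.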


Best regards,
hk




Re: Encoding in Tomcat 6

2007-05-21 Thread hkml

Markus Schönhaber wrote:

... ServletOutputStream is suitable for writing binary data in the
response as the docs say. If you want to transmit textual data, use
HttpServletResponse#getWriter() (see my question above).
Yes, this really is the point; Georg's answer already pointed me in the
right direction. Nevertheless I must say it's not obvious to me what
'writing binary data in the response' means. I would have
expected that setting the response's character encoding plus writing to
its output stream would get the encoding right. The Java string cannot
be written as is (because it's Java's internal representation of a
string), and the conversion to bytes somewhere in the dark behind
the response class could be done correctly, because I did set the encoding.


Well, nevertheless, it works with the getWriter() method as I already 
checked. Thank you very much for your help.
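
Why the Writer route helps can be sketched without the servlet API. The
example below is an illustration only, not Tomcat's implementation: a
ByteArrayOutputStream stands in for the response's byte stream, and the
class and method names are hypothetical. The point is that a Writer
wrapped around a byte stream carries an explicit charset, so the
character-to-byte conversion is no longer left to chance.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetWriter {
    /** Writes text through a Writer that encodes as UTF-8 and returns the raw bytes. */
    static byte[] writeAsUtf8(String text) {
        // Stand-in for the byte stream behind the response (an assumption for this sketch).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        out.print(text);
        out.flush(); // push buffered characters through the encoder
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeAsUtf8("\u00e4"); // the character ä
        for (byte b : bytes) {
            System.out.printf("%02X ", b & 0xFF); // prints C3 A4
        }
        System.out.println();
    }
}
```

In a servlet, response.getWriter() plays the role of the PrintWriter
here, with the charset taken from the response's character encoding.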


Regards,
hk




Re: Encoding in Tomcat 6

2007-05-21 Thread hkml

Georg Sauer-Limbach wrote:

the question is: How do you create the output of
the servlet, that is, with which Writer or OutputStream.

Yes, you're right: I simply used the output stream.


But if you just obtain the output byte stream of the servlet,
ie by calling

  OutputStream outputStream = response.getOutputStream();

and you use this stream to output character data, then the
call to response.setCharacterEncoding() is completely useless.
You're obviously right; I tried using getWriter() and things work as
expected. But it is as I said in my last answer to Markus Schönhaber: I
do not think it is very obvious that the response class is writing the
characters using the platform's default encoding in this case (the
correct encoding is well known!).
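
To see what "platform's default encoding" means on a given machine, a
small check like the following may help. It is a sketch in plain Java
(Java 7 or later assumed, class and method names invented for
illustration) and says nothing about what ServletOutputStream does
internally. It only shows that String.getBytes() without an argument
depends on the JVM's default charset, so the same code can produce
different bytes on different systems.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultCharsetDemo {
    /** True if String.getBytes() with no argument yields the same bytes as the given charset. */
    static boolean defaultEncodingMatches(String text, Charset charset) {
        return Arrays.equals(text.getBytes(), text.getBytes(charset));
    }

    public static void main(String[] args) {
        String umlauts = "\u00e4 \u00f6 \u00fc"; // ä ö ü

        // The results below vary with the operating system, locale and JVM settings.
        System.out.println("Platform default: " + Charset.defaultCharset());
        System.out.println("Same bytes as UTF-8?      "
                + defaultEncodingMatches(umlauts, StandardCharsets.UTF_8));
        System.out.println("Same bytes as ISO-8859-1? "
                + defaultEncodingMatches(umlauts, StandardCharsets.ISO_8859_1));
    }
}
```

Passing the charset explicitly, as in getBytes(StandardCharsets.UTF_8),
removes that dependency.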


Nevertheless I checked the javadoc and it correctly says that
ServletOutputStream is just for binary output (whatever the use of
binary data in a website is).



Then it only counts what you do write to this stream yourself.
...

Hope this helps.

Yes, it really did. Thank you very much.

Heinz

