Hi all.
I'm developing a web application that serves textual data for Central and Eastern European languages.
The text lives in a database which is internally Unicode; I also have another DB instance
which is ISO-8859-2 encoded, so all options are in play.
I thought I could set contentType="text/html; charset=ISO-8859-2", declare the page
encoding to match (just in case), and "sit back and enjoy myself". Unfortunately, I was wrong.
Not only is the Latin-2 support in both IE and Netscape buggy (they won't display
"s-caron" and "z-caron", but show the capital versions of those characters instead),
but Java is bugging me, too. Instead of the letters specific to our alphabet, I'm getting "?".
With the help of a dedicated PostgreSQL JDBC developer, I have tracked this problem
down to the JVM, whose default encoding is "ISO-8859-1". In a standalone Java
application I can do the conversion explicitly, like this:
System.out.write( testString.getBytes( "ISO-8859-2" ) );
and it will print the characters I expect, instead of "?".
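For reference, here is a minimal standalone sketch of that experiment (the class name and the test characters are my own; š is U+0161 and ž is U+017E, which map to bytes 0xB9 and 0xBE in Latin-2):

```java
import java.io.UnsupportedEncodingException;

public class Latin2Test {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // What the JVM uses when no encoding is given explicitly --
        // on my setup this reports ISO-8859-1
        System.out.println(System.getProperty("file.encoding"));

        // s-caron and z-caron, the characters that come out as "?"
        String testString = "\u0161\u017e";

        // Explicit Unicode -> Latin-2 conversion; yields bytes 0xB9 and 0xBE
        byte[] latin2 = testString.getBytes("ISO-8859-2");
        System.out.write(latin2, 0, latin2.length);
        System.out.flush();
    }
}
```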
What do I do in Tomcat?
I have set contentType to "text/html; charset=ISO-8859-2", and the generated servlet
code really does contain:
response.setContentType("text/html; charset=ISO-8859-2");
So, no trouble there. How do I get a (Unicode) string converted into an ISO-8859-2
encoded byte stream? Because that, eventually, is what the browser should receive. I
cannot use the method above, since JspWriter doesn't accept byte[] as an
argument.
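The only workaround I can think of (untested, and admittedly a hack -- the helper name is one I made up) is to exploit the default encoding instead of fighting it: convert the string to Latin-2 bytes, then wrap those bytes back into a String as if they were ISO-8859-1, so that the writer's implicit ISO-8859-1 encoding step emits the Latin-2 byte values unchanged:

```java
import java.io.UnsupportedEncodingException;

public final class Latin2Hack {
    // Re-encode so that a writer applying the JVM default ISO-8859-1
    // encoding will put the ISO-8859-2 byte values on the wire.
    public static String toLatin2(String s) throws UnsupportedEncodingException {
        return new String(s.getBytes("ISO-8859-2"), "ISO-8859-1");
    }
}
```

In the JSP that would be something like out.print(Latin2Hack.toLatin2(dbString)); -- ugly, and I'd still prefer a proper way to tell Tomcat which encoding its writer should use.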
Nix.