> -----Original Message-----
> From: Joe Tomcat [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, 9 February 2003 12:47
> To: Tomcat Users List
> Subject: Handling non-Latin chars in servlet, jdbc
>
>
> Hello fellow Tomcatists,
>
> It is time for my web app to move beyond the confines of the A-B-Cs.
> This app takes user input from web forms, stores it in
> various fields in a database, and then displays it back in
> various ways. The goal is to have it so that a user can
> enter Japanese or other Asian language chars into the form in
> his browser, the web app stores the form input in the db, and
> later on, displays it back to the browser and the chars show
> up the right way.
>
> It seems like this should be easy. Java is designed for
> multibyte, and I think Postgres can also store multibyte
> chars, but I'm running into a block. My friend in Japan
> entered some chars into a form, and hit submit, and what was
> stored in the db were html entities. Then, when he displayed
> it back to his browser, it was a problem because my output
> code automatically escapes html entities, so what he saw was
> "&48832;" or something, instead of the ji he was expecting.
>
> Does anyone have some tips on this, or pointers to articles
> or books I should be reading about how to do this?
>

First: Make sure that your generated HTML page declares the character encoding you want, either in the Content-Type header or in the page itself. Otherwise the browser may assume a Latin encoding when parsing the page, even if you want Unicode. Things like

    <%@ page contentType="text/html; charset=xyz" %>

or

    <meta http-equiv="content-type" content="text/html; charset=xyz">

might help.
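To show why the declared charset matters, here is a minimal sketch in plain Java (no servlet API needed). It encodes one Japanese character as UTF-8 and then decodes the same bytes once as UTF-8 and once as ISO-8859-1, which is roughly what happens when a browser guesses the wrong encoding. The class and method names are made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tomcat or any library.
final class CharsetDemo {

    // Encode text to bytes with one charset, then decode those bytes
    // with another, as a receiver with the wrong assumption would do.
    static String roundTrip(String text, Charset sentAs, Charset readAs) {
        byte[] wire = text.getBytes(sentAs);
        return new String(wire, readAs);
    }

    public static void main(String[] args) {
        String kanji = "\u5b57"; // U+5B57, the character "ji"

        // Same charset on both sides: the character survives.
        String good = roundTrip(kanji, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Sent as UTF-8 but read as Latin-1: each of the three UTF-8
        // bytes is turned into its own (wrong) character.
        String bad = roundTrip(kanji, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("matching charsets preserved text: " + good.equals(kanji));
        System.out.println("mismatched charsets preserved text: " + bad.equals(kanji));
        System.out.println("chars after mismatch: " + bad.length());
    }
}
```

The same mismatch can occur in either direction, on the page you send and on the form data you receive, so it is worth checking both.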

Second (from my own bad experience ;-)): I use MySQL, which also supports Unicode. But you have to set the encoding you want MySQL to use. Otherwise it tries to determine the encoding from the system's default. I ran into trouble because my development server had a German installation whereas my production machine has an English setup, so I had "Latin" vs. "Latin-1". I wondered what had happened to my German special characters, but the actual problem was that my JDBC driver talked to the database in the wrong encoding. The characters were already broken in my data access classes, not in the JSP or HTML processing. After telling MySQL in the connection URL to use Latin-1 encoding, my problem was gone.

So you should also check whether your problem is on the database/driver side and your characters are already getting garbled there.

Michael
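P.S. A sketch of what "telling MySQL in the connection URL" can look like, assuming the MySQL Connector/J driver, which accepts the `useUnicode` and `characterEncoding` URL properties. The host name, database name, and helper class are hypothetical, and the encoding value is whatever your data actually needs (UTF-8 for Japanese input; I used Latin-1). Other drivers, including the Postgres one, have their own settings, so check their documentation.

```java
// Hypothetical helper, for illustration only.
final class JdbcUrlDemo {

    // Build a MySQL JDBC URL that states the character encoding
    // explicitly instead of relying on the platform default.
    static String mysqlUrl(String host, String database, String encoding) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // "localhost" and "webapp" are placeholder values.
        String url = mysqlUrl("localhost", "webapp", "UTF-8");
        System.out.println(url);

        // With the driver on the classpath and a running server, the URL
        // would be passed to java.sql.DriverManager.getConnection(url, user, password).
    }
}
```

Because the encoding is part of the URL, development and production machines with different system locales end up talking to the database the same way.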
