Ilic Aleksandar writes:
> Hi
>
> I am trying to convert some German-specific characters to some other
> characters from English. I am using the following code:
>
> String menu_text = "some text on german queried from MySQL";
>
> char repGer[] = {32,47,223,228,235,246,252};
> char repEng[] = {95,95,115,97,101,111,117};
>
> mmImg = menu_text.toLowerCase();
>
> for (int J = 0;J<repGer.length;J++) {
> mmImg = mmImg.replace(repGer[J],repEng[J]);
> }
Suggestion: You might want to make this easier to make sense of
by using the actual characters; for example, 'ß' is a lot easier to
grok than 223. That's not the problem, though.
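
A minimal sketch of that suggestion, using the same pairs as the numeric arrays in the question. The class and method names are made up for illustration, and Unicode escapes stand in for the literal characters so that the source file's own encoding can't get in the way:

```java
public class Transliterate {
    // Same pairs as the numeric arrays in the question, written as characters:
    // space, slash, sharp s, a-umlaut, e-diaeresis, o-umlaut, u-umlaut.
    private static final char[] FROM = {' ', '/', '\u00df', '\u00e4', '\u00eb', '\u00f6', '\u00fc'};
    private static final char[] TO   = {'_', '_', 's', 'a', 'e', 'o', 'u'};

    // Lower-cases the text and applies each single-character replacement in turn.
    static String toUrlText(String text) {
        String result = text.toLowerCase();
        for (int i = 0; i < FROM.length; i++) {
            result = result.replace(FROM[i], TO[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Input is "Gruesse aus Koeln" spelled with the real German characters.
        System.out.println(toUrlText("Gr\u00fc\u00dfe aus K\u00f6ln"));
    }
}
```
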
Related to that, you may want to be really careful about how characters
are encoded. I'm guessing here that the data in the database is
Latin1 (or CP1252, which is "close" but Windows-specific). This may be
part of the problem.
>
>
> I put this code into a JSP page. It works under Windows, but not
> under Linux. By the way, I use this code to make URLs and I need English
> characters for that. Under Linux I am getting only question marks instead
> of every single German character. I am using Tomcat 4 on both systems.
> Versions of JDK and Tomcat are almost the same.
>
The question marks are the giveaway. I'm guessing here, but it feels
right. My suspicion is that the data has question marks instead of
German characters before you ever attempt to do the transformation.
That's usually the substitution character when transcoding fails.

What's happening is that data is being extracted from the database in
some encoding. The JDBC driver is then transcoding it (to your default
encoding, or to an encoding you specify) and then storing that as
UCS-2 (16-bit Unicode), which is what a Java character is made of.
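
To illustrate how a failed transcoding step produces question marks, here is a small sketch. It is not what the JDBC driver does internally, only the same effect reproduced with the standard library. It assumes a JDK recent enough to have java.nio.charset.StandardCharsets, and the class name is hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyTranscode {
    // Encodes the text to bytes in the given charset, then decodes those
    // bytes again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String german = "Stra\u00dfe"; // "Strasse" spelled with a sharp s

        // US-ASCII has no sharp s, so the encoder substitutes '?'.
        System.out.println(roundTrip(german, StandardCharsets.US_ASCII));

        // Latin1 can represent every character here, so nothing is lost.
        System.out.println(roundTrip(german, StandardCharsets.ISO_8859_1).equals(german));
    }
}
```

Once the '?' is in the string, the original character is gone, so no later replace() call can find it.
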
Before going too deep, try just printing the German string and make
sure you have the data you think you have. If you don't and it
already has the question marks, here are a few ideas.
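
Since the console or browser may itself mangle the output, one way to do that check is to print the numeric value of each character instead of the character itself. A rough sketch, with a hypothetical class name and a stand-in string where your JSP would use the value read from MySQL:

```java
public class DumpChars {
    // Lists each character's numeric value, separated by spaces, so the
    // result does not depend on how the output device renders the text.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) text.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in value; in the JSP this would be the string from the database.
        String menuText = "gr\u00fc\u00dfe";
        System.out.println(describe(menuText));
    }
}
```

If the data is intact you should see values such as 223 or 252 in the list. A 63 where a German character belongs means the '?' was already in the string.
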
The JDK will pick up a default encoding from your Unix environment,
so checking what LANG is set to is a good idea. If you are in a
locale whose encoding is ASCII and not Latin1, the JVM will pick up on
that and set the default encoding that way as well. Then it would be
quite reasonable for the JDBC driver to substitute '?' for characters
it thinks you can't display. Running
    locale -a
will get you a list of known locales.
You'll want to check what the system property file.encoding is
set to; that's the JVM's default encoding unless you set another.
You can try something like this:
    import java.io.ByteArrayInputStream;
    import java.io.InputStreamReader;

    public class Encoding {
        public static void main(String[] args) {
            // An empty stream is enough; we only want the reader's encoding.
            byte[] bytes = new byte[0];
            ByteArrayInputStream bs = new ByteArrayInputStream(bytes);
            // No charset given, so the reader uses the platform default.
            InputStreamReader in = new InputStreamReader(bs);
            System.out.println("default encoding is " + in.getEncoding());
        }
    }
to see what the default encoding for your JDK in your environment is.
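
A shorter variant, as a sketch: read the file.encoding property and the LANG variable directly. This assumes a JDK that has Charset.defaultCharset() (Java 5 or later), and the class name is made up. Note that on recent JDKs the default charset is UTF-8 regardless of locale, so the LANG connection mainly matters on older ones.

```java
import java.nio.charset.Charset;

public class DefaultEncoding {
    // Collects the settings that decide the JVM's default encoding.
    static String report() {
        return "file.encoding   = " + System.getProperty("file.encoding") + "\n"
             + "default charset = " + Charset.defaultCharset().name() + "\n"
             + "LANG            = " + System.getenv("LANG");
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running it on both the Windows and the Linux machine and comparing the output should show whether the two JVMs really start from the same default.
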
If the database you are using is not the same in both cases, then you
may want to check the encoding that the database is using. Here's a
pointer to the spot in the MySQL docs.
http://www.mysql.com/documentation/mysql/bychapter/manual_MySQL_Database_Administration.html#Localisation
You may also want to see the notes on how the JDBC driver handles
encodings. I'm guessing you are using mm.mysql. If so, a good start
is the unicode and encoding parameters to the connection:
http://mmmysql.sourceforge.net/doc/mm.doc/c106.htm#AEN118
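
As a sketch of what that might look like: to my knowledge the parameters are spelled useUnicode and characterEncoding, but check that against the driver documentation for your version. The host, database name, encoding name and class name below are placeholders, and the code only builds the URL without opening a connection:

```java
public class MysqlUrl {
    // Builds a JDBC URL carrying the two encoding-related parameters.
    // The parameter names are an assumption to verify against the driver docs.
    static String build(String host, String database, String encoding) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own host, database and encoding.
        String url = build("localhost", "menus", "ISO-8859-1");
        System.out.println(url);
        // The URL would then go to DriverManager.getConnection(url, user, password).
    }
}
```

Naming the encoding explicitly on the connection takes the platform default out of the picture, which is the part that differs between your two machines.
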
Hopefully some of that will help.
> What's more strange is that next code is working quite well under both
> systems:
>
> <%
>
> char repGer[] = {32,47,223,228,235,246,252};
> char repEng[] = {95,95,115,97,101,111,117};
>
> for (int i = 0; i < repGer.length; i++) {
>
>
> %>
> <tr>
> <td><%=repGer[i]%></td><td><%=repEng[i]%></td>
> </tr>
> <%
> }
> %>
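I'm not surprised that works. Those characters are built from numeric
values inside the JVM and never pass through the database or the JDBC
driver. In fact, that it works makes the case for the data already
being mis-transcoded stronger.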
>
--
Drew Sudell [EMAIL PROTECTED] http://www.op.net/~asudell