2009/1/1 André Warnier <a...@ice-sa.com>:
> Hi.
>
> This has nothing specific to Tomcat, it's just a problem I'm having as a
> non-java expert in modifying an existing webapp.
> I hope someone on this list can answer quickly, or send me to the
> appropriate place to find out. I have tried to find, but get somewhat lost
> in the Java docs.
>
> Problem :
> an existing webapp reads from a socket connected to an external program.
> The input stream is created as follows :
>     fromApp = socket.getInputStream();
> The read is as follows :
>     StringBuffer buf = new StringBuffer(2000);
>     int ic;
>     while((ic = fromApp.read()) != 26 && ic != -1) // hex 1A (SUB)
>         buf.append((char)ic);
>
> This is wrong, because it assumes that the input stream is always in an
> 8-bit default platform encoding, which it isn't.
>
> How do I do this correctly, assuming that I do know that the incoming stream
> is an 8-bit stream (like iso-8859-x), and I do know which 8-bit encoding is
> being used (such as iso-8859-1 or iso-8859-2) ?
> I cannot change the InputStream into something else, because there are a
> zillion other places where this webapp tests on the read byte's value,
> numerically.
>
> I mean, to append correctly to "buf" what was read in the "int", knowing
> that the proper encoding (charset) of "fromApp" is "X", how do I write this
> ?
>
1. Using iso-8859-1 does not lose any information. The (char) cast maps
each byte value 0..255 to the character with the same code point, which is
exactly what iso-8859-1 decoding does. So if you later write the string out
to an iso-8859-1 stream, you get back exactly the 8-bit bytes that were in
the input, even though they were really iso-8859-2.

If you need correct Unicode characters, though, you can convert them by
calling String.getBytes(encoding) and new String(bytes, encoding):

    new String(str.getBytes("ISO-8859-1"), "ISO-8859-2")

2. Well, the above, and all the other tips I have read in this thread so
far, are the right ones. They are what you should do when you are
engineering and writing a well-made application. That is, you go with the
InputStreamReader, String and CharsetDecoder APIs, and they take care of
the various encodings, including multi-byte ones.

In your case, where you are tailoring an oddly (badly) written application
to your specific environment and do not expect much of it, there is a
simpler approach: implement the conversion with a lookup table. Since the
encoding is single-byte, you just need a static table of 256 chars and you
are done.

For example,

    package mypackage;

    import java.io.UnsupportedEncodingException;

    public class TranslationTable {
        // table[b] is the Unicode char for byte value b in ISO-8859-2
        private static char[] table;

        static { // "static initialization" block
            // One byte for every possible value 0..255
            byte[] bytes = new byte[256];
            for (int i = 0; i < bytes.length; i++) {
                bytes[i] = (byte) i;
            }
            try {
                // Decode all 256 bytes at once; a single-byte charset
                // yields exactly one char per byte
                table = new String(bytes, "ISO-8859-2").toCharArray();
            } catch (UnsupportedEncodingException ex) {
                ex.printStackTrace();
                //System.exit(1);
                throw new Error("Class initialization failed", ex);
            }
        }

        public static char lookup(int i) {
            // will throw ArrayIndexOutOfBoundsException if i is -1,
            // but that should be OK
            return table[i];
        }
    }

and replace

> buf.append((char)ic);

with

    buf.append(TranslationTable.lookup(ic));

The loop condition already stops on -1 before the append, so lookup() is
never called with it there.

Also, I would replace StringBuffer with StringBuilder, if you are running
on Java 5 or later, but that is another story.
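
To make point 1 concrete, here is a minimal sketch of the "read as
iso-8859-1, re-decode afterwards" route. The class name RedecodeSketch, the
helper redecode() and the sample bytes are made up for illustration, and
StandardCharsets assumes Java 7 or later; on older JVMs use the
String-named overloads shown above instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the webapp under discussion.
public class RedecodeSketch {

    // Turns chars that were produced by (char) casts of raw bytes back
    // into those bytes, then decodes them with the real charset.
    static String redecode(String latin1Chars, Charset actual) {
        byte[] raw = latin1Chars.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(): 'a', byte 0xB1, SUB, 'x'.
        // 0xB1 is U+00B1 in ISO-8859-1 but U+0105 in ISO-8859-2.
        InputStream fromApp = new ByteArrayInputStream(
                new byte[] { 'a', (byte) 0xB1, 26, 'x' });

        // The original read loop, unchanged apart from StringBuilder.
        StringBuilder buf = new StringBuilder(2000);
        int ic;
        while ((ic = fromApp.read()) != 26 && ic != -1) { // hex 1A (SUB)
            buf.append((char) ic);
        }

        // Re-decode once, after the numeric byte tests are done.
        String text = redecode(buf.toString(), Charset.forName("ISO-8859-2"));
        System.out.println((int) text.charAt(1)); // prints 261 (U+0105)
    }
}
```

The difference from the lookup table is only where the conversion happens:
here the whole buffer is fixed up once at the end, while the table converts
each byte as it is appended. Either way the rest of the code can keep
testing the int values it reads.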

Best regards,
Konstantin Kolinko