Theodore H. Smith scripsit:

> I'm just curious about the \0 thing. What problems would having a \0 in
> UTF-8 present, that are not presented by having \0 in ASCII? I can't
> see any advantage there.

AFAICT it was a hack so that arbitrary Java strings could be encoded as
C strings, that is, with no 0x00 bytes in them, even when the string
contains a U+0000. The same format is used for string constants in Java
class files.

The important thing to note is that the readUTF and writeUTF methods are
*binary* I/O: they are the standard way of serializing strings, just as
the standard way of serializing ints is to write them out as a 4-byte
big-endian sequence. They simply have nothing to do with character
encoding at all.

-- 
He made the Legislature meet at one-horse       John Cowan
tank-towns out in the alfalfa belt, so that     [EMAIL PROTECTED]
hardly nobody could get there and most of       http://www.reutershealth.com
the leaders would stay home and let him go      http://www.ccil.org/~cowan
to work and do things as he pleased.    --Mencken, Declaration of Independence
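P.S. To make the point about writeUTF concrete, here is a minimal sketch, not anything from the JDK sources. The class name ModifiedUtf8Demo and its helper methods are made up for illustration; only DataOutputStream.writeUTF, DataOutputStream.writeInt and String.getBytes are real library calls. It serializes the string "a\0b" both ways so the bytes can be compared. Note that writeUTF also emits a 2-byte length prefix, which may itself contain zero bytes; it is the encoded characters that avoid 0x00.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the name and helpers are illustrative only.
final class ModifiedUtf8Demo {

    // Binary serialization: 2-byte big-endian length, then the string in
    // the modified form, where U+0000 becomes the two bytes 0xC0 0x80.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Character encoding proper: standard UTF-8, where U+0000 is a single 0x00.
    static byte[] viaStandardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // The analogous binary serialization of an int: 4 bytes, big-endian.
    static byte[] viaWriteInt(int n) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        String s = "a\0b"; // 'a', U+0000, 'b'
        System.out.println(Arrays.toString(viaWriteUTF(s)));
        System.out.println(Arrays.toString(viaStandardUtf8(s)));
        System.out.println(Arrays.toString(viaWriteInt(1)));
    }
}
```

Java bytes are signed, so 0xC0 and 0x80 print as -64 and -128. The writeUTF output for "a\0b" is the length prefix 00 04 followed by 61 C0 80 62, whereas standard UTF-8 gives 61 00 62.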

